<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43728?offset=60</link>
	<atom:link href="https://bioinformaticsonline.com/related/43728?offset=60" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35571/medusa-a-multi-draft-based-scaffolder</guid>
	<pubDate>Wed, 14 Feb 2018 02:49:00 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35571/medusa-a-multi-draft-based-scaffolder</link>
	<title><![CDATA[MeDuSa: a multi-draft based scaffolder]]></title>
	<description><![CDATA[<p><span>MeDuSa (Multi-Draft based Scaffolder), an algorithm for genome scaffolding. MeDuSa exploits information obtained from a set of (draft or closed) genomes from related organisms to determine the correct order and orientation of the contigs. MeDuSa formalises the scaffolding problem by means of a combinatorial optimisation formulation on graphs and implements an efficient constant factor approximation algorithm to solve it. In contrast to currently used scaffolders, it does not require either prior knowledge on the microrganisms dataset under analysis (e.g. their phylogenetic relationships) or the availability of paired end read libraries.&nbsp;</span></p><p>Address of the bookmark: <a href="https://github.com/combogenomics/medusa" rel="nofollow">https://github.com/combogenomics/medusa</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36723/hapsembler-an-assembler-for-highly-polymorphic-genomes</guid>
	<pubDate>Tue, 22 May 2018 04:09:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36723/hapsembler-an-assembler-for-highly-polymorphic-genomes</link>
	<title><![CDATA[Hapsembler: An Assembler for Highly Polymorphic Genomes]]></title>
	<description><![CDATA[Hapsembler is a haplotype-specific genome assembly toolkit that is designed for genomes that are rich in SNPs and other types of polymorphism. Hapsembler can be used to assemble reads from a variety of platforms including Illumina and Roche/454. 

http://compbio.cs.toronto.edu/hapsembler/<p>Address of the bookmark: <a href="http://compbio.cs.toronto.edu/hapsembler/" rel="nofollow">http://compbio.cs.toronto.edu/hapsembler/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40940/consed-a-finishing-package-bam-file-viewer-assembly-editor-autofinish-autoreport-autoedit-and-align-reads-to-reference-sequence</guid>
	<pubDate>Fri, 07 Feb 2020 07:16:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40940/consed-a-finishing-package-bam-file-viewer-assembly-editor-autofinish-autoreport-autoedit-and-align-reads-to-reference-sequence</link>
	<title><![CDATA[Consed--A Finishing Package (BAM File Viewer, Assembly Editor, Autofinish, Autoreport, Autoedit, and Align Reads To Reference Sequence)]]></title>
	<description><![CDATA[<ul>
<li>Supports Illumina, 454, other Next-Gen and Sanger Reads and allows mixtures of these read types</li>
<li>Consed includes BamScape which can view bam files with unlimited numbers of reads. BamScape can bring up consed to edit reads and the reference sequence in targeted regions.</li>
<li>Consed is compatible with Newbler, Cross_match, Phrap, MIRA, Velvet and PCAP output.</li>
<li>Quickly takes the user to each variant site for viewing (also available as an automated report)</li>
<li>Overview of assembly can help detect and fix misassemblies</li>
<li>Editing time reduced by the program's ability to pin-point problem areas</li>
<li>Editing is guided by error probabilities</li>
</ul><p>Address of the bookmark: <a href="http://www.phrap.org/consed/consed.html" rel="nofollow">http://www.phrap.org/consed/consed.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/44342/ncbi-datasets%E2%80%AFpages</guid>
	<pubDate>Wed, 12 Jul 2023 06:29:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/44342/ncbi-datasets%E2%80%AFpages</link>
	<title><![CDATA[NCBI Datasets pages]]></title>
	<description><![CDATA[<p>Update! Assembly and Genome record pages now redirect to new NCBI Datasets pages. NCBI Datasets is a new resource that makes it easier to find and download genome data. Learn more: https://ncbiinsights.ncbi.nlm.nih.gov/2023/07/11/ncbi-datasets-genome-assembly-pages/&nbsp;<a href="https://ow.ly/GU3o50P8QH4"></a><a href="https://www.linkedin.com/feed/hashtag/?keywords=ncbicgr&amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7084592728260386816">#NCBICGR</a></p><p><span>Effective July 10, 2023, NCBI&rsquo;s Assembly and Genome record pages now redirect to&nbsp;</span>new<a href="https://www.ncbi.nlm.nih.gov/datasets/?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=datasets-genome-assembly-redirect-20230711"> NCBI Datasets </a><span>pages. As&nbsp;</span><a href="https://ncbiinsights.ncbi.nlm.nih.gov/2023/03/07/ncbi-datasets-genome-taxonomy-pages/?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=datasets-genome-assembly-redirect-20230711">previously announced</a><span>, these updates are part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.  </span><span>&nbsp;</span></p><h5>The following pages have been updated:</h5><ul>
<li><span>The NCBI Assembly record pages now redirect to the new </span><a href="https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_023065955.2/?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=datasets-genome-assembly-redirect-20230711"><span>NCBI Datasets</span><strong><span> </span></strong><span>Genome</span></a><span> </span><span>record pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST. </span><span>&nbsp;</span></li>
<li><span>The NCBI</span><strong> </strong><span>Genome record pages now redirect to the </span><a href="https://www.ncbi.nlm.nih.gov/datasets/taxonomy/9644/?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=datasets-genome-assembly-redirect-20230711"><span>NCBI Datasets</span><strong><span> </span></strong><span>Taxonomy</span></a><span> </span><span>record pages that provide a taxonomy-focused portal to genes, genomes, and additional NCBI resources.  </span><span>&nbsp;</span></li>
</ul><p><span>During this transition, you will have the option to return to the legacy Genome and Assembly record pages. We will remove the legacy pages in early 2024. </span><span>&nbsp;</span></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a critical process in bioinformatics, enabling the reconstruction of an organism's genome from short DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide will walk you through the steps of genome assembly using state-of-the-art tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together short DNA sequence reads generated by sequencing platforms (e.g., Illumina, PacBio, Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, accurate reads ideal for scaffolding.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use tools like <strong>FastQC</strong> or <strong>MultiQC</strong> to assess the quality of your reads:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq multiqc . </code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq trimmed_R1.fastq trimmed_R2.fastq \ ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 </code></div>
</div>
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, genome size, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
</li>
<li>
<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by identifying conserved genes:</p>
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>Opera-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>Maker</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36516/metassembler-merging-and-optimizing-de-novo-genome-assemblies</guid>
	<pubDate>Tue, 08 May 2018 04:52:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36516/metassembler-merging-and-optimizing-de-novo-genome-assemblies</link>
	<title><![CDATA[Metassembler: merging and optimizing de novo genome assemblies]]></title>
	<description><![CDATA[<p><span>Metassembler combines multiple whole genome de novo assemblies into a combined consensus assembly using the best segments of the individual assemblies.</span></p>
<p><span><span>Genome assembly projects typically run multiple algorithms in an attempt to find the single best assembly, although those assemblies often have complementary, if untapped, strengths and weaknesses. We present our metassembler algorithm that merges multiple assemblies of a genome into a single superior sequence.&nbsp;</span></span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/metassembler/?source=directory" rel="nofollow">https://sourceforge.net/projects/metassembler/?source=directory</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40465/airlift-a-methodology-and-tool-for-comprehensively-moving-mappings-and-annotations-from-one-genome-to-another-similar-genome</guid>
	<pubDate>Mon, 23 Dec 2019 10:20:13 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40465/airlift-a-methodology-and-tool-for-comprehensively-moving-mappings-and-annotations-from-one-genome-to-another-similar-genome</link>
	<title><![CDATA[AirLift, a methodology and tool for comprehensively moving mappings and annotations from one genome to another similar genome]]></title>
	<description><![CDATA[<p>We propose AirLift, a methodology and tool for comprehensively moving mappings and annotations from one genome to another similar genome while maintaining the accuracy of a full mapper.</p><p>Address of the bookmark: <a href="https://github.com/CMU-SAFARI/AirLift" rel="nofollow">https://github.com/CMU-SAFARI/AirLift</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43364/ragtag-a-collection-of-software-tools-for-scaffolding-and-improving-modern-genome-assemblies</guid>
	<pubDate>Sat, 11 Sep 2021 00:28:14 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43364/ragtag-a-collection-of-software-tools-for-scaffolding-and-improving-modern-genome-assemblies</link>
	<title><![CDATA[RagTag: a collection of software tools for scaffolding and improving modern genome assemblies]]></title>
	<description><![CDATA[<p>RagTag is a collection of software tools for scaffolding and improving modern genome assemblies. Tasks include:</p>
<ul>
<li>Homology-based misassembly&nbsp;<a href="https://github.com/malonge/RagTag/wiki/correct">correction</a></li>
<li>Homology-based assembly&nbsp;<a href="https://github.com/malonge/RagTag/wiki/scaffold">scaffolding</a>&nbsp;and&nbsp;<a href="https://github.com/malonge/RagTag/wiki/patch">patching</a></li>
<li>Scaffold&nbsp;<a href="https://github.com/malonge/RagTag/wiki/merge">merging</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/malonge/RagTag" rel="nofollow">https://github.com/malonge/RagTag</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</guid>
	<pubDate>Mon, 10 Oct 2016 08:56:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</link>
	<title><![CDATA[PHYMMBL]]></title>
	<description><![CDATA[<p><span>Metagenomics sequencing projects collect samples of DNA from uncharacterized environments that may contain hundreds or even thousands of species. One of the main challenges in analyzing a metagenome is phylogenetic classification of raw sequence reads into groups representing the same or similar species. Such classification is a useful prerequisite for genome assembly and for analysis of the biological diversity present in a sample. The newest sequencing technologies have simultaneously made metagenomics easier, by making the sequencing process faster, and more difficult, by producing shorter read lengths than previous technologies. Methods for classifying sequences as short as 100 base pairs (bp) have until now been relatively inaccurate, requiring metagenomics projects to use older, long-read technologies.&nbsp;</span><strong>Phymm</strong><span>, a new classification approach for metagenomics data which uses interpolated Markov models (IMMs) to taxonomically classify DNA sequences, can accurately classify reads as short as 100 bp. Its accuracy for short reads represents a significant leap forward over previous composition-based classification methods.&nbsp;</span><strong>PhymmBL</strong><span>&nbsp;(rhymes with "thimble"), the hybrid classifier included in this distribution which combines analysis from both Phymm and&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/BLAST">BLAST</a><span>, produces even higher accuracy.</span></p><p>Address of the bookmark: <a href="http://www.cbcb.umd.edu/software/phymm/" rel="nofollow">http://www.cbcb.umd.edu/software/phymm/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43670/useful-bioinformatics-analysis-tools</guid>
	<pubDate>Thu, 23 Dec 2021 23:10:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43670/useful-bioinformatics-analysis-tools</link>
	<title><![CDATA[Useful Bioinformatics Analysis Tools !]]></title>
	<description><![CDATA[<h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=cometa&amp;subpage=about">CoMeta</a></h3><p><strong>Classificier of reads from metagenomic sequencing experiments.</strong></p><p><span>&bull;&nbsp;&nbsp;Kawulok, J., Deorowicz, S.,&nbsp;</span><em>CoMeta: Classification of Metagenomes Using k-mers</em><span>,&nbsp;</span><a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0121453">PLOS ONE,&nbsp;</a><span>2015; 10(4):1&ndash;23,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=CoMSA&amp;subpage=about">CoMSA</a></h3><p><strong>Compressor of multiple sequence alignments of proteins.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Walczyszyn, J., Debudaj-Grabysz, A.,&nbsp;</span><em>CoMSA: compression of protein multiple sequence alignment files</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty619">Bioinformatics,&nbsp;</a><span>2019; 35(2):22&ndash;234,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=dsrc&amp;subpage=about">DSRC</a></h3><p><strong>Compressor of sequencing reads.</strong></p><p><span>&bull;&nbsp;&nbsp;Roguski, L., Deorowicz, S.,&nbsp;</span><em>DSRC 2: Industry-oriented compression of FASTQ files</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/content/30/15/2213">Bioinformatics,&nbsp;</a><span>2014; 30(15):2213&ndash;2215,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Grabowski, Sz.,&nbsp;</span><em>Compression of DNA sequences in FASTQ format</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/">Bioinformatics,&nbsp;</a><span>2011; 27(6):860&ndash;862,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=famsa&amp;subpage=about">FAMSA</a></h3><p><strong>Multiple sequence alignment designed for huge families of proteins (even containing hundreds of thousands of sequences).</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Gudys, A.,&nbsp;</span><em>FAMSA: Fast and accurate multiple sequence alignment of huge protein families</em><span>,&nbsp;</span><a href="http://www.nature.com/articles/srep33964">Scientific Reports,&nbsp;</a><span>2016; 6(33964):</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=fastore&amp;subpage=about">FaStore</a></h3><p><strong>Compressor of FASTQ files.</strong></p><p><span>&bull;&nbsp;&nbsp;Roguski, L., Ochoa, I., Hernaez, M., Deorowicz, S.,&nbsp;</span><em>FaStore - a space-saving solution for raw sequencing data</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty205">Bioinformatics,&nbsp;</a><span>2018; 34(16):2748&ndash;2756,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=fqsqueezer&amp;subpage=about">FQSqueezer</a></h3><p><strong>Experimental high-end compressor of FASTQ files.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S.,&nbsp;</span><em>FQSqueezer: k-mer-based compression of sequencing data</em><span>,&nbsp;</span><a href="https://www.nature.com/articles/s41598-020-57452-6">Scientific Reports,&nbsp;</a><span>2020; 10(578):</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=gdc&amp;subpage=about">GDC</a></h3><p><strong>Compressor of collections of genome sequences.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A., Niemiec, M.,&nbsp;</span><em>GDC 2: Compression of large collections of genomes</em><span>,&nbsp;</span><a href="http://www.nature.com/srep/2015/150625/srep11565/full/srep11565.html">Scientific Reports,&nbsp;</a><span>2015; 5(11565):1&ndash;12,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Grabowski, Sz.,&nbsp;</span><em>Robust relative compression of genomes with random access</em><span>,&nbsp;</span><a href="http://sun.aei.polsl.pl/REFRESH/bioinformatics.oxfordjournals.org/content/27/21/2979.abstract">Bioinformatics,&nbsp;</a><span>2011; 27(21):2979&ndash;2986,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=gtc&amp;subpage=about">GTC</a></h3><p><strong>Genotype databases compressor with support for fast queries.</strong></p><p><span>&bull;&nbsp;&nbsp;Danek, A., Deorowicz, S.,&nbsp;</span><em>GTC: how to maintain huge genotype collections in a compressed form</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty023">Bioinformatics,&nbsp;</a><span>2018; 34(11):1834&ndash;1840,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=gtshark&amp;subpage=about">GTShark</a></h3><p><strong>Genotypes compressor.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A.,&nbsp;</span><em>GTShark: Genotype compression in large projects</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/btz508">Bioinformatics,&nbsp;</a><span>2019; 35(22):4791&ndash;4793,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=kmc&amp;subpage=about">KMC</a></h3><p><strong>Memory frugal&nbsp;<em>k</em>-mer counter.</strong></p><p><span>&bull;&nbsp;&nbsp;Kokot, M., Długosz, M., Deorowicz, S.,&nbsp;</span><em>KMC 3: counting and manipulating k -mer statistics</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/btx304">Bioinformatics,&nbsp;</a><span>2017; 33(17):2759&ndash;2761,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Kokot, M., Grabowski, Sz., Debudaj-Grabysz, A.,&nbsp;</span><em>KMC 2: Fast and resource-frugal k-mer counting</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/btv022">Bioinformatics,&nbsp;</a><span>2015; 31(10):1569&ndash;1576,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Grabowski, Sz.,&nbsp;</span><em>Disk-based k-mer counting on a PC</em><span>,&nbsp;</span><a href="http://www.biomedcentral.com/1471-2105/14/160">BMC Bioinformatics,&nbsp;</a><span>2013; 14():Article no. 160,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=kmer-db&amp;subpage=about">Kmer-db</a></h3><p><strong>Tool for estimation of evolutionary distances in a collection of genomes.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Gudys, A., Dlugosz, M., Kokot, M., Danek, A.,&nbsp;</span><em>Kmer-db: instant evolutionary distance estimation</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty610">Bioinformatics,&nbsp;</a><span>2019; 35(1):133&ndash;136,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=mugi&amp;subpage=about">MuGI</a></h3><p><strong>Index allowing queries for a collection of multiple genome sequences.</strong></p><p><span>&bull;&nbsp;&nbsp;Danek, A., Deorowicz, S., Grabowski, Sz.,&nbsp;</span><em>Indexes of Large Genome Collections on a PC</em><span>,&nbsp;</span><a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0109384">PLOS ONE,&nbsp;</a><span>2014; 9(10):e109384,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=orcom&amp;subpage=about">ORCOM</a></h3><p><strong>Experimental compressor of sequencing reads.</strong></p><p><span>&bull;&nbsp;&nbsp;Grabowski, Sz., Deorowicz, S., Roguski, L.,&nbsp;</span><em>Disk-based compression of data from genome sequencing</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/content/early/2014/12/22/bioinformatics.btu844.abstract">Bioinformatics,&nbsp;</a><span>2014; 31(9):1389&ndash;1395,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=pgsa&amp;subpage=about">PgSA</a></h3><p><strong>Index allowing queries for a collection of sequencing reads.</strong></p><p><span>&bull;&nbsp;&nbsp;Kowalski, T., Grabowski, Sz., Deorowicz, S.,&nbsp;</span><em>Indexing arbitrary-length k-mers in sequencing reads</em><span>,&nbsp;</span><a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0133198">PLOS ONE,&nbsp;</a><span>2015; 10(7):1&ndash;16,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=quickprobs&amp;subpage=about">QuickProbs</a></h3><p><strong>Multiple sequence alignment designed especially for GPU.</strong></p><p><span>&bull;&nbsp;&nbsp;Gudys, A., Deorowicz, S.,&nbsp;</span><em>QuickProbs 2: towards rapid construction of high-quality alignments of large protein families</em><span>,&nbsp;</span><a href="http://www.nature.com/articles/srep41553">Scientific Reports,&nbsp;</a><span>2017; 7(41553):</span><br /><span>&bull;&nbsp;&nbsp;Gudys, A., Deorowicz, S.,&nbsp;</span><em>QuickProbs &ndash; A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors</em><span>,&nbsp;</span><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0088901">PLOS ONE,&nbsp;</a><span>2014; 9(2):e88901,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=reckoner&amp;subpage=about">RECKONER</a></h3><p><strong>Read error corrector.</strong></p><p><span>&bull;&nbsp;&nbsp;Maciej Długosz, M., Deorowicz, S.,&nbsp;</span><em>RECKONER: read error corrector based on KMC</em><span>,&nbsp;</span><a href="https://academic.oup.com/bioinformatics/article-abstract/33/7/1086/2843893/RECKONER-read-error-corrector-based-on-KMC">Bioinformatics,&nbsp;</a><span>2017; 33(7):1086&ndash;1089,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=tgc&amp;subpage=about">TGC</a></h3><p><strong>Compressor of collections of genomes given in Variant Call Format (VCF) files.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A., Grabowski, Sz.,&nbsp;</span><em>Genome compression: a novel approach for large collections</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/content/early/2013/08/29/bioinformatics.btt460">Bioinformatics,&nbsp;</a><span>2013; 29(20):2572&ndash;2578,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=vcfshark&amp;subpage=about">VCFShark</a></h3><p><strong>Compressor of VCF files.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A.,&nbsp;</span><em>GTShark: Genotype compression in large projects</em><span>,&nbsp;</span><a href="https://www.biorxiv.org/content/10.1101/2020.12.18.423437v1">biorxiv.org,&nbsp;</a><span>2020; ():</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=whisper&amp;subpage=about">Whisper</a></h3><p><strong>Experimental mapper of whole genome sequencing data.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Gudys, A.,&nbsp;</span><em>Whisper 2: indel-sensitive short read mapping</em><span>,&nbsp;</span><a href="https://doi.org/10.1101/2019.12.18.881292">bioRxiv.org,&nbsp;</a><span>2019; :</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Gudys, A., Grabowski, Sz.,&nbsp;</span><em>Whisper: read sorting allows robust robust mapping of DNA sequencing data</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty927">Bioinformatics,&nbsp;</a><span>2019; 35(12):2043&ndash;2050,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Gudys, A., Grabowski, Sz.,&nbsp;</span><em>Robust mapping of whole genome sequencing data</em><span>,&nbsp;</span><a href="https://meetings.cshl.edu/abstracts.aspx?meet=GENOME&amp;year=17">Poster at The Biology of Genomes Conference,&nbsp;</a><span>2017;</span></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>