<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37554?offset=10</link>
	<atom:link href="https://bioinformaticsonline.com/related/37554?offset=10" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</guid>
	<pubDate>Fri, 19 Oct 2018 07:25:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</link>
	<title><![CDATA[BASE: a practical de novo assembler for large genomes using long NGS reads]]></title>
	<description><![CDATA[<p><span>BASE is a new&nbsp;</span><em>de novo</em><span>&nbsp;assembler. It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have a high probability of appearing uniquely in the genome. BASE uses these seeds to build extension trees, then applies reverse validation to remove branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of the reads sharing the seeds. These consensus sequences are then extended into contigs.</span></p><p>Address of the bookmark: <a href="https://github.com/dhlbh/BASE" rel="nofollow">https://github.com/dhlbh/BASE</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41669/filtlong-quality-filtering-tool-for-long-reads</guid>
	<pubDate>Wed, 13 May 2020 10:23:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41669/filtlong-quality-filtering-tool-for-long-reads</link>
	<title><![CDATA[Filtlong: quality filtering tool for long reads]]></title>
	<description><![CDATA[<p>Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.</p>
<p>Filtlong builds into a stand-alone executable:</p>
<pre><code>git clone https://github.com/rrwick/Filtlong.git
cd Filtlong
make -j
bin/filtlong -h
</code></pre><p>Address of the bookmark: <a href="https://github.com/rrwick/Filtlong" rel="nofollow">https://github.com/rrwick/Filtlong</a></p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37563/colormap-correcting-long-reads-by-mapping-short-reads</guid>
	<pubDate>Mon, 20 Aug 2018 14:17:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37563/colormap-correcting-long-reads-by-mapping-short-reads</link>
	<title><![CDATA[CoLoRMap: Correcting Long Reads by Mapping short reads]]></title>
	<description><![CDATA[<p><span>Second-generation sequencing technologies paved the way to an exceptional increase in the number of sequenced genomes, both prokaryotic and eukaryotic. However, short reads are difficult to assemble and often lead to highly fragmented assemblies. Recent developments in long-read sequencing methods offer a promising way to address this issue. However, so far long reads are characterized by a high error rate, and assembling from long reads requires a high depth of coverage. This motivates the development of hybrid approaches that leverage the high quality of short reads to correct errors in long reads. We introduce CoLoRMap, a hybrid method for correcting noisy long reads, such as the ones produced by PacBio sequencing technology, using high-quality Illumina paired-end reads mapped onto the long reads. Our algorithm is based on two novel ideas: using a classical shortest path algorithm to find a sequence of overlapping short reads that minimizes the edit score to a long read, and extending corrected regions by local assembly of unmapped mates of mapped short reads. Our results on bacterial, fungal and insect data sets show that CoLoRMap compares well with existing hybrid correction methods. The source code of CoLoRMap is freely available for non-commercial use at https://github.com/sfu-compbio/colormap</span></p>
<p><span>ehaghshe@sfu.ca or cedric.chauve@sfu.ca</span></p><p>Address of the bookmark: <a href="https://github.com/sfu-compbio/colormap" rel="nofollow">https://github.com/sfu-compbio/colormap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</guid>
	<pubDate>Tue, 12 Dec 2017 17:23:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</link>
	<title><![CDATA[MashMap: a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s)]]></title>
	<description><![CDATA[<p><span>MashMap is a fast, approximate tool for mapping long reads (PacBio/ONT) or assemblies to one or more reference genomes. It maps a query sequence against a reference region if and only if its estimated alignment identity is above a specified threshold. It does not compute the alignments explicitly, but rather estimates a&nbsp;</span><em>k</em><span>-mer based&nbsp;</span><a href="https://en.wikipedia.org/wiki/Jaccard_index">Jaccard similarity</a><span>&nbsp;using a combination of&nbsp;</span><a href="http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/p76-schleimer.pdf">Winnowing</a><span>&nbsp;and&nbsp;</span><a href="https://en.wikipedia.org/wiki/MinHash">MinHash</a><span>. This is then converted to an estimate of sequence identity using the&nbsp;</span><a href="http://mash.readthedocs.org/">Mash</a><span>&nbsp;distance. An appropriate&nbsp;</span><em>k</em><span>-mer sampling rate is automatically determined given minimum local alignment length and identity thresholds. The efficiency of the algorithm improves as both of these thresholds are increased.</span></p><p>Address of the bookmark: <a href="https://github.com/marbl/MashMap" rel="nofollow">https://github.com/marbl/MashMap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36533/mecat-fast-mapping-error-correction-and-de-novo-assembly-for-single-molecule-sequencing-reads</guid>
	<pubDate>Fri, 11 May 2018 05:07:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36533/mecat-fast-mapping-error-correction-and-de-novo-assembly-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads]]></title>
	<description><![CDATA[<p>MECAT is an ultra-fast Mapping, Error Correction and de novo Assembly Tool for single-molecule sequencing (SMRT) reads. MECAT employs novel alignment and error correction algorithms that are much more efficient than state-of-the-art aligners and error correction tools. MECAT can be used for efficient de novo assembly of large genomes. For example, on a 32-thread computer with a 2.0 GHz CPU, MECAT takes 9.5 days to assemble a human genome based on 54x SMRT data, which is 40 times faster than the current&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>. MECAT's performance was compared with the&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>,&nbsp;<a href="https://github.com/PacificBiosciences/falcon">FALCON</a>&nbsp;and&nbsp;<a href="http://canu.readthedocs.io/en/latest/">Canu(v1.3)</a>&nbsp;on five real datasets. The quality of the assembled contigs produced by MECAT is the same as or better than that of the&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>&nbsp;and&nbsp;<a href="https://github.com/PacificBiosciences/falcon">FALCON</a>.&nbsp;</p>
<p>https://www.nature.com/articles/nmeth.4432</p><p>Address of the bookmark: <a href="https://github.com/xiaochuanle/MECAT" rel="nofollow">https://github.com/xiaochuanle/MECAT</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36867/cerulean-a-hybrid-assembly-using-high-throughput-short-and-long-reads</guid>
	<pubDate>Tue, 05 Jun 2018 10:10:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36867/cerulean-a-hybrid-assembly-using-high-throughput-short-and-long-reads</link>
	<title><![CDATA[Cerulean: A hybrid assembly using high throughput short and long reads]]></title>
	<description><![CDATA[Cerulean extends contigs assembled from short-read datasets, such as Illumina paired-end reads, using long reads such as PacBio RS reads.

Cerulean v0.1 has been implemented with bacterial genomes in mind.

The method is fully described in Deshpande, V., Fung, E. D., Pham, S., &amp; Bafna, V. (2013). Cerulean: A hybrid assembly using high throughput short and long reads. arXiv preprint arXiv:1307.7933.
http://arxiv.org/abs/1307.7933<p>Address of the bookmark: <a href="https://sourceforge.net/projects/ceruleanassembler/" rel="nofollow">https://sourceforge.net/projects/ceruleanassembler/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/38886/evaluation-of-genome-assembly-software-based-on-long-reads</guid>
	<pubDate>Fri, 01 Feb 2019 11:55:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/file/view/38886/evaluation-of-genome-assembly-software-based-on-long-reads</link>
	<title><![CDATA[Evaluation of genome assembly software based on long reads]]></title>
	<description><![CDATA[<p>Third-generation sequencing (TGS) technologies have been used to produce highly accurate de novo assemblies of hundreds of microbial genomes and highly contiguous reconstructions of many dozens of plant and animal genomes, enabling new insights into evolution and sequence diversity. They have also been applied to resequencing analyses, to create detailed maps of structural variations in many species. In addition, these new technologies have been used to fill in many of the gaps in the human reference genome.</p><p>In this report, we compare and evaluate several genome assembly tools based on TGS technology. The experiments were performed on 4 reference genomes and the results were evaluated with the QUAST software. The 11 tools evaluated are: Celera Assembler, Falcon, Miniasm, Newbler, SGA Assembler, Smartdenovo, Abruijn, Ra, DBG2OLC, Spades and Cerulean. The first 8 tools use only long reads, while the last 3 can merge long and short reads.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/38886" length="382699" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36476/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</guid>
	<pubDate>Fri, 04 May 2018 19:16:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36476/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[Flye: Fast and accurate de novo assembler for single molecule sequencing reads]]></title>
	<description><![CDATA[<p><span>Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies. The algorithm uses an A-Bruijn graph to find the overlaps between reads and does not require them to be error-corrected. After the initial assembly, Flye performs an extra repeat classification and analysis step to improve the structural accuracy of the resulting sequence. The package also includes a polisher module, which produces the final assembly of high nucleotide-level quality.</span></p><p>Address of the bookmark: <a href="https://github.com/fenderglass/Flye" rel="nofollow">https://github.com/fenderglass/Flye</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37962/wtdbg2-a-de-novo-sequence-assembler-for-long-noisy-reads-produced-by-pacbio-or-oxford-nanopore</guid>
	<pubDate>Fri, 19 Oct 2018 08:48:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37962/wtdbg2-a-de-novo-sequence-assembler-for-long-noisy-reads-produced-by-pacbio-or-oxford-nanopore</link>
	<title><![CDATA[Wtdbg2: a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore]]></title>
	<description><![CDATA[<p><span>Wtdbg2 is a&nbsp;</span><em>de novo</em><span>&nbsp;sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). It assembles raw reads without error correction and then builds the consensus from intermediate assembly output. Wtdbg2 is able to assemble the human and even the 32Gb&nbsp;</span><a href="https://www.nature.com/articles/nature25458">Axolotl</a><span>&nbsp;genome at a speed tens of times faster than&nbsp;</span><a href="https://github.com/marbl/canu">CANU</a><span>&nbsp;and&nbsp;</span><a href="https://github.com/PacificBiosciences/FALCON">FALCON</a><span>&nbsp;while producing contigs of comparable base accuracy.</span></p><p>Address of the bookmark: <a href="https://github.com/ruanjue/wtdbg2" rel="nofollow">https://github.com/ruanjue/wtdbg2</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37576/lrcstats-a-tool-for-evaluating-long-reads-correction-methods</guid>
	<pubDate>Wed, 22 Aug 2018 11:05:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37576/lrcstats-a-tool-for-evaluating-long-reads-correction-methods</link>
	<title><![CDATA[LRCstats: a tool for evaluating long reads correction methods]]></title>
	<description><![CDATA[<p><span>LRCstats is an open-source pipeline for benchmarking correction algorithms for DNA long reads produced by third generation sequencing technology, such as the machines made by Pacific Biosciences. The reads produced by third generation sequencing technology are, as the name suggests, longer than reads produced by next generation sequencing technologies, such as those from Illumina. However, long reads are plagued by high error rates, which can cause issues in downstream analysis. Long read correction algorithms reduce the error rate of long reads either through self-correcting methods or by using accurate short reads from next generation sequencing technologies to correct the long reads.</span></p><p>Address of the bookmark: <a href="https://github.com/cchauve/lrcstats" rel="nofollow">https://github.com/cchauve/lrcstats</a></p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>

</channel>
</rss>