<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34618?offset=50</link>
	<atom:link href="https://bioinformaticsonline.com/related/34618?offset=50" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37576/lrcstats-a-tool-for-evaluating-long-reads-correction-methods</guid>
	<pubDate>Wed, 22 Aug 2018 11:05:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37576/lrcstats-a-tool-for-evaluating-long-reads-correction-methods</link>
	<title><![CDATA[LRCstats: a tool for evaluating long reads correction methods]]></title>
	<description><![CDATA[<p><span>LRCstats is an open-source pipeline for benchmarking DNA long-read correction algorithms on reads from third-generation sequencing platforms, such as those built by Pacific Biosciences. As the name suggests, these reads are longer than those produced by next-generation sequencing technologies such as Illumina. However, long reads have high error rates, which can cause problems in downstream analysis. Long-read correction algorithms reduce the error rate either through self-correction or by using accurate short reads from next-generation sequencing to correct the long reads.</span></p><p>Address of the bookmark: <a href="https://github.com/cchauve/lrcstats" rel="nofollow">https://github.com/cchauve/lrcstats</a></p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</guid>
	<pubDate>Fri, 19 Oct 2018 07:25:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</link>
	<title><![CDATA[BASE: a practical de novo assembler for large genomes using long NGS reads]]></title>
	<description><![CDATA[<p><span>BASE is a new&nbsp;</span><em>de novo</em><span>&nbsp;assembler. It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have a high probability of appearing uniquely in the genome. From these seeds, BASE builds extension trees and then uses reverse validation to remove branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of the reads sharing the seeds. These consensus sequences are then extended into contigs.</span></p><p>Address of the bookmark: <a href="https://github.com/dhlbh/BASE" rel="nofollow">https://github.com/dhlbh/BASE</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44171/hairsplitter-assembling-long-reads-in-an-unknown-number-of-haplotypes</guid>
	<pubDate>Wed, 07 Dec 2022 00:13:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44171/hairsplitter-assembling-long-reads-in-an-unknown-number-of-haplotypes</link>
	<title><![CDATA[HairSplitter: assembling long reads in an unknown number of haplotypes]]></title>
	<description><![CDATA[<p>Pros and cons of HairSplitter</p>
<p>Limitations of HairSplitter:</p>
<p>Not very fast: it re-polishes the whole assembly&nbsp;</p>
<p>Limited in the number of haplotypes</p>
<p>Strengths of HairSplitter:</p>
<p>Very modular, can be used with any assembler</p>
<p>Naive: makes no assumption on ploidy, parameter-free</p>
<p>Safe: won&rsquo;t artificially duplicate contigs</p>
<p>&nbsp;</p>
<p>HairSplitter splits collapsed assemblies from &ldquo;draft&rdquo; assemblies obtained by any means</p>
<p>HairSplitter can recover haplotypes and distinguish repeated elements</p>
<p>Only needs sequencing reads, potentially error-prone</p>
<p>Not really available yet (github.com/RolandFaure/HairSplitter)</p>
<p>https://hal.archives-ouvertes.fr/hal-03864075/file/RolandFaure_presentation_SeqBIM_2022.pdf</p><p>Address of the bookmark: <a href="https://hal.archives-ouvertes.fr/hal-03817928/document" rel="nofollow">https://hal.archives-ouvertes.fr/hal-03817928/document</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</guid>
	<pubDate>Fri, 05 Jan 2018 03:58:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</link>
	<title><![CDATA[Jabba: Hybrid Error Correction for Long Sequencing Reads]]></title>
	<description><![CDATA[<p>Jabba is a hybrid error correction tool to correct third generation (PacBio / ONT) sequencing data, using second generation (Illumina) data.</p>
<p>Input</p>
<p>Jabba takes as input a concatenated de Bruijn graph and a set of sequences:</p>
<p>The de Bruijn graph should be in FASTA format with one entry per node; the meta information should be in the format:<br>&gt;NODE <br>The set of sequences should be in FASTA or FASTQ format. These sequences (e.g. PacBio reads) will be corrected. The corrections will be written to a file Jabba fasta.<br>The output is a FASTA file containing the corrected long reads and, additionally, a file in the input format containing the uncorrected reads.</p>
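<p>A minimal Python sketch of reading a FASTA file with one entry per node, as described above. The function name <code>read_fasta</code> and the example header text are assumptions for illustration and are not part of Jabba. Each header is kept as an opaque string, because the exact field layout of the meta information is not shown here.</p>
<pre>
```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header = None
    chunks = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new entry starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-node graph; real headers carry more meta information.
example = io.StringIO(">NODE first\nACGT\nAC\n>NODE second\nGGCA\n")
nodes = list(read_fasta(example))
```
</pre>
<p>Each element of <code>nodes</code> is one graph node, so <code>len(nodes)</code> gives the number of nodes in the file.</p>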
<p>https://github.com/biointec/jabba/wiki</p>
<p>https://almob.biomedcentral.com/articles/10.1186/s13015-016-0075-7</p><p>Address of the bookmark: <a href="https://github.com/biointec/jabba" rel="nofollow">https://github.com/biointec/jabba</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36921/breakpointer-using-local-mapping-artifacts-to-support-sequence-breakpoint-discovery-from-single-end-reads</guid>
	<pubDate>Tue, 12 Jun 2018 12:41:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36921/breakpointer-using-local-mapping-artifacts-to-support-sequence-breakpoint-discovery-from-single-end-reads</link>
	<title><![CDATA[Breakpointer: using local mapping artifacts to support sequence breakpoint discovery from single-end reads]]></title>
	<description><![CDATA[Breakpointer is a fast tool for locating sequence breakpoints from the alignment of single-end (SE) reads produced by next-generation sequencing (NGS). It uses a heuristic to search for local mapping signatures created by insertions/deletions (indels) or more complex structural variants (SVs).<p>Address of the bookmark: <a href="https://github.com/ruping/Breakpointer" rel="nofollow">https://github.com/ruping/Breakpointer</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</guid>
	<pubDate>Tue, 22 Nov 2016 04:51:39 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</link>
	<title><![CDATA[Maq: Mapping and Assembly with Quality]]></title>
	<description><![CDATA[<p><strong>Maq</strong>&nbsp;stands for&nbsp;<em>Mapping and Assembly with Quality</em>.&nbsp;It builds assemblies by mapping short reads to reference sequences. Maq is a project hosted by&nbsp;<a href="http://sourceforge.net/">SourceForge.net</a>. The project page is available at&nbsp;<a href="http://sourceforge.net/projects/maq/">http://sourceforge.net/projects/maq/</a>. Maq was previously known as mapass2.</p>
<h2>Run Maq Now</h2>
<p>Follow these steps to try Maq. All you need is a reference sequence file in the FASTA format.</p>
<ol>
<li>Prepare a reference sequence (ref.fasta). A bacterial genome is preferable.</li>
<li>Download maq, maq-data and maqview at the&nbsp;<a href="http://sourceforge.net/project/showfiles.php?group_id=191815">download page</a>.</li>
<li>Copy maq, maq.pl and maq_eval.pl to the $PATH or to the same directory.</li>
<li>Simulate diploid reference and read sequences, map reads, call variants and evaluate the results in one go:
<pre>maq.pl demo ref.fasta calib-30.dat
</pre>
where&nbsp;<em>calib-30.dat</em>&nbsp;is contained in maq-data.</li>
<li>View the alignment:
<pre>cd maqdemo/easyrun;
maqindex -i -c consensus.cns all.map;
maqview -c consensus.cns all.map</pre>
</li>
</ol>
<p><strong>Even for advanced maq users, running `maq.pl demo' is recommended. You may find something helpful.</strong></p><p>Address of the bookmark: <a href="http://maq.sourceforge.net" rel="nofollow">http://maq.sourceforge.net</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36739/blasr-mapping-single-molecule-sequencing-reads-using-basic-local-alignment-with-successive-refinement-blasr-theory-and-application</guid>
	<pubDate>Wed, 23 May 2018 06:54:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36739/blasr-mapping-single-molecule-sequencing-reads-using-basic-local-alignment-with-successive-refinement-blasr-theory-and-application</link>
	<title><![CDATA[BLASR: Mapping single molecule sequencing reads using Basic Local Alignment with Successive Refinement (BLASR): Theory and Application]]></title>
	<description><![CDATA[<p><span>BLASR (Basic Local Alignment with Successive Refinement) maps Single Molecule Sequencing (SMS) reads that are thousands to tens of thousands of bases long, where the divergence between the read and the genome is dominated by insertion and deletion errors.</span></p>
<p>Here is how I use blasr to align PacBio reads to the contigs (target.fasta). The &ldquo;target.fasta.sa&rdquo; file is the suffix array of &ldquo;target.fasta&rdquo;, generated by sawriter.</p>
<blockquote>
<p>blasr query.fa ./target.fasta -sa ./target.fasta.sa -bestn 40 -maxScore -500 -m 4 -nproc 24 -out target.m4 -maxLCPLength 15</p>
</blockquote>
<p>The output format option &ldquo;-m 4&rdquo; generates the alignment coordinates. It is not fully documented, but I can explain it to you.&nbsp;</p>
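<p>A minimal Python sketch of reading one line of that output. It assumes the 13 whitespace-separated columns commonly described for BLASR's m4 format (query name, target name, score, percent similarity, then strand, start, end and length for query and target, and a mapping quality value). The column names, the <code>M4Record</code> type and the sample line are assumptions for illustration, so check the layout against the output of your own BLASR version.</p>
<pre>
```python
from typing import NamedTuple


# Assumed m4 column order; verify against your BLASR version.
class M4Record(NamedTuple):
    q_name: str
    t_name: str
    score: int
    percent_similarity: float
    q_strand: int
    q_start: int
    q_end: int
    q_length: int
    t_strand: int
    t_start: int
    t_end: int
    t_length: int
    map_qv: int


def parse_m4_line(line: str) -> M4Record:
    """Split one whitespace-separated m4 line into typed fields."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    return M4Record(
        fields[0], fields[1], int(fields[2]), float(fields[3]),
        *(int(value) for value in fields[4:]),
    )


# Hypothetical alignment line, not real BLASR output.
sample = "read1 contig7 -2500 87.5 0 10 510 600 0 1000 1495 3500 254"
record = parse_m4_line(sample)
aligned_target_span = record.t_end - record.t_start
```
</pre>
<p>With the fields typed this way, alignments can be filtered by score or by the length of the aligned span before further processing.</p>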
<p>I use a 24-core server with 48 GB of RAM for the alignment. It took about 2 to 3 hours to align 3G of PacBio reads to 10^6 short-read contigs with a mean length of 3.5 kbp.</p><p>Address of the bookmark: <a href="http://bix.ucsd.edu/projects/blasr/" rel="nofollow">http://bix.ucsd.edu/projects/blasr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42974/list-of-bioinformatics-packages-for-ngs-analysis</guid>
	<pubDate>Sat, 20 Mar 2021 00:28:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42974/list-of-bioinformatics-packages-for-ngs-analysis</link>
	<title><![CDATA[List of bioinformatics packages for NGS analysis !]]></title>
	<description><![CDATA[<p>Package suites gather software packages and installation tools for specific languages or platforms. We have some for bioinformatics software.</p><ul>
<li><a href="https://github.com/Bioconductor">Bioconductor</a>&nbsp;&ndash; A plethora of tools for analysis and comprehension of high-throughput genomic data, including 1500+ software packages. [&nbsp;<a href="https://link.springer.com/article/10.1186/gb-2004-5-10-r80">paper-2004</a>&nbsp;|&nbsp;<a href="https://www.bioconductor.org/">web</a>&nbsp;]</li>
<li><a href="https://github.com/biopython/biopython">Biopython</a>&nbsp;&ndash; Freely available tools for biological computing in Python, with included cookbook, packaging and thorough documentation. Part of the&nbsp;<a href="http://open-bio.org/">Open Bioinformatics Foundation</a>. Contains the very useful&nbsp;<a href="https://biopython.org/DIST/docs/api/Bio.Entrez-module.html">Entrez</a>&nbsp;package for API access to the NCBI databases. [&nbsp;<a href="https://pubmed.ncbi.nlm.nih.gov/19304878">paper-2009</a>&nbsp;|&nbsp;<a href="https://biopython.org/">web</a>&nbsp;]</li>
<li><a href="https://github.com/bioconda">Bioconda</a>&nbsp;&ndash; A channel for the&nbsp;<a href="http://conda.pydata.org/docs/intro.html">conda package manager</a>&nbsp;specializing in bioinformatics software. Includes a repository with 3000+ ready-to-install (with&nbsp;<code>conda install</code>) bioinformatics packages. [&nbsp;<a href="https://pubmed.ncbi.nlm.nih.gov/29967506">paper-2018</a>&nbsp;|&nbsp;<a href="https://bioconda.github.io/">web</a>&nbsp;]</li>
<li><a href="https://github.com/BioJulia">BioJulia</a>&nbsp;&ndash; Bioinformatics and computational biology infrastructure for the Julia programming language. [&nbsp;<a href="https://biojulia.net/">web</a>&nbsp;]</li>
<li><a href="https://github.com/rust-bio/rust-bio">Rust-Bio</a>&nbsp;&ndash; Rust implementations of algorithms and data structures useful for bioinformatics. [&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/early/2015/10/06/bioinformatics.btv573.short?rss=1">paper-2016</a>&nbsp;]</li>
<li><a href="https://github.com/seqan/seqan3">SeqAn</a>&nbsp;&ndash; The modern C++ library for sequence analysis.</li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34528/cope-an-accurate-k-mer-based-pair-end-reads-connection-tool-to-facilitate-genome-assembly</guid>
	<pubDate>Wed, 06 Dec 2017 02:08:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34528/cope-an-accurate-k-mer-based-pair-end-reads-connection-tool-to-facilitate-genome-assembly</link>
	<title><![CDATA[COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly]]></title>
	<description><![CDATA[<p><span>COPE (Connecting Overlapped Pair-End reads) is an efficient tool for connecting overlapping pair-end reads using k-mer frequencies. We evaluated our tool on 30&times; simulated pair-end reads from Arabidopsis thaliana with 1% base error. COPE connected over 99% of reads with 98.8% accuracy, which are, respectively, 10% and 2% higher than the recently published tool FLASH. When COPE is applied to real reads for genome assembly, the resulting contigs have fewer errors and give a 14-fold improvement in N50 compared with the contigs produced from unconnected reads.</span></p><p>Address of the bookmark: <a href="ftp://ftp.genomics.org.cn/pub/cope" rel="nofollow">ftp://ftp.genomics.org.cn/pub/cope</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36533/mecat-fast-mapping-error-correction-and-de-novo-assembly-for-single-molecule-sequencing-reads</guid>
	<pubDate>Fri, 11 May 2018 05:07:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36533/mecat-fast-mapping-error-correction-and-de-novo-assembly-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads]]></title>
	<description><![CDATA[<p>MECAT is an ultra-fast Mapping, Error Correction and de novo Assembly Tool for single-molecule sequencing (SMRT) reads. MECAT employs novel alignment and error correction algorithms that are much more efficient than state-of-the-art aligners and error correction tools. MECAT can be used to effectively assemble large genomes de novo. For example, on a 32-thread computer with a 2.0 GHz CPU, MECAT takes 9.5 days to assemble a human genome from 54x SMRT data, which is 40 times faster than the current&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>. MECAT's performance was compared with the&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>,&nbsp;<a href="https://github.com/PacificBiosciences/falcon">FALCON</a>&nbsp;and&nbsp;<a href="http://canu.readthedocs.io/en/latest/">Canu(v1.3)</a>&nbsp;on five real datasets. The quality of the assembled contigs produced by MECAT is the same as or better than that of the&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>&nbsp;and&nbsp;<a href="https://github.com/PacificBiosciences/falcon">FALCON</a>.&nbsp;</p>
<p>https://www.nature.com/articles/nmeth.4432</p><p>Address of the bookmark: <a href="https://github.com/xiaochuanle/MECAT" rel="nofollow">https://github.com/xiaochuanle/MECAT</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>