<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/39213?offset=60</link>
	<atom:link href="https://bioinformaticsonline.com/related/39213?offset=60" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43254/quasr-quantification-and-annotation-of-short-reads-in-r</guid>
	<pubDate>Fri, 13 Aug 2021 07:44:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43254/quasr-quantification-and-annotation-of-short-reads-in-r</link>
	<title><![CDATA[QuasR: Quantification and annotation of short reads in R]]></title>
	<description><![CDATA[<p>The <em><a href="https://bioconductor.org/packages/3.14/QuasR">QuasR</a></em> package (short for <em>Qu</em>antify and <em>a</em>nnotate <em>s</em>hort reads in <em>R</em>) integrates the functionality of several <strong>R</strong> packages (such as <em><a href="https://bioconductor.org/packages/3.14/IRanges">IRanges</a></em> <span>(Lawrence et al. 2013)</span> and <em><a href="https://bioconductor.org/packages/3.14/Rsamtools">Rsamtools</a></em>) and external software (e.g.&nbsp;<code>bowtie</code>, through the <em><a href="https://bioconductor.org/packages/3.14/Rbowtie">Rbowtie</a></em> package, and <code>HISAT2</code>, through the <em><a href="https://bioconductor.org/packages/3.14/Rhisat2">Rhisat2</a></em> package). The package aims to cover the whole analysis workflow of typical high throughput sequencing experiments, starting from the raw sequence reads, over pre-processing and alignment, up to quantification. A single <strong>R</strong> script can contain all steps of a complete analysis, making it simple to document, reproduce or share the workflow containing all relevant details.</p><p>Address of the bookmark: <a href="https://www.bioconductor.org/packages/devel/bioc/vignettes/QuasR/inst/doc/QuasR.html" rel="nofollow">https://www.bioconductor.org/packages/devel/bioc/vignettes/QuasR/inst/doc/QuasR.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36833/bfc-a-standalone-high-performance-tool-for-correcting-sequencing-errors-from-illumina-sequencing-data</guid>
	<pubDate>Thu, 31 May 2018 09:35:23 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36833/bfc-a-standalone-high-performance-tool-for-correcting-sequencing-errors-from-illumina-sequencing-data</link>
	<title><![CDATA[BFC: a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data]]></title>
	<description><![CDATA[BFC is a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data. It is specifically designed for high-coverage whole-genome human data, though it also performs well for small genomes.

The BFC algorithm is a variant of the classical spectrum alignment algorithm introduced by Pevzner et al. (2001). It uses an exhaustive search to find a k-mer path through a read that minimizes a heuristic objective function jointly considering penalties on correction, quality and k-mer support. This algorithm was first implemented in the fermi assembler and then refined several times in fermi, fermi2 and now BFC. In the k-mer counting phase, BFC uses a blocked Bloom filter to filter out most singleton k-mers and keeps the rest in a hash table (Melsted and Pritchard, 2011). BFC takes its name from this use of a Bloom filter, although other correctors such as Lighter and Bless rely on Bloom filters more heavily than BFC does.
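
The k-mer counting idea described above can be illustrated with a minimal Python sketch. This is not BFC's code: it uses a plain (not blocked) Bloom filter, stores one byte per bit for readability, and the names BloomFilter and count_non_singletons, the filter size and the number of hashes are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain (not blocked) Bloom filter; one byte per bit for readability."""

    def __init__(self, size=10007, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = bytearray(size)

    def _positions(self, item):
        # Derive num_hashes positions by salting a single hash function
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def count_non_singletons(reads, k):
    """Count k-mers, storing only those seen at least twice in the hash table."""
    seen_once = BloomFilter()
    counts = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in counts:
                counts[kmer] += 1
            elif kmer in seen_once:
                # Second sighting (or a rare false positive): promote to the table
                counts[kmer] = 2
            else:
                seen_once.add(kmer)
    return counts


reads = ["ACGTACGT", "ACGTTTGA"]
counts = count_non_singletons(reads, k=4)
print(counts)
```

Most k-mers that occur only once never reach the hash table, which is where the memory saving comes from. A Bloom filter can report false positives, so an occasional singleton may still be promoted with an overestimated count.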

https://github.com/lh3/bfc<p>Address of the bookmark: <a href="https://github.com/lh3/bfc" rel="nofollow">https://github.com/lh3/bfc</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33461/graphmap-a-highly-sensitive-and-accurate-mapper-for-long-error-prone-reads</guid>
	<pubDate>Wed, 07 Jun 2017 04:18:16 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33461/graphmap-a-highly-sensitive-and-accurate-mapper-for-long-error-prone-reads</link>
	<title><![CDATA[GraphMap - A highly sensitive and accurate mapper for long, error-prone reads]]></title>
	<description><![CDATA[<p>GraphMap - A highly sensitive and accurate mapper for long, error-prone reads http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html</p>
<p><strong>Features</strong></p>
<ul>
<li>Mapping position agnostic to alignment parameters.</li>
<li>Consistently very high sensitivity and precision across different error profiles, rates and sequencing technologies, even with default parameters.</li>
<li>Circular genome handling to resolve coverage drops near the ends of the genome.</li>
<li>E-value.</li>
<li>Meaningful mapping quality.</li>
<li>Various alignment strategies (semiglobal bit-vector and Gotoh, anchored).</li>
<li>Overlapping of reads for de novo assembly.</li>
<li>Transcriptome mapping through internal construction of a transcriptome from a given genomic reference and a GTF file.</li>
<li>...and much more.</li>
</ul>
<p>GraphMap is also used as an overlapper in a new de novo genome assembly project called Ra (https://github.com/mariokostelac/ra-integrate). Ra attempts to create de novo assemblies from raw nanopore and PacBio reads without requiring error correction, for which a highly sensitive overlapper is required.</p>
<p>Development of a new spliced-alignment mode for mapping RNA-seq reads is currently under way. A description of the current effort, and of how to reach the experimental implementation, can be found in doc/rnaseq.md.</p><p>Address of the bookmark: <a href="https://github.com/isovic/graphmap" rel="nofollow">https://github.com/isovic/graphmap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36808/whatshap-fast-and-accurate-read-based-phasing</guid>
	<pubDate>Mon, 28 May 2018 09:52:16 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36808/whatshap-fast-and-accurate-read-based-phasing</link>
	<title><![CDATA[WhatsHap: fast and accurate read-based phasing]]></title>
	<description><![CDATA[<p>WhatsHap is software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.</p>
<h1>Features</h1>
<blockquote>
<div>
<ul>
<li>Very accurate results (Martin et al.,&nbsp;<a href="https://doi.org/10.1101/085050">WhatsHap: fast and accurate read-based phasing</a>)</li>
<li>Works well with Illumina, PacBio, Oxford Nanopore and other types of reads</li>
<li>It phases SNVs, indels and even &ldquo;complex&rdquo; variants (such as&nbsp;<code><span>TCG</span></code>&nbsp;&rarr;&nbsp;<code><span>AGAA</span></code>)</li>
<li>Pedigree phasing mode uses reads from related individuals (such as trios) to improve results and to reduce coverage requirements (Garg et al.,&nbsp;<a href="https://doi.org/10.1093/bioinformatics/btw276">Read-Based Phasing of Related Individuals</a>).</li>
<li>WhatsHap is&nbsp;<a href="https://whatshap.readthedocs.io/en/latest/installation.html#installation">easy to install</a></li>
<li>It is&nbsp;<a href="https://whatshap.readthedocs.io/en/latest/guide.html#user-guide">easy to use</a>: Pass in a VCF and one or more BAM files, get out a phased VCF. Supports multi-sample VCFs.</li>
<li>It produces standard-compliant VCF output by default</li>
<li>If desired, get output that is compatible with ReadBackedPhasing</li>
<li>Open Source (MIT license)</li>
</ul>
</div>
</blockquote><p>Address of the bookmark: <a href="https://whatshap.readthedocs.io/en/latest/" rel="nofollow">https://whatshap.readthedocs.io/en/latest/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41033/clark-fast-accurate-and-versatile-sequence-classification-system</guid>
	<pubDate>Sat, 15 Feb 2020 01:49:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41033/clark-fast-accurate-and-versatile-sequence-classification-system</link>
	<title><![CDATA[CLARK: Fast, accurate and versatile sequence classification system]]></title>
	<description><![CDATA[<p><a href="http://dx.doi.org/10.1186/s12864-015-1419-2"><strong>CLARK</strong></a><span>&nbsp;is a method for supervised sequence classification based on discriminative&nbsp;</span><em>k</em><span>-mers. On two distinct classification problems (see the article for details), namely (1) the taxonomic classification of metagenomic reads to known bacterial genomes, and (2) the assignment of BAC clones and transcripts to chromosome arms/centromeres (in the absence of a finished assembly for the reference genome), CLARK is reported to outperform the best state-of-the-art methods in classification speed and precision.</span></p>
<p><span><a href="http://clark.cs.ucr.edu/Spaced/">http://clark.cs.ucr.edu/Spaced/</a></span></p><p>Address of the bookmark: <a href="http://clark.cs.ucr.edu/Spaced/" rel="nofollow">http://clark.cs.ucr.edu/Spaced/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44641/heliano-a-fast-and-accurate-tool-for-detection-of-helitron-like-elements</guid>
	<pubDate>Tue, 13 Aug 2024 07:16:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44641/heliano-a-fast-and-accurate-tool-for-detection-of-helitron-like-elements</link>
	<title><![CDATA[HELIANO: A fast and accurate tool for detection of Helitron-like elements]]></title>
	<description><![CDATA[<p><span>Helitron-like elements (HLE1 and HLE2) are DNA transposons. They have been found in diverse species and seem to play significant roles in the evolution of host genomes. Although known for over twenty years, Helitron sequences are still challenging to identify. Here, we propose HELIANO (Helitron-like elements annotator) as an efficient solution for detecting Helitron-like elements.</span></p>
<p>https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae679/7730539?login=true</p><p>Address of the bookmark: <a href="https://github.com/Zhenlisme/heliano/" rel="nofollow">https://github.com/Zhenlisme/heliano/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29142/opera-optimal-paired-end-read-assembler</guid>
	<pubDate>Fri, 09 Sep 2016 05:28:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29142/opera-optimal-paired-end-read-assembler</link>
	<title><![CDATA[OPERA : Optimal Paired-End Read Assembler]]></title>
	<description><![CDATA[<p>OPERA (Optimal Paired-End Read Assembler) is a sequence assembly program (<a href="http://en.wikipedia.org/wiki/Sequence_assembly">http://en.wikipedia.org/wiki/Sequence_assembly</a>). It uses information from paired-end/mate-pair/long reads to order and orient the intermediate contigs/scaffolds assembled in a genome assembly project, in a process known as Scaffolding. OPERA is based on an exact algorithm that is guaranteed to minimize the discordance of scaffolds with the information provided by the paired-end/mate-pair/long reads (for further details see Gao et al, 2011).</p>
<p>Note that since the original publication, we have made significant changes to OPERA (v1.0 onwards) including refinements to its basic algorithm (to reduce local errors, improve efficiency etc.) and incorporated features that are important for scaffolding large genomes (multi-library support, better repeat-handling etc.), in addition to other scalability and usability improvements (bam and gzip support, smaller memory footprint). We therefore encourage you to download and use our latest version: OPERA-LG. In our benchmarks, it has significantly improved corrected N50 and reduced the number of scaffolding errors. Furthermore, our latest release contains the wrapper script OPERA-long-read that enables scaffolding with long-reads from third-generation sequencing technologies (PacBio or Oxford Nanopore). The manuscript describing the new features and algorithms is available at&nbsp;<a href="https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0951-y">Genome Biology</a>. We look forward to getting your feedback to improve it further.</p><p>Address of the bookmark: <a href="https://sourceforge.net/p/operasf/wiki/The%20OPERA%20wiki/" rel="nofollow">https://sourceforge.net/p/operasf/wiki/The%20OPERA%20wiki/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34702/run-miniasm-assembler-on-nanopore-reads</guid>
	<pubDate>Mon, 18 Dec 2017 04:07:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34702/run-miniasm-assembler-on-nanopore-reads</link>
	<title><![CDATA[Run the miniasm assembler on nanopore reads!]]></title>
	<description><![CDATA[<p>Miniasm is a very fast OLC-based&nbsp;<em>de novo</em>&nbsp;assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by&nbsp;<a href="https://github.com/lh3/minimap">minimap</a>) as input and outputs an assembly graph in the&nbsp;<a href="https://github.com/pmelsted/GFA-spec/blob/master/GFA-spec.md">GFA</a>&nbsp;format. Unlike mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final&nbsp;<a href="http://wgs-assembler.sourceforge.net/wiki/index.php/Celera_Assembler_Terminology">unitig</a>&nbsp;sequences. Thus the per-base error rate is similar to that of the raw input reads.</p><p>Find the repeats within the reads:</p><blockquote><p>fq2fa ONT_A.fastq ONT_A.fasta&nbsp;<br /><br />minimap2 -xava-ont ONT_A.fasta ONT_A.fasta -t10 -X &gt; AONT.paf&nbsp;<br /><br />awk '{if($1==$6){print}}' AONT.paf &gt; AONTself.paf&nbsp;<br /><br />awk '$5=="-"' AONTself.paf | awk '{print $1}'| sort|uniq &gt; invertedrepeat.list</p></blockquote><p>Generate a few palindrome and repeat plots (highlighting only repeats larger than 10, 20 and 30 kb):</p><blockquote><p>minidot -f 5 -m 30000 AONTself.paf &gt; AONTself30000.eps&nbsp;<br />sed 's/_template_pass_FAH31515//' AONTself30000.eps &gt; AONTself30000final.eps&nbsp;<br /><br />minidot -f 5 -m 20000 AONTself.paf &gt; AONTself20000.eps&nbsp;<br />sed 's/_template_pass_FAH31515//' AONTself20000.eps &gt; AONTself20000final.eps&nbsp;<br /><br />minidot -f 5 -m 10000 AONTself.paf &gt; AONTself10000.eps&nbsp;<br />sed 's/_template_pass_FAH31515//' AONTself10000.eps &gt; AONTself10000final.eps&nbsp;</p></blockquote><p>Assemble with miniasm:</p><blockquote><p>miniasm -f ONT_A.fasta AONT.paf &gt; AONT.gfa&nbsp;</p><p>grep '^S' AONT.gfa |awk '{print "&gt;"$2"\n"$3}' &gt; AONT_miniasm.fasta&nbsp;<br /><br />minimap2 -xasm10 AONT_miniasm.fasta AONT_miniasm.fasta -t1 -X &gt; AONT_miniasm.paf&nbsp;<br /><br />awk '{if($1==$6){print}}' AONT_miniasm.paf &gt; AONT_miniasm_self.paf&nbsp;<br /><br />minidot -f 5 -m 10000 AONT_miniasm_self.paf &gt; AONT_miniasm_self10000.eps&nbsp;</p></blockquote><p>Enjoy the assembly!</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</guid>
	<pubDate>Tue, 14 Jan 2020 06:47:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</link>
	<title><![CDATA[Shasta long read assembler]]></title>
	<description><![CDATA[<p>The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by&nbsp;<a href="https://nanoporetech.com/">Oxford Nanopore</a>&nbsp;flow cells.</p>
<p>Computational methods used by the Shasta assembler include:</p>
<ul>
<li>Using a&nbsp;<a href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length</a>&nbsp;representation of the read sequence. This makes the assembly process more resilient to errors in homopolymer repeat counts, which are the most common type of errors in Oxford Nanopore reads.</li>
<li>In some phases of the computation, using a representation of the read sequence based on&nbsp;<em>markers</em>, a fixed subset of short k-mers (k &asymp; 10).</li>
</ul>
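<p>The run-length idea in the first point can be illustrated with a minimal Python sketch. It shows the general technique only and is not taken from Shasta's code; the function name run_length_encode and the example reads are assumptions made for illustration.</p>

```python
from itertools import groupby


def run_length_encode(sequence):
    """Collapse each homopolymer run into one base plus a repeat count."""
    bases = []
    counts = []
    for base, run in groupby(sequence):
        bases.append(base)
        counts.append(len(list(run)))
    return "".join(bases), counts


# Two reads that differ only in the length of one homopolymer run
read_a = "ACCCGTT"
read_b = "ACCCCGTT"

bases_a, counts_a = run_length_encode(read_a)
bases_b, counts_b = run_length_encode(read_b)

print(bases_a, counts_a)
print(bases_b, counts_b)
print(bases_a == bases_b)
```

<p>Both reads collapse to the same base string, so a miscounted homopolymer changes only the repeat counts and not the sequence that is compared.</p>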
<p>More at&nbsp;<a href="https://chanzuckerberg.github.io/shasta/index.html">https://chanzuckerberg.github.io/shasta/index.html</a></p><p>Address of the bookmark: <a href="https://github.com/chanzuckerberg/shasta" rel="nofollow">https://github.com/chanzuckerberg/shasta</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/2726/comparison-of-short-read-de-novo-alignment-algorithms</guid>
	<pubDate>Wed, 21 Aug 2013 07:56:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/2726/comparison-of-short-read-de-novo-alignment-algorithms</link>
	<title><![CDATA[Comparison of Short Read De Novo Alignment Algorithms]]></title>
	<description><![CDATA[<p>An excellent article introducing different sequencing methods, along with tools for de novo assembly of sequencing reads and their relevant references.</p>
<p>Title:&nbsp;<strong>Comparison of Short Read De Novo Alignment Algorithms</strong></p>
<p>Author:&nbsp;<strong>Nikhil Gopal</strong></p><p>Address of the bookmark: <a href="http://biochem218.stanford.edu/Projects%202011/Gopal%202011.pdf" rel="nofollow">http://biochem218.stanford.edu/Projects%202011/Gopal%202011.pdf</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>