<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/41599?offset=120</link>
	<atom:link href="https://bioinformaticsonline.com/related/41599?offset=120" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/37396/converting-a-vcf-into-a-fasta-given-some-reference</guid>
	<pubDate>Fri, 20 Jul 2018 10:03:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/37396/converting-a-vcf-into-a-fasta-given-some-reference</link>
	<title><![CDATA[Converting a VCF into a FASTA given some reference !]]></title>
	<description><![CDATA[<p>Samtools/BCFtools (Heng Li) provides a Perl script,&nbsp;<a href="https://github.com/lh3/samtools/blob/master/bcftools/vcfutils.pl"><code>vcfutils.pl</code></a>, whose&nbsp;<code>vcf2fq</code>&nbsp;function (lines 469-528) does this. Note that it writes the consensus as FASTQ rather than FASTA.</p><p>Others have modified this script to convert InDels as well, e.g.&nbsp;<a href="https://github.com/gringer/bioinfscripts/blob/master/vcf2fq.pl">this version</a>&nbsp;by David Eccles:</p><pre><code>./vcf2fq.pl -f &lt;input.fasta&gt; &lt;all-site.vcf&gt; &gt; &lt;output.fastq&gt;</code></pre>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38210/skesa-strategic-k-mer-extension-for-scrupulous-assemblies</guid>
	<pubDate>Wed, 14 Nov 2018 04:45:41 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38210/skesa-strategic-k-mer-extension-for-scrupulous-assemblies</link>
	<title><![CDATA[SKESA: strategic k-mer extension for scrupulous assemblies]]></title>
	<description><![CDATA[<p>SKESA is a de Bruijn graph-based de novo assembler designed for assembling reads of microbial genomes sequenced using Illumina. Comparison with SPAdes and MegaHit shows that SKESA produces assemblies that have high sequence quality and contiguity, handles low-level contamination in reads, is fast, and produces an identical assembly for the same input when assembled multiple times with the same or different compute resources.</p>
<p>Source code for SKESA is freely available at&nbsp;<a href="https://github.com/ncbi/SKESA/releases">https://github.com/ncbi/SKESA/releases</a>.</p>
<p>Research Paper&nbsp;@ <a href="https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1540-z">Link</a></p>
<p>The SKESA algorithm is as follows:</p>
<p><img src="https://media.springernature.com/lw785/springer-static/image/art%3A10.1186%2Fs13059-018-1540-z/MediaObjects/13059_2018_1540_Fig4_HTML.png" alt="SKESA algorithm overview" width="785" height="984" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://github.com/ncbi/SKESA/releases" rel="nofollow">https://github.com/ncbi/SKESA/releases</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38526/versatile-genome-assembly-evaluation-with-quast-lg</guid>
	<pubDate>Fri, 21 Dec 2018 22:06:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38526/versatile-genome-assembly-evaluation-with-quast-lg</link>
	<title><![CDATA[Versatile genome assembly evaluation with QUAST-LG]]></title>
	<description><![CDATA[<p>QUAST-LG is an extension of&nbsp;<a href="http://cab.spbu.ru/software/quast/">QUAST</a>&nbsp;intended for evaluating large-scale genome assemblies (up to mammalian-size).</p>
<p>QUAST-LG&nbsp;is included in the QUAST package starting from version 5.0.0 (<a href="https://sourceforge.net/projects/quast/files/latest/download?source=files">download the latest release</a>). Run QUAST as usual, and do not forget to add the&nbsp;<code>--large</code>&nbsp;option to your command!</p>
<p>A short list of the new features (see&nbsp;<a href="http://cab.spbu.ru/files/quast/latest-docs/CHANGES.txt">CHANGES</a>&nbsp;for all):</p>
<ul>
<li>Significant speedup from both the use of a new fast aligner (<a href="https://github.com/lh3/minimap2">minimap2</a>) and the refactoring of the alignment analysis modules</li>
<li>New k-mer-based completeness and correctness metrics</li>
<li>BUSCO added for enhanced reference-free analysis</li>
<li>The concept of upper bound&nbsp;assembly (theoretical limits on the assembly&nbsp;completeness and&nbsp;contiguity for a given genome and set of reads)</li>
</ul><p>Address of the bookmark: <a href="http://cab.spbu.ru/software/quast-lg/" rel="nofollow">http://cab.spbu.ru/software/quast-lg/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly</guid>
	<pubDate>Tue, 22 Jan 2019 09:39:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly</link>
	<title><![CDATA[List of tools frequently used while genome assembly]]></title>
	<description><![CDATA[<h4>List of tools frequently used during genome assembly:</h4><p>I have used the following assemblers</p><ul>
<li><a href="http://bioinf.spbau.ru/spades">SPAdes</a>&nbsp;(v. 3.10.1)</li>
<li><a href="http://canu.readthedocs.io/en/stable/index.html">Canu</a>&nbsp;(v. 1.6)</li>
<li><a href="https://github.com/rrwick/Unicycler">Unicycler</a>&nbsp;(v. 0.4.1)</li>
<li><a href="https://github.com/lh3/miniasm">Miniasm</a>&nbsp;(v. 0.2-r137-dirty)</li>
</ul><p>I have used the following mappers</p><ul>
<li><a href="https://github.com/lh3/minimap2">minimap2</a>&nbsp;(v.&nbsp;2.0rc1-r232)</li>
<li><a href="https://github.com/lh3/minimap">minimap</a>&nbsp;(v. 0.2-r124-dirty)</li>
<li><a href="https://github.com/lh3/bwa">bwa</a>&nbsp;(v.&nbsp;0.7.12-r1039)</li>
</ul><p>I have used the following polishing tools</p><ul>
<li><a href="https://github.com/isovic/racon">Racon</a>&nbsp;(v. not available)</li>
<li><a href="https://github.com/broadinstitute/pilon">Pilon</a>&nbsp;(v. 1.18)</li>
<li><a href="https://github.com/jts/nanopolish">Nanopolish</a>&nbsp;(v. 0.8.3)</li>
</ul><p>I have used the following tools to assess genome assembly characteristics</p><ul>
<li><a href="https://github.com/chjp/ANI">ANI.pl</a>&nbsp;(https://github.com/chjp/ANI)</li>
<li><a href="http://ecogenomics.github.io/CheckM/">CheckM</a>&nbsp;(v. 1.0.7)</li>
<li><a href="https://github.com/tseemann/prokka">Prokka</a>&nbsp;(v. 1.12)</li>
<li><a href="http://bioinf.spbau.ru/en/quast">QUAST</a>&nbsp;(v. 2.3)</li>
<li><a href="http://mummer.sourceforge.net/">MUMmer</a>&nbsp;(v. not available)</li>
</ul><p>If you have any ideas, or know of superior tools we have missed, please let us know in the comments.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38892/wtdbg2-a-fuzzy-bruijn-graph-approach-to-long-noisy-reads-assembly</guid>
	<pubDate>Mon, 04 Feb 2019 04:53:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38892/wtdbg2-a-fuzzy-bruijn-graph-approach-to-long-noisy-reads-assembly</link>
	<title><![CDATA[wtdbg2: A fuzzy Bruijn graph approach to long noisy reads assembly]]></title>
	<description><![CDATA[<p>Wtdbg2 is a&nbsp;<em>de novo</em>&nbsp;sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). It assembles raw reads without error correction and then builds the consensus from the intermediate assembly output.</p>
<pre>./wtdbg2 -x rs -g 4.6m -t 16 -i reads.fa.gz -fo prefix
./wtpoa-cns -t 16 -i prefix.ctg.lay.gz -fo prefix.ctg.fa</pre><p>Address of the bookmark: <a href="https://github.com/ruanjue/wtdbg2" rel="nofollow">https://github.com/ruanjue/wtdbg2</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41207/blobtoolkit-a-toolkit-for-genome-assembly-qc</guid>
	<pubDate>Fri, 21 Feb 2020 00:17:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41207/blobtoolkit-a-toolkit-for-genome-assembly-qc</link>
	<title><![CDATA[BlobToolKit: A toolkit for genome assembly QC]]></title>
	<description><![CDATA[<p>Filtering raw genomic datasets is essential to avoid chimeric assemblies and to increase the validity of sequence-based biological inference. BlobToolKit extends the BlobTools/Blobology&nbsp;approach to simplify interactive and reproducible filtering.</p>
<p>BlobToolKit comprises four components:</p>
<ol>
<li><a href="https://blobtoolkit.genomehubs.org/btk-viewer/">BlobToolKit Viewer</a>&nbsp;allows browser-based interactive visualisation and filtering of preliminary or published genomic datasets even for highly fragmented assemblies.</li>
<li><a href="https://blobtoolkit.genomehubs.org/blobtools2/">BlobTools2</a>&nbsp;is a command-line program to convert assemblies and analysis results into datasets that can be further processed using&nbsp;<a href="https://blobtoolkit.genomehubs.org/blobtools2/">BlobTools2</a>&nbsp;and/or visualised in the Viewer.</li>
<li>The&nbsp;<a href="https://blobtoolkit.genomehubs.org/specification/">BlobToolKit Specification</a>&nbsp;features a formal schema and validator for the JSON-based BlobDir format used by&nbsp;<a href="https://blobtoolkit.genomehubs.org/blobtools2/">BlobTools2</a>&nbsp;and the&nbsp;<a href="https://blobtoolkit.genomehubs.org/btk-viewer/">Viewer</a>.</li>
<li>The&nbsp;<a href="https://blobtoolkit.genomehubs.org/pipeline/">BlobToolKit Pipeline</a>&nbsp;is a configurable Snakemake pipeline that automates all steps from retrieving public datasets through running analyses and generating a BlobDir dataset with&nbsp;<a href="https://blobtoolkit.genomehubs.org/blobtools2/">BlobTools2</a>, ready for visualisation in the&nbsp;<a href="https://blobtoolkit.genomehubs.org/btk-viewer/">Viewer</a>.</li>
</ol>
<p>Paper&nbsp;<a href="https://www.biorxiv.org/content/10.1101/844852v1.full.pdf">https://www.biorxiv.org/content/10.1101/844852v1.full.pdf</a></p><p>Address of the bookmark: <a href="https://blobtoolkit.genomehubs.org/" rel="nofollow">https://blobtoolkit.genomehubs.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41501/hicanu-accurate-assembly-of-segmental-duplications-satellites-and-allelic-variants-from-high-fidelity-long-reads</guid>
	<pubDate>Fri, 27 Mar 2020 22:49:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41501/hicanu-accurate-assembly-of-segmental-duplications-satellites-and-allelic-variants-from-high-fidelity-long-reads</link>
	<title><![CDATA[HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads]]></title>
	<description><![CDATA[<p>HiCanu is a significant modification of the Canu assembler designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering.</p>
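<p>As a rough illustration of the homopolymer compression mentioned above, the following Python sketch collapses every run of identical bases to a single base. It is a toy example under simple assumptions (plain strings, no quality values); the function name <code>homopolymer_compress</code> is hypothetical, and this is not HiCanu's actual implementation.</p>
<pre><code>from itertools import groupby


def homopolymer_compress(seq):
    """Collapse each run of identical bases to one base, e.g. AAACCG to ACG."""
    # groupby yields one (base, run) pair per maximal run of equal characters.
    return "".join(base for base, _run in groupby(seq))


if __name__ == "__main__":
    # Two reads that differ only in homopolymer run lengths compress identically.
    print(homopolymer_compress("GATTTACCA"))
    print(homopolymer_compress("GATTACCCA"))
</code></pre>
<p>Both toy reads compress to the same string, which hints at why comparing reads in compressed space can hide errors that only change the length of a homopolymer run.</p>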
<p>More at&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2020.03.14.992248v3">https://www.biorxiv.org/content/10.1101/2020.03.14.992248v3</a></p><p>Address of the bookmark: <a href="https://github.com/marbl/canu" rel="nofollow">https://github.com/marbl/canu</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43090/loretta-a-user-friendly-tool-for-assembling-viral-genomes-from-pacbio-sequence-data</guid>
	<pubDate>Wed, 23 Jun 2021 07:54:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43090/loretta-a-user-friendly-tool-for-assembling-viral-genomes-from-pacbio-sequence-data</link>
	<title><![CDATA[LoReTTA, a user-friendly tool for assembling viral genomes from PacBio sequence data]]></title>
	<description><![CDATA[<p>LoReTTA (Long Read Template-Targeted Assembler) is a tool designed for performing <em>de novo</em> assembly of long reads generated from viral genomes on the PacBio platform. LoReTTA exploits a reference genome to guide the assembly process, an approach that has been successful with short reads.</p>
<p>https://academic.oup.com/ve/article/7/1/veab042/6248116</p><p>Address of the bookmark: <a href="https://academic.oup.com/ve/article/7/1/veab042/6248116" rel="nofollow">https://academic.oup.com/ve/article/7/1/veab042/6248116</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/13267/the-genome-10k-project</guid>
	<pubDate>Tue, 29 Jul 2014 09:11:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/13267/the-genome-10k-project</link>
	<title><![CDATA[The Genome 10K Project]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/B57xDIGtCT0" frameborder="0" allowfullscreen></iframe>https://genome10k.soe.ucsc.edu

The Genome 10K project aims to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species, approximately one for every vertebrate genus. The trajectory of cost reduction in DNA sequencing suggests that this project will be feasible within a few years. Capturing the genetic diversity of vertebrate species would create an unprecedented resource for the life sciences and for worldwide conservation efforts.

The growing Genome 10K Community of Scientists (G10KCOS), made up of leading scientists representing major zoos, museums, research centers, and universities around the world, is dedicated to coordinating efforts in tissue specimen collection that will lay the groundwork for a large-scale sequencing and analysis project.]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</guid>
	<pubDate>Wed, 23 Mar 2016 05:53:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</link>
	<title><![CDATA[RNA-Seq De novo Assembly Using Trinity]]></title>
	<description><![CDATA[<p>Trinity, developed at the <a href="http://www.broadinstitute.org">Broad Institute</a> and the <a href="http://www.cs.huji.ac.il">Hebrew University of Jerusalem</a>, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works like so:</p>
<ul>
<li>
<p><em>Inchworm</em> assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.</p>
</li>
<li>
<p><em>Chrysalis</em> groups the Inchworm contigs into clusters and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.</p>
</li>
<li>
<p><em>Butterfly</em> then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.</p>
</li>
</ul>
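<p>To make the de Bruijn graph idea behind these steps more concrete, here is a minimal Python sketch that builds such a graph from a couple of reads and prints its edges. It is only an illustration under simplifying assumptions (error-free reads, a tiny hypothetical k of 4, no reverse complements, no coverage filtering); the function name <code>build_de_bruijn</code> is made up for this example, and none of this is Trinity's actual code.</p>
<pre><code>from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k along the read; each k-mer is one edge
        # from its (k-1)-mer prefix to its (k-1)-mer suffix.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


if __name__ == "__main__":
    # Two overlapping toy reads from the same hypothetical transcript.
    reads = ["ATGGCGT", "GGCGTGC"]
    graph = build_de_bruijn(reads, k=4)
    for node in sorted(graph):
        print(node, "to", ", ".join(sorted(graph[node])))
</code></pre>
<p>In this toy graph every node has a single successor, so following the edges from ATG spells out ATGGCGTGC, the sequence spanning both reads. Real transcriptome graphs branch wherever isoforms or paralogs diverge, which is the case the Butterfly step is described as resolving.</p>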
<p>More at https://github.com/trinityrnaseq/trinityrnaseq/wiki</p>
<hr>
<p>Download Trinity <a href="https://github.com/trinityrnaseq/trinityrnaseq/releases">here</a>.</p>
<p>Build Trinity by typing 'make' in the base installation directory.</p>
<p>Assemble RNA-Seq data like so:</p>
<pre><code> Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G 
</code></pre>
<p>Find assembled transcripts as: 'trinity_out_dir/Trinity.fasta'</p><p>Address of the bookmark: <a href="https://github.com/trinityrnaseq/trinityrnaseq/wiki" rel="nofollow">https://github.com/trinityrnaseq/trinityrnaseq/wiki</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>

</channel>
</rss>