<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36817?offset=0</link>
	<atom:link href="https://bioinformaticsonline.com/related/36817?offset=0" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43319/k-mers-tutorial-classification-and-taxonomy</guid>
	<pubDate>Thu, 26 Aug 2021 10:28:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43319/k-mers-tutorial-classification-and-taxonomy</link>
	<title><![CDATA[k-mers tutorial - classification and taxonomy]]></title>
	<description><![CDATA[<p>DNA k-mers underlie much of our assembly work, and we (along with many others!) have spent a lot of time thinking about how to&nbsp;<a href="http://www.pnas.org/content/109/33/13272">store k-mer graphs efficiently</a>,&nbsp;<a href="http://ivory.idyll.org/blog/what-is-diginorm.html">discard redundant data</a>, and&nbsp;<a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0101271">count them efficiently</a>.</p>
<p>More recently, we've been enthused about&nbsp;<a href="http://joss.theoj.org/papers/3d793c6e7db683bee7c03377a4a7f3c9">using k-mer based similarity measures</a>&nbsp;and&nbsp;<a href="http://ivory.idyll.org/blog/2016-sourmash-sbt.html">computing and searching k-mer-based sketch search databases for all the things</a>.</p>
<p>But I haven't spent too much talking about using k-mers for taxonomy, although that has become an&nbsp;<em>ahem</em>&nbsp;area of interest recently,&nbsp;<a href="http://www.biorxiv.org/content/early/2017/07/03/155358">if you read into our papers a bit</a>.</p>
<p>In this blog post I'm going to fix this by doing a little bit of a literature review and waxing enthusiastic about other people's work. Then in a future blog post I'll talk about how we're building off of this work in fun! and interesting? ways!</p><p>Address of the bookmark: <a href="http://ivory.idyll.org/blog/2017-something-about-kmers.html" rel="nofollow">http://ivory.idyll.org/blog/2017-something-about-kmers.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/view/1926</guid>
	<pubDate>Sun, 11 Aug 2013 11:42:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/view/1926</link>
	<title><![CDATA[Want to Know which genome assembler rule the world ?]]></title>
	<description><![CDATA[<p><span><strong>Assemblathon 2</strong>: evaluating de novo methods of genome assembly&nbsp;</span></p><p><span><a href="http://www.gigasciencejournal.com/content/2/1/10/abstract">http://www.gigasciencejournal.com/content/2/1/10/abstract</a></span></p><p><span><a href="http://blogs.nature.com/news/2013/07/genome-assembly-contest-prompts-soul-searching.html">http://blogs.nature.com/news/2013/07/genome-assembly-contest-prompts-soul-searching.html</a></span></p><p><a href="http://assemblathon.org/post/44431915644/feedback-and-analysis-of-the-assemblathon-2-p">http://assemblathon.org/post/44431915644/feedback-and-analysis-of-the-assemblathon-2-p</a></p><p>&nbsp;</p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34519/bandage-interactive-visualization-of-de-novo-genome-assemblies</guid>
	<pubDate>Mon, 04 Dec 2017 10:09:37 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34519/bandage-interactive-visualization-of-de-novo-genome-assemblies</link>
	<title><![CDATA[Bandage: interactive visualization of de novo genome assemblies]]></title>
	<description><![CDATA[<p>Bandage (a Bioinformatics Application for Navigating&nbsp;<em>De&nbsp;novo</em>&nbsp;Assembly Graphs Easily) is a tool for visualizing assembly graphs with connections. Users can zoom in to specific areas of the graph and interact with it by moving nodes, adding labels, changing colors and extracting sequences. BLAST searches can be performed within the Bandage graphical user interface and the hits are displayed as highlights in the graph. By displaying connections between contigs, Bandage presents new possibilities for analyzing&nbsp;<em>de novo</em>&nbsp;assemblies that are not possible through investigation of contigs alone.</p>
<p><strong>Availability and implementation:</strong>&nbsp;Source code and binaries are freely available at&nbsp;<a href="https://github.com/rrwick/Bandage" target="pmc_ext">https://github.com/rrwick/Bandage</a>. Bandage is implemented in C++ and supported on Linux, OS X and Windows. A full feature list and screenshots are available at&nbsp;<a href="http://rrwick.github.io/Bandage" target="pmc_ext">http://rrwick.github.io/Bandage</a>.</p><p>Address of the bookmark: <a href="http://rrwick.github.io/Bandage/" rel="nofollow">http://rrwick.github.io/Bandage/</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36476/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</guid>
	<pubDate>Fri, 04 May 2018 19:16:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36476/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[Flye: Fast and accurate de novo assembler for single molecule sequencing reads]]></title>
	<description><![CDATA[<p><span>Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies. The algorithm uses an A-Bruijn graph to find the overlaps between reads and does not require them to be error-corrected. After the initial assembly, Flye performs an extra repeat classification and analysis step to improve the structural accuracy of the resulting sequence. The package also includes a polisher module, which produces the final assembly of high nucleotide-level quality.</span></p><p>Address of the bookmark: <a href="https://github.com/fenderglass/Flye" rel="nofollow">https://github.com/fenderglass/Flye</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</guid>
	<pubDate>Tue, 03 Jul 2018 04:09:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</link>
	<title><![CDATA[ASplice: a scalable and memory-efficient algorithm for de novo transcriptome assembly]]></title>
	<description><![CDATA[With increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries.

Texas A&amp;M University researchers develop a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while utilizing as many RNA-Seq libraries as possible that contain hundreds of gigabases of data. New techniques are developed so that computations can be performed on a computing cluster with moderate amount of physical memory.

Availability – A software program that implements the algorithm is available at: http://faculty.cse.tamu.edu/shsze/asplice.

Sze SH, Pimsler ML, Tomberlin JK, Jones CD, Tarone AM. (2017) A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms. BMC Genomics 18(Suppl 4):387.<p>Address of the bookmark: <a href="http://faculty.cse.tamu.edu/shsze/asplice/" rel="nofollow">http://faculty.cse.tamu.edu/shsze/asplice/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</guid>
	<pubDate>Fri, 19 Oct 2018 07:25:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</link>
	<title><![CDATA[BASE: a practical de novo assembler for large genomes using long NGS reads]]></title>
	<description><![CDATA[<p><span>new&nbsp;</span><em>de novo</em><span>&nbsp;assembler called BASE. It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have high probability to appear uniquely in the genome. Such seeds form the basis for BASE to build extension trees and then to use reverse validation to remove the branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of reads sharing the seeds. Such consensus sequences are then extended to contigs.</span></p><p>Address of the bookmark: <a href="https://github.com/dhlbh/BASE" rel="nofollow">https://github.com/dhlbh/BASE</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39213/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</guid>
	<pubDate>Tue, 02 Apr 2019 21:54:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39213/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[Flye: Fast and accurate de novo assembler for single molecule sequencing reads]]></title>
	<description><![CDATA[<p><span>Flye is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package represents a complete pipeline: it takes raw PB / ONT reads as input and outputs polished contigs. Flye also includes a special mode for metagenome assembly.</span></p><p>Address of the bookmark: <a href="https://github.com/fenderglass/Flye" rel="nofollow">https://github.com/fenderglass/Flye</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40516/nextdenovo-string-graph-based-de-novo-assembler-for-tgs-long-reads</guid>
	<pubDate>Sun, 05 Jan 2020 04:08:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40516/nextdenovo-string-graph-based-de-novo-assembler-for-tgs-long-reads</link>
	<title><![CDATA[NextDenovo: string graph-based de novo assembler for TGS long reads]]></title>
	<description><![CDATA[<p>NextDenovo is a string graph-based<span>&nbsp;</span><em>de novo</em><span>&nbsp;</span>assembler for TGS long reads. It uses a "correct-then-assemble" strategy similar to canu, but requires significantly less computing resources and storages. After assembly, the per-base error rate is about 97-98%, to further improve single base accuracy, please use<span>&nbsp;</span><a href="https://github.com/Nextomics/NextPolish">NextPolish</a>.</p>
<p>NextDenovo contains two core modules: NextCorrect and NextGraph. NextCorrect can be used to correct TGS long reads with approximately 15% sequencing errors, and NextGraph can be used to construct a string graph with corrected reads. It also contains a modified version of<span>&nbsp;</span><a href="https://github.com/lh3/minimap2">minimap2</a><span>&nbsp;</span>for adapting input and output and producing more sensitive and accurate dovetail overlaps, and some useful utilities (see<span>&nbsp;</span><a href="https://github.com/Nextomics/NextDenovo/blob/master/doc/UTILITY.md">here</a><span>&nbsp;</span>for more details).</p><p>Address of the bookmark: <a href="https://github.com/Nextomics/NextDenovo" rel="nofollow">https://github.com/Nextomics/NextDenovo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44529/contigextender-a-new-approach-to-improving-de-novo-sequence-assembly-for-viral-metagenomics-data</guid>
	<pubDate>Wed, 08 May 2024 07:32:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44529/contigextender-a-new-approach-to-improving-de-novo-sequence-assembly-for-viral-metagenomics-data</link>
	<title><![CDATA[ContigExtender: a new approach to improving de novo sequence assembly for viral metagenomics data]]></title>
	<description><![CDATA[<p dir="auto">ContigExtender, was developed to extend contigs, complementing de novo assembly. ContigExtender employs a novel recursive Overlap Layout Candidates (r-OLC) strategy that explores multiple extending paths to achieve longer and highly accurate contigs. ContigExtender is effective for extending contigs significantly in in silico synthesized and real metagenomics datasets.</p>
<p dir="auto">More at&nbsp;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7953547/</p>
<p dir="auto"><a href="https://camo.githubusercontent.com/72dc78177cd84dd0c667a2922a9fd984fb548b5ec94b11f9a547211a4adba3b1/68747470733a2f2f692e696d6775722e636f6d2f7734516944496a2e706e67" target="_blank"><img src="https://camo.githubusercontent.com/72dc78177cd84dd0c667a2922a9fd984fb548b5ec94b11f9a547211a4adba3b1/68747470733a2f2f692e696d6775722e636f6d2f7734516944496a2e706e67" alt="extension process" title="extension process" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://github.com/dengzac/contig-extender" rel="nofollow">https://github.com/dengzac/contig-extender</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>In this protocol, we address and describe the de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>The method of taking a large number of short DNA sequences and placing them back together to create a reflection of the original chromosomes from which the DNA originated relates to genome assembly. No previous knowledge of the source DNA sequence length, structure or composition is inferred by De novo genome assemblies. The DNA of the target organism is split up into millions of tiny parts and read on a sequencing computer in a genome sequencing experiment. Depending on the sequencing system used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Usually, length reads of 36 - 150 bp are produced for Illumina style short read sequencing. These reads can be either &ldquo;single ended&rdquo; as described above or &ldquo;paired end.&rdquo;</p><p><strong>Why genome assembly?</strong><br />In basic research into why and how they live, as well as in applied topics, identifying the DNA sequence of an organism is useful. Awareness of a DNA sequence may be useful in virtually any biological research because of the relevance of DNA to living things. For example, it may be used in medicine to classify, diagnose and eventually improve genetic disorder therapies. Similarly, pathogens study can lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be saved as a Fasta file as text or in a FastQ file with their attributes.&nbsp;FastQ is the most common read file format since this is what the Illumina sequencing pipeline creates. This will henceforth be the subject of our conversation.</p><p><strong>In a nutshell the protocol:</strong> <br />Get the sequence file(s) read from the sequencing machine (s). <br />Look at the readings - have an idea of what you have and what the standard is like. <br />If required, raw data cleanup/quality trimming. <br />Choose an adequate parameter set for assembly. <br />Assemble the data into scaffolds/contigs. <br />Examine the assembly performance and determine the efficiency of the assembly.</p><p><strong>Read Quality Control:</strong><br />Check the qualiy with fastQC.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This function trims adapters, barcodes and other contaminants from the reads.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />The object of this portion of the protocol is to explain the method of assembling the reads trimmed by quality into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A significant range of short-read assemblers are available. Everyone with strengths and disadvantages of their own. <br /><em>Some of the assemblers available include:</em><br />Velvet<br />SOAP-denovo<br />MIRA<br />ALLPATHS</p><p>Next step is to assess the suitability and what to do with a draft package of contiguous details for the remainder of the study now.&nbsp;Few stuff you can note about the contigs you just created:&nbsp;They're the draft Contigs. Any mis-assemblies can occur.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assembly http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />Gap filler - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequent command use for the analysis are at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

</channel>
</rss>