<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34246?offset=60</link>
	<atom:link href="https://bioinformaticsonline.com/related/34246?offset=60" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</guid>
	<pubDate>Mon, 31 Jan 2022 07:18:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</link>
	<title><![CDATA[Short-read assembly using SPAdes!]]></title>
	<description><![CDATA[<h2 id="short-read-assembly-a-comparison">If we had only Illumina reads, we could assemble them with SPAdes.</h2><p>You can try this here, or try it later on your own data.</p><h2 id="get-data">Get data</h2><p>We will use the same Illumina data as above:</p><ul>
<li>illumina_R1.fastq.gz: the Illumina forward reads</li>
<li>illumina_R2.fastq.gz: the Illumina reverse reads</li>
</ul><h2 id="assemble">Assemble</h2><p>Run SPAdes:</p><div><pre>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
</pre></div><ul>
<li><code>-1</code>&nbsp;is the input file of forward reads</li>
<li><code>-2</code>&nbsp;is the input file of reverse reads</li>
<li><code>--careful</code>&nbsp;tries to reduce the number of mismatches and short indels</li>
<li><code>--cov-cutoff auto</code>&nbsp;lets SPAdes compute the coverage cutoff automatically (the default setting is &ldquo;off&rdquo;)</li>
<li><code>-o</code>&nbsp;is the output directory</li>
</ul><h2 id="results">Results</h2><p>Move into the output directory and look at the contigs:</p><div><pre>infoseq contigs.fasta</pre></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43926/aun-a-new-metric-to-measure-assembly-contiguity</guid>
	<pubDate>Tue, 02 Aug 2022 01:18:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43926/aun-a-new-metric-to-measure-assembly-contiguity</link>
	<title><![CDATA[auN: a new metric to measure assembly contiguity]]></title>
	<description><![CDATA[<p><span>Given a de novo assembly, we often measure the &ldquo;average&rdquo; contig length by N50.&nbsp;</span><a href="https://en.wikipedia.org/wiki/N50,_L50,_and_related_statistics">N50</a><span>&nbsp;is neither the real average nor the median. It is the length of the contig such that this and longer contigs cover at least 50% of the assembly. A longer N50 indicates better contiguity. We can similarly define N</span><em>x</em><span>&nbsp;such that contigs no shorter than N</span><em>x</em><span>&nbsp;cover&nbsp;</span><em>x</em><span>% of the assembly. The N</span><em>x</em><span>&nbsp;curve plots N</span><em>x</em><span>&nbsp;as a function of&nbsp;</span><em>x</em><span>, where&nbsp;</span><em>x</em><span>&nbsp;ranges from 0 to 100.</span></p>
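<p>The definition above can be sketched in a few lines of Python. This is only an illustration: the function name <code>nx</code> and the contig lengths are made up for the example and do not come from the linked post.</p>

```python
def nx(lengths, x=50):
    """Return Nx: the contig length such that contigs this long or longer
    cover at least x% of the total assembly length."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    covered = 0
    # Walk from the longest contig to the shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        covered += length
        # Integer comparison avoids floating-point rounding at the boundary.
        if covered * 100 >= total * x:
            return length
    # Only reached if x is above 100; fall back to the shortest contig.
    return min(lengths)


contig_lengths = [100, 80, 50, 40, 30]  # hypothetical contig lengths
n50 = nx(contig_lengths, 50)
n90 = nx(contig_lengths, 90)
print(n50, n90)
```

<p>Calling <code>nx</code> for every integer <em>x</em> from 0 to 100 gives the points of the N<em>x</em> curve.</p>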
<p><span><img src="http://lh3.github.io/images/NGx_plot.png" alt="image" style="border: 0px;"></span></p><p>Address of the bookmark: <a href="https://lh3.github.io/2020/04/08/a-new-metric-on-assembly-contiguity" rel="nofollow">https://lh3.github.io/2020/04/08/a-new-metric-on-assembly-contiguity</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44377/mitochondrial-genome-assembly-tools</guid>
	<pubDate>Wed, 06 Sep 2023 00:37:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44377/mitochondrial-genome-assembly-tools</link>
	<title><![CDATA[Mitochondrial genome assembly tools!]]></title>
	<description><![CDATA[<p>Mitochondrial genome assembly tools are specialized software and algorithms designed to accurately reconstruct the mitochondrial genome (mitogenome) from sequencing data, typically obtained through next-generation sequencing (NGS). The mitochondrial genome is small compared to the nuclear genome, which makes it a tractable target for assembly. Here are some commonly used mitochondrial genome assembly and annotation tools:</p><p><strong>MitoFinder:</strong> MitoFinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data.</p><p><strong>MitoHiFi:</strong> MitoHiFi is a Python pipeline for mitochondrial genome assembly from PacBio high-fidelity reads.</p><p><strong>MITObim:</strong> MITObim is a tool specifically developed for the iterative assembly of mitochondrial genomes. It starts with a reference mitogenome and iteratively refines the assembly using the read data.</p><p><strong>MITOS:</strong> MITOS is a web-based platform that provides a pipeline for annotating mitochondrial genomes. It integrates multiple software tools for annotating mitogenomes and visualizing the results.</p><p><strong>MIRA:</strong> MIRA (Mimicking Intelligent Read Assembly) is a versatile genome assembly tool that can be used for mitochondrial genome assembly. It supports various sequencing technologies and allows for reference-based or de novo assembly.</p><p><strong>NOVOPlasty:</strong> NOVOPlasty is a user-friendly tool designed for de novo assembly of organelle genomes, including mitochondria. It uses a seed-and-extend algorithm and works from whole-genome short-read (Illumina) data.</p><p><strong>MITOS2:</strong> MITOS2 is an updated version of the MITOS pipeline, which automates the annotation of mitochondrial genomes. 
It provides improved accuracy and additional features for mitochondrial genome analysis.</p><p><strong>GetOrganelle:</strong> While primarily designed for chloroplast genome assembly, GetOrganelle can also be used for mitochondrial genome assembly. It is particularly useful for dealing with high-throughput sequencing data.</p><p><strong>SPAdes:</strong> SPAdes (St. Petersburg genome assembler) is a versatile genome assembly tool that can be employed for mitochondrial genome assembly, especially when dealing with complex datasets that may contain nuclear mitochondrial DNA sequences (numts).</p><p><strong>IDBA-UD:</strong> IDBA-UD (Iterative De Bruijn Graph De Novo Assembler) is another de novo assembly tool that can be used for mitochondrial genome assembly, especially in cases with relatively low coverage.</p><p><strong>Velvet:</strong> Velvet is a de novo assembly tool that can be applied to mitochondrial genome assembly, especially when working with short-read data.</p><p>When selecting a mitochondrial genome assembly tool, consider the characteristics of your sequencing data, such as read length and coverage, as well as the complexity of the mitochondrial genome. Some tools are also better suited to specific organisms or research objectives, so the right choice depends on your project requirements.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27113/picard</guid>
	<pubDate>Fri, 29 Apr 2016 08:21:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27113/picard</link>
	<title><![CDATA[Picard]]></title>
	<description><![CDATA[<p>Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the <a href="http://samtools.github.io/hts-specs/">Hts-specs</a> repository. See especially the <a href="http://samtools.github.io/hts-specs/SAMv1.pdf">SAM specification</a> and the <a href="http://samtools.github.io/hts-specs/VCFv4.3.pdf">VCF specification</a>.</p>
<p>Note that the information on this page is targeted at end-users. For developers, the source code, building instructions and implementation/development resources are available on <a href="https://github.com/broadinstitute/picard">GitHub</a>.</p>
<p>The Picard toolkit is open-source under the <a href="https://tldrlegal.com/license/mit-license">MIT license</a> and free for all uses.</p>
<p>Enjoy!</p><p>Address of the bookmark: <a href="http://broadinstitute.github.io/picard/" rel="nofollow">http://broadinstitute.github.io/picard/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26909/sequence-assembly-with-mira-4</guid>
	<pubDate>Wed, 06 Apr 2016 08:21:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26909/sequence-assembly-with-mira-4</link>
	<title><![CDATA[Sequence assembly with MIRA 4]]></title>
	<description><![CDATA[<p>MIRA is a multi-pass DNA sequence data assembler/mapper for whole genome and EST/RNASeq projects. MIRA assembles/maps reads gained by</p>
<div>
<ul>
<li>
<p>electrophoresis sequencing (aka Sanger sequencing)</p>
</li>
<li>
<p>454 pyro-sequencing (GS20, FLX or Titanium)</p>
</li>
<li>
<p>Ion Torrent</p>
</li>
<li>
<p>Solexa (Illumina) sequencing</p>
</li>
<li>
<p>(in development) Pacific Biosciences sequencing</p>
</li>
</ul>
</div>
<p>into contiguous sequences (called <span><em>contigs</em></span>). One can use sequences from different sequencing technologies either in a single assembly run (a <span><em>true hybrid assembly</em></span>), by mapping one type of data to an assembly of another sequencing type (a <span><em>semi-hybrid assembly (or mapping)</em></span>), or by mapping data against consensus sequences of other assemblies (a <span><em>simple mapping</em></span>).</p>
<p>The MIRA acronym stands for <span><strong>M</strong></span>imicking <span><strong>I</strong></span>ntelligent <span><strong>R</strong></span>ead <span><strong>A</strong></span>ssembly and the program pretty well does what its acronym says (well, most of the time anyway). It is the Swiss army knife of sequence assembly that I've used and developed during the past 14 years to get assembly jobs I work on done efficiently - and especially accurately. That is, without me actually putting too much manual work into it.</p>
<p>More at http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html</p><p>Address of the bookmark: <a href="http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html" rel="nofollow">http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html</a></p>]]></description>
	<dc:creator>Priya Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26972/understanding-fastqc-output</guid>
	<pubDate>Fri, 15 Apr 2016 05:47:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26972/understanding-fastqc-output</link>
	<title><![CDATA[Understanding Fastqc Output]]></title>
	<description><![CDATA[<p>Understanding the following FastQC tables and graphs:</p>
<ol>
<li>Duplication level</li>
<li>kmer profile</li>
<li>per base GC content</li>
<li>per base N content</li>
<li>per base quality</li>
<li>per base sequence content</li>
<li>per sequence GC content</li>
<li>per sequence quality</li>
<li>sequence length distribution</li>
</ol>
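<p>To make two of these modules concrete, here is a rough Python sketch of what "per sequence GC content" and "per base N content" measure. It is not FastQC's implementation; the function names and the reads are hypothetical and chosen only for illustration.</p>

```python
def gc_percent(seq):
    """Percentage of G and C bases in one read (0.0 for an empty read)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def per_base_n_percent(reads):
    """Percentage of reads showing an N at each position along the read."""
    longest = max((len(r) for r in reads), default=0)
    n_counts = [0] * longest
    depth = [0] * longest  # number of reads that reach each position
    for read in reads:
        for i, base in enumerate(read.upper()):
            depth[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [100.0 * n / d for n, d in zip(n_counts, depth)]


reads = ["ACGT", "GGNC", "ATNN"]  # made-up reads for the example
gc_values = [gc_percent(r) for r in reads]
n_profile = per_base_n_percent(reads)
print(gc_values)
print(n_profile)
```

<p>The per-sequence values are what would be binned into a GC distribution, while the per-position values form the N-content curve along the read.</p>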
<p>More at http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/</p><p>Address of the bookmark: <a href="http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/" rel="nofollow">http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27076/ale-a-generic-assembly-likelihood-evaluation-framework-for-assessing-the-accuracy-of-genome-and-metagenome-assemblies</guid>
	<pubDate>Tue, 26 Apr 2016 03:38:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27076/ale-a-generic-assembly-likelihood-evaluation-framework-for-assessing-the-accuracy-of-genome-and-metagenome-assemblies</link>
	<title><![CDATA[ALE: a Generic Assembly Likelihood Evaluation Framework for Assessing the Accuracy of Genome and Metagenome Assemblies]]></title>
	<description><![CDATA[<p>The Assembly Likelihood Evaluation (ALE) framework systematically evaluates the accuracy of an assembly in a reference-independent manner using rigorous statistical methods. The framework is comprehensive: it integrates read quality, mate-pair orientation and insert length (for paired-end reads), sequencing coverage, read alignment and k-mer frequency. ALE pinpoints synthetic errors in both single-genome and metagenomic assemblies, including single-base errors, insertions/deletions, genome rearrangements and chimeric assemblies present in metagenomes. At the genome level with real-world data, ALE identifies three large misassemblies from the Spirochaeta smaragdinae finished genome, which were all independently validated by Pacific Biosciences sequencing. At the single-base level with Illumina data, ALE recovers 215 of 222 (97%) single nucleotide variants in a training set from a GC-rich Rhodobacter sphaeroides genome. Using real Pacific Biosciences data, ALE identifies 12 of 12 synthetic errors in a Lambda Phage genome, surpassing even Pacific Biosciences' own variant caller, EviCons. In summary, the ALE framework provides a comprehensive, reference-independent and statistically rigorous measure of single genome and metagenome assembly accuracy, which can be used to identify misassemblies or to optimize the assembly process.</p>
<p>More at&nbsp;http://www.ncbi.nlm.nih.gov/pubmed/23303509</p><p>Address of the bookmark: <a href="http://sc932.github.io/ALE/about.html" rel="nofollow">http://sc932.github.io/ALE/about.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28805/bambus</guid>
	<pubDate>Tue, 16 Aug 2016 08:09:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28805/bambus</link>
	<title><![CDATA[Bambus]]></title>
	<description><![CDATA[<div>
<div>
<div>
<p>Bambus 2.0 is the second-generation Bambus scaffolder, available as an open-source package. While most other scaffolders are closely tied to a specific assembly program, Bambus accepts the output from most current assemblers and gives the user great flexibility in choosing the scaffolding parameters. In particular, Bambus can accept contig-linking data other than mate pairs. Such sources of information include alignment to a reference genome (Bambus can directly use the output of MUMmer), physical mapping data, or information about gene synteny.</p>
</div>
</div>
</div>
<div>
<div>Home Page:&nbsp;</div>
<div>
<div><a href="http://sourceforge.net/apps/mediawiki/amos/index.php?title=Bambus2">http://sourceforge.net/apps/mediawiki/amos/index.php?title=Bambus2</a></div>
</div>
</div><p>Address of the bookmark: <a href="https://www.cbcb.umd.edu/software/bambus2" rel="nofollow">https://www.cbcb.umd.edu/software/bambus2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29957/record</guid>
	<pubDate>Fri, 25 Nov 2016 08:23:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29957/record</link>
	<title><![CDATA[RECORD]]></title>
	<description><![CDATA[<p>Background. Next-generation sequencing technologies are now producing multiple times the genome size in total reads from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome studied in the experiment and the reference genome of the species. However, in most typical protocols, this information is disregarded and the reference genome is used. Results. We provide a new approach that allows researchers to reconstruct genomes very closely related to the reference genome (e.g., mutants of the same species) directly from the reads used in the experiment. Our approach applies de novo assembly software to experimental reads and so-called pseudoreads and uses the resulting contigs to generate a modified reference sequence. In this way, it can very quickly, and at no additional sequencing cost, generate a new, modified reference sequence that is closer to the actual sequenced genome and has full coverage. In this paper, we describe our approach and test its implementation, called RECORD. We evaluate RECORD on both simulated and real data. We made our software publicly available on SourceForge. Conclusion. Our tests show that, on closely related sequences, RECORD outperforms more general assisted-assembly software.</p>
<p>More at&nbsp;https://sourceforge.net/projects/record-genome-assembler/files/</p><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/pubmed/26558255" rel="nofollow">https://www.ncbi.nlm.nih.gov/pubmed/26558255</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30076/sga-string-graph-assembler</guid>
	<pubDate>Thu, 08 Dec 2016 05:08:59 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30076/sga-string-graph-assembler</link>
	<title><![CDATA[SGA: String Graph Assembler]]></title>
	<description><![CDATA[<p><span>SGA is a de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.</span></p>
<p><span>More at</span></p>
<p><span>https://github.com/jts/sga</span></p>
<p>SGA dependencies:<br> -google sparse hash library (http://code.google.com/p/google-sparsehash/)<br> -the bamtools library (https://github.com/pezmaster31/bamtools)<br> -zlib (http://www.zlib.net/)<br> -(optional but suggested) the jemalloc memory allocator (http://www.canonware.com/jemalloc/download.html)</p><p>Address of the bookmark: <a href="https://github.com/jts/sga" rel="nofollow">https://github.com/jts/sga</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>