<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37554?offset=30</link>
	<atom:link href="https://bioinformaticsonline.com/related/37554?offset=30" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</guid>
	<pubDate>Tue, 03 Jul 2018 04:09:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</link>
	<title><![CDATA[ASplice: a scalable and memory-efficient algorithm for de novo transcriptome assembly]]></title>
	<description><![CDATA[With increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries.

Texas A&amp;M University researchers developed a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while using as many RNA-Seq libraries as possible, which together can contain hundreds of gigabases of data. New techniques were developed so that the computations can be performed on a computing cluster with a moderate amount of physical memory.

Availability – A software program that implements the algorithm is available at: http://faculty.cse.tamu.edu/shsze/asplice.

Sze SH, Pimsler ML, Tomberlin JK, Jones CD, Tarone AM. (2017) A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms. BMC Genomics 18(Suppl 4):387.<p>Address of the bookmark: <a href="http://faculty.cse.tamu.edu/shsze/asplice/" rel="nofollow">http://faculty.cse.tamu.edu/shsze/asplice/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40856/3d-de-novo-assembly-3d-dna-pipeline</guid>
	<pubDate>Sun, 02 Feb 2020 13:41:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40856/3d-de-novo-assembly-3d-dna-pipeline</link>
	<title><![CDATA[3D de novo assembly (3D DNA) pipeline]]></title>
	<description><![CDATA[<p>For a detailed description of the pipeline and how it integrates with other tools designed by the Aiden Lab see&nbsp;<a href="http://aidenlab.org/assembly/manual_180322.pdf">Genome Assembly Cookbook</a>&nbsp;on&nbsp;<a href="http://aidenlab.org/assembly">http://aidenlab.org/assembly</a>.</p>
<p>For the original version of the pipeline and to reproduce the Hs2-HiC and the AaegL4 genomes reported in&nbsp;<a href="http://science.sciencemag.org/content/356/6333/92">(Dudchenko et al.,&nbsp;<em>Science</em>, 2017)</a>&nbsp;see the&nbsp;<a href="https://github.com/theaidenlab/3d-dna/tree/745779bdf64db6e55bddb70c24e9b58825938c33">original commit</a>.</p>
<p>For the detailed description of the merge section see&nbsp;<a href="https://github.com/theaidenlab/AGWG-merge">https://github.com/theaidenlab/AGWG-merge</a>.</p><p>Address of the bookmark: <a href="https://github.com/theaidenlab/3d-dna" rel="nofollow">https://github.com/theaidenlab/3d-dna</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41734/supernova-generates-phased-whole-genome-de-novo-assemblies-from-a-chromium-prepared-library</guid>
	<pubDate>Sun, 31 May 2020 01:59:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41734/supernova-generates-phased-whole-genome-de-novo-assemblies-from-a-chromium-prepared-library</link>
	<title><![CDATA[Supernova: generates phased, whole-genome de novo assemblies from a Chromium-prepared library.]]></title>
	<description><![CDATA[<p>Supernova generates phased, whole-genome&nbsp;<em>de novo</em>&nbsp;assemblies from a Chromium-prepared library.</p>
<p>Please see&nbsp;<a href="https://support.10xgenomics.com/de-novo-assembly/guidance/doc/achieving-success-with-de-novo-assembly">Achieving Success with De Novo Assembly</a>&nbsp;and&nbsp;<a href="https://support.10xgenomics.com/de-novo-assembly/software/overview/system-requirements">System Requirements</a>&nbsp;<em>before</em>&nbsp;creating your Chromium libraries for assembly.</p>
<p>Supernova should be run using 38-56x coverage of the genome.<br>&bull; Somewhat higher coverage is&nbsp;<em>sometimes</em>&nbsp;advantageous.<br>&bull; Supernova will exit if it finds that coverage is far from the recommended range.<br>&bull; Note that at most 2.14 billion reads are allowed.<br>&bull; Please note that we have not extensively tested genomes larger than human, and any genome above approximately 4 GB should be considered experimental and is not supported.</p><p>Address of the bookmark: <a href="https://support.10xgenomics.com/de-novo-assembly/software/pipelines/latest/using/running" rel="nofollow">https://support.10xgenomics.com/de-novo-assembly/software/pipelines/latest/using/running</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</guid>
	<pubDate>Mon, 31 Jan 2022 07:18:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</link>
	<title><![CDATA[Short-read assembly using SPAdes]]></title>
	<description><![CDATA[<h2 id="short-read-assembly-a-comparison">If we only had Illumina reads, we could also assemble them using the tool SPAdes.</h2><p>You can try this here, or try it later on your own data.</p><h2 id="get-data">Get data</h2><p>We will use the same Illumina data as we used above:</p><ul>
<li>illumina_R1.fastq.gz: the Illumina forward reads</li>
<li>illumina_R2.fastq.gz: the Illumina reverse reads</li>
</ul><h2 id="assemble">Assemble</h2><p>Run SPAdes:</p><div><pre>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
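# Optional variant, not part of the original tutorial: -t (threads) and -m (memory limit in GB) are standard SPAdes options.
# The values 8 and 16 and the output directory name below are illustrative assumptions; adjust them to your own machine.
# spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -t 8 -m 16 -o spades_assembly_capped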
</pre></div><ul>
<li><code>-1</code>&nbsp;is the input file of forward reads</li>
<li><code>-2</code>&nbsp;is the input file of reverse reads</li>
<li><code>--careful</code>&nbsp;tries to reduce the number of mismatches and short indels</li>
<li><code>--cov-cutoff auto</code>&nbsp;lets SPAdes compute the coverage cutoff automatically (the default setting is &ldquo;off&rdquo;)</li>
<li><code>-o</code>&nbsp;is the output directory</li>
</ul><h2 id="results">Results</h2><p>Move into the output directory and look at the contigs:</p><div><pre>infoseq contigs.fasta</pre></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27113/picard</guid>
	<pubDate>Fri, 29 Apr 2016 08:21:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27113/picard</link>
	<title><![CDATA[Picard]]></title>
	<description><![CDATA[<p>Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the <a href="http://samtools.github.io/hts-specs/">Hts-specs</a> repository. See especially the <a href="http://samtools.github.io/hts-specs/SAMv1.pdf">SAM specification</a> and the <a href="http://samtools.github.io/hts-specs/VCFv4.3.pdf">VCF specification</a>.</p>
<p>Note that the information on this page is targeted at end-users. For developers, the source code, building instructions and implementation/development resources are available on <a href="https://github.com/broadinstitute/picard">GitHub</a>.</p>
<p>The Picard toolkit is open-source under the <a href="https://tldrlegal.com/license/mit-license">MIT license</a> and free for all uses.</p>
<p>Enjoy!</p><p>Address of the bookmark: <a href="http://broadinstitute.github.io/picard/" rel="nofollow">http://broadinstitute.github.io/picard/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27076/ale-a-generic-assembly-likelihood-evaluation-framework-for-assessing-the-accuracy-of-genome-and-metagenome-assemblies</guid>
	<pubDate>Tue, 26 Apr 2016 03:38:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27076/ale-a-generic-assembly-likelihood-evaluation-framework-for-assessing-the-accuracy-of-genome-and-metagenome-assemblies</link>
	<title><![CDATA[ALE: a Generic Assembly Likelihood Evaluation Framework for Assessing the Accuracy of Genome and Metagenome Assemblies]]></title>
	<description><![CDATA[<p>The Assembly Likelihood Evaluation (ALE) framework systematically evaluates the accuracy of an assembly in a reference-independent manner using rigorous statistical methods. The framework is comprehensive and integrates read quality, mate pair orientation and insert length (for paired-end reads), sequencing coverage, read alignment and k-mer frequency. ALE pinpoints synthetic errors in both single-genome and metagenomic assemblies, including single-base errors, insertions/deletions, genome rearrangements and chimeric assemblies present in metagenomes. At the genome level with real-world data, ALE identifies three large misassemblies from the <em>Spirochaeta smaragdinae</em> finished genome, which were all independently validated by Pacific Biosciences sequencing. At the single-base level with Illumina data, ALE recovers 215 of 222 (97%) single nucleotide variants in a training set from a GC-rich <em>Rhodobacter sphaeroides</em> genome. Using real Pacific Biosciences data, ALE identifies 12 of 12 synthetic errors in a Lambda Phage genome, surpassing even Pacific Biosciences' own variant caller, EviCons. In summary, the ALE framework provides a comprehensive, reference-independent and statistically rigorous measure of single-genome and metagenome assembly accuracy, which can be used to identify misassemblies or to optimize the assembly process.</p>
<p>More at&nbsp;http://www.ncbi.nlm.nih.gov/pubmed/23303509</p><p>Address of the bookmark: <a href="http://sc932.github.io/ALE/about.html" rel="nofollow">http://sc932.github.io/ALE/about.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27438/hagfish-assess-an-assembly-through-creative-use-of-coverage-plots</guid>
	<pubDate>Fri, 20 May 2016 19:08:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27438/hagfish-assess-an-assembly-through-creative-use-of-coverage-plots</link>
	<title><![CDATA[Hagfish - assess an assembly through creative use of coverage plots]]></title>
	<description><![CDATA[<p>Hagfish is a tool for analyzing data from Next Generation Sequencing (NGS) experiments. Hagfish builds on the concept of coverage plots and aims to assist with, among other things, quality control of&nbsp;<em>de novo</em>&nbsp;genome assembly and identification of structural variation in a genome re-sequencing experiment.</p>
<p>Hagfish requires a reference sequence and a paired-end re-sequencing data set. Its power increases with the insert size of the paired-end library.</p>
<p>Quick links:&nbsp;<a href="https://github.com/mfiers/hagfish/wiki/Install">Installation</a>,&nbsp;<a href="https://github.com/mfiers/hagfish/wiki/Operation">Operation</a>,&nbsp;<a href="https://github.com/mfiers/hagfish/wiki/ReadMappers">Read mappers</a>,&nbsp;<a href="https://github.com/mfiers/hagfish/wiki/Scripts">Hagfish scripts</a>,&nbsp;<a href="https://github.com/mfiers/hagfish/wiki/Plots">Hagfish plots</a></p><p>Address of the bookmark: <a href="https://github.com/mfiers/hagfish" rel="nofollow">https://github.com/mfiers/hagfish</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31139/pbsuite-software-for-long-read-sequencing-data-from-pacbio</guid>
	<pubDate>Mon, 27 Feb 2017 09:54:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31139/pbsuite-software-for-long-read-sequencing-data-from-pacbio</link>
	<title><![CDATA[PBSuite: Software for Long-Read Sequencing Data from PacBio]]></title>
	<description><![CDATA[<p><span>PBJelly - the genome upgrading tool.&nbsp;</span><br><span>PBHoney - the structural variation discovery tool.&nbsp;</span><br><br><span>Both are contained within the PBSuite code found in downloads.</span><br><br><span>----- PBJelly -----</span><br><span>Read The Paper&nbsp;</span><br><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0047768" target="_blank">http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0047768</a><br><br><span>PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in fasta format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes.&nbsp;</span><br><br><span>----- PBHoney -----</span><br><span>Read The Paper</span><br><a href="http://www.biomedcentral.com/1471-2105/15/180/abstract" target="_blank">http://www.biomedcentral.com/1471-2105/15/180/abstract</a><br><br><span>PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/pb-jelly/" rel="nofollow">https://sourceforge.net/projects/pb-jelly/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32709/cabog-celera-assembler-with-best-overlap-graph</guid>
	<pubDate>Mon, 15 May 2017 05:04:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32709/cabog-celera-assembler-with-best-overlap-graph</link>
	<title><![CDATA[CABOG: Celera Assembler with Best Overlap Graph]]></title>
	<description><![CDATA[<p>CABOG (Celera Assembler with Best Overlap Graph) is scientific software for&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/24/24/2818.abstract">DNA research</a>. CABOG has been a critical component of many genome sequencing projects. CABOG operates on small genomes, such as bacterial ones, as well as large genomes, such as mammalian ones. CABOG is an extension of the Celera Assembler software that was originally developed at&nbsp;<a href="http://www.celera.com/">Celera</a>&nbsp;for the 2001 publication of the first draft human genome sequence. The software was released to the public domain in 2004. Its open source&nbsp;<a href="http://wgs-assembler.sf.net/">repository</a>&nbsp;on SourceForge is an internet resource for scientists around the world.</p>
<p>CABOG is one of many software programs called genome assemblers. These programs exist to overcome the fundamental limitation of all sequencing machines, namely, that they read out very few DNA letters at a time. These programs reconstruct genomes that are billions of letters long from the hundreds of letters per read that modern sequencers provide. What these programs do is often described as a scaled up version of a family solving a jigsaw puzzle.</p>
<p>The CABOG software was the first to accomplish many scientific goals. It was the first to assemble the genome of a multicellular organism (<em>Drosophila melanogaster</em>, 2000). It was the first to assemble both parental haplotypes of one human genome (J. Craig Venter, 2007). It was the first to assemble environmental sequence from the oceans (Sargasso Sea in 2004 and Global Ocean Sampling in 2007). It was the first to combine reads from first-generation Sanger sequencing machines and second-generation pyrosequencing machines (marine microbes, 2006). Today, CABOG is one of the leading assembly programs for data sets that include paired-end data from the Roche 454 line of sequencing machines.</p><p>Address of the bookmark: <a href="http://www.jcvi.org/cms/research/projects/cabog/overview/" rel="nofollow">http://www.jcvi.org/cms/research/projects/cabog/overview/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32905/bigmac-breaking-inaccurate-genomes-and-merging-assembled-contigs-for-long-read-metagenomic-assembly</guid>
	<pubDate>Mon, 22 May 2017 05:43:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32905/bigmac-breaking-inaccurate-genomes-and-merging-assembled-contigs-for-long-read-metagenomic-assembly</link>
	<title><![CDATA[BIGMAC : breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly]]></title>
	<description><![CDATA[<p>This tool is for users to upgrade their metagenomics assemblies using long reads. This includes fixing mis-assemblies and scaffolding/gap-filling. If you encounter any issues, please contact me at&nbsp;<a href="mailto:kklam@eecs.berkeley.edu">kklam@eecs.berkeley.edu</a>. My name is Ka-Kit Lam.</p>
<p>https://github.com/kakitone/MetaFinisherSC</p>
<p>https://github.com/kakitone/BIGMAC</p><p>Address of the bookmark: <a href="https://github.com/kakitone/BIGMAC" rel="nofollow">https://github.com/kakitone/BIGMAC</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>