<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37409?offset=110</link>
	<atom:link href="https://bioinformaticsonline.com/related/37409?offset=110" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</guid>
	<pubDate>Tue, 12 Dec 2017 17:23:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</link>
	<title><![CDATA[MashMap: a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s)]]></title>
	<description><![CDATA[<p><span>MashMap is a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s). It maps a query sequence against a reference region if and only if its estimated alignment identity is above a specified threshold. It does not compute the alignments explicitly, but rather estimates a&nbsp;</span><em>k</em><span>-mer based&nbsp;</span><a href="https://en.wikipedia.org/wiki/Jaccard_index">Jaccard similarity</a><span>&nbsp;using a combination of&nbsp;</span><a href="http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/p76-schleimer.pdf">Winnowing</a><span>&nbsp;and&nbsp;</span><a href="https://en.wikipedia.org/wiki/MinHash">MinHash</a><span>. This is then converted to an estimate of sequence identity using the&nbsp;</span><a href="http://mash.readthedocs.org/">Mash</a><span>&nbsp;distance. An appropriate&nbsp;</span><em>k</em><span>-mer sampling rate is automatically determined given minimum local alignment length and identity thresholds. The efficiency of the algorithm improves as both of these thresholds are increased.</span></p><p>Address of the bookmark: <a href="https://github.com/marbl/MashMap" rel="nofollow">https://github.com/marbl/MashMap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36533/mecat-fast-mapping-error-correction-and-de-novo-assembly-for-single-molecule-sequencing-reads</guid>
	<pubDate>Fri, 11 May 2018 05:07:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36533/mecat-fast-mapping-error-correction-and-de-novo-assembly-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads]]></title>
	<description><![CDATA[<p>MECAT is an ultra-fast Mapping, Error Correction and de novo Assembly Tools for single molecula sequencing (SMRT) reads. MECAT employs novel alignment and error correction algorithms that are much more efficient than the state of art of aligners and error correction tools. MECAT can be used for effectively de novo assemblying large genomes. For example, on a 32-thread computer with 2.0 GHz CPU , MECAT takes 9.5 days to assemble a human genome based on 54x SMRT data, which is 40 times faster than the current&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>. MECAT performance were compared with&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>,&nbsp;<a href="https://github.com/PacificBiosciences/falcon">FALCON</a>&nbsp;and&nbsp;<a href="http://canu.readthedocs.io/en/latest/">Canu(v1.3)</a>&nbsp;in five real datasets. The quality of assembled contigs produced by MECAT is the same or better than that of the&nbsp;<a href="http://cbcb.umd.edu/software/pbcr/mhap/">PBcR-Mhap pipeline</a>&nbsp;and&nbsp;<a href="https://github.com/PacificBiosciences/falcon">FALCON</a>.&nbsp;</p>
<p>https://www.nature.com/articles/nmeth.4432</p><p>Address of the bookmark: <a href="https://github.com/xiaochuanle/MECAT" rel="nofollow">https://github.com/xiaochuanle/MECAT</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36867/cerulean-a-hybrid-assembly-using-high-throughput-short-and-long-reads</guid>
	<pubDate>Tue, 05 Jun 2018 10:10:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36867/cerulean-a-hybrid-assembly-using-high-throughput-short-and-long-reads</link>
	<title><![CDATA[Cerulean: A hybrid assembly using high throughput short and long reads]]></title>
	<description><![CDATA[Cerulean extends contigs assembled using short read datasets like Illumina paired-end reads using long reads like PacBio RS long reads.

Cerulean v0.1 has been implemented with bacterial genomes in mind.

The method is fully described in Deshpande, V., Fung, E. D., Pham, S., &amp; Bafna, V. (2013). Cerulean: A hybrid assembly using high throughput short and long reads. arXiv preprint arXiv:1307.7933.
http://arxiv.org/abs/1307.7933<p>Address of the bookmark: <a href="https://sourceforge.net/projects/ceruleanassembler/" rel="nofollow">https://sourceforge.net/projects/ceruleanassembler/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37223/chopstitch-exon-annotation-and-splice-graph-construction-using-transcriptome-assembly-and-whole-genome-sequencing-data</guid>
	<pubDate>Tue, 03 Jul 2018 04:14:52 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37223/chopstitch-exon-annotation-and-splice-graph-construction-using-transcriptome-assembly-and-whole-genome-sequencing-data</link>
	<title><![CDATA[ChopStitch: exon annotation and splice graph construction using transcriptome assembly and whole genome sequencing data]]></title>
	<description><![CDATA[ChopStitch is a new method for finding putative exons and constructing splice graphs using an assembled transcriptome and whole genome shotgun sequencing (WGSS) data. ChopStitch identifies exon-exon boundaries in de novo assembled RNA-seq data with the help of a Bloom filter that represents the k-mer spectrum of WGSS reads. The algorithm also detects base substitutions in transcript sequences corresponding to sequencing or assembly errors, haplotype variations, or putative RNA editing events. The primary output of our tool is a FASTA file containing putative exons. Further, exon edges are interrogated for alternative exon-exon boundaries to detect transcript isoforms, which are reported as splice graphs in dot output format.<p>Address of the bookmark: <a href="https://github.com/bcgsc/ChopStitch" rel="nofollow">https://github.com/bcgsc/ChopStitch</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37785/haplomerger2-rebuilding-both-haploid-sub-assemblies-from-high-heterozygosity-diploid-genome-assembly</guid>
	<pubDate>Thu, 27 Sep 2018 07:08:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37785/haplomerger2-rebuilding-both-haploid-sub-assemblies-from-high-heterozygosity-diploid-genome-assembly</link>
	<title><![CDATA[HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly]]></title>
	<description><![CDATA[<p><span><span>HM2 can process any diploid assemblies, but it is especially suitable for diploid assemblies with high heterozygosity (&ge;3%), which can be difficult for other tools. This pipeline also implements flexible and sensitive assembly error detection, a hierarchical scaffolding procedure and a reliable gap-closing method for haploid sub-assemblies.</span></span></p>
<p><span>Source code, executables and the testing dataset are freely available at&nbsp;</span><a href="https://github.com/mapleforest/HaploMerger2/releases/" target="">https://github.com/mapleforest/HaploMerger2/releases/</a><span>.</span></p><p>Address of the bookmark: <a href="https://github.com/mapleforest/HaploMerger2/releases/" rel="nofollow">https://github.com/mapleforest/HaploMerger2/releases/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/38886/evaluation-of-genome-assembly-software-based-on-long-reads</guid>
	<pubDate>Fri, 01 Feb 2019 11:55:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/file/view/38886/evaluation-of-genome-assembly-software-based-on-long-reads</link>
	<title><![CDATA[Evaluation of genome assembly software based on long reads]]></title>
	<description><![CDATA[<p>TGS technologies have been used to produce highly accurate de novo assemblies of hundreds of microbial genomes and highly contiguous reconstructions of many dozens of plant and animal genomes, enabling new insights into evolution and sequence diversity. They have also been applied to resequencing analyses, to create detailed maps of structural variations in many species. Also, these new technologies have been used to fill in many of the gaps in the human reference genome.</p><p>In this report, we compare and evaluate several genome assembly software based on TSG technology. The experimentation has been performed on 4 reference genomes and the results evaluated with the QUAST software. The 11 software that have been evaluated are: Celera Assembler , Falcon , Miniasm, Newbler , SGA Assembler, Smartdenovo, Abruijn, Ra, DBG2OLC, Spades and Cerulean. The first 8 software use only long reads, while the 3 last software can merge long and short reads</p>]]></description>
	<dc:creator>BioStar</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/38886" length="382699" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39253/gmass-a-novel-measure-for-genomeassembly-structural-similarity</guid>
	<pubDate>Sun, 14 Apr 2019 20:35:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39253/gmass-a-novel-measure-for-genomeassembly-structural-similarity</link>
	<title><![CDATA[GMASS: a novel measure for genomeassembly structural similarity]]></title>
	<description><![CDATA[<div id="Abstract">
<div id="ASec3">
<p id="Par3">The GMASS score is a novel measure for representing structural similarity between two assemblies. It will contribute to the understanding of assembly output and developing de novo assemblers.</p>
<p><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2710-z">https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2710-z</a></p>
</div>
</div><p>Address of the bookmark: <a href="http://bioinfo.konkuk.ac.kr/GMASS/htdocs/syncircos.php" rel="nofollow">http://bioinfo.konkuk.ac.kr/GMASS/htdocs/syncircos.php</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41592/refka-a-fast-and-efficient-long-read-genome-assembly-approach-for-large-and-complex-genomes</guid>
	<pubDate>Fri, 01 May 2020 03:00:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41592/refka-a-fast-and-efficient-long-read-genome-assembly-approach-for-large-and-complex-genomes</link>
	<title><![CDATA[RefKA: A fast and efficient long-read genome assembly approach for large and complex genomes]]></title>
	<description><![CDATA[<p><span>RefKA, a reference-based approach for long read genome assembly. This approach relies on breaking up a closely related reference genome into bins, aligning k-mers unique to each bin with PacBio reads, and then assembling each bin in parallel followed by a final bin-stitching step.</span></p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://github.com/AppliedBioinformatics/RefKA" rel="nofollow">https://github.com/AppliedBioinformatics/RefKA</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26322/liftover</guid>
	<pubDate>Mon, 08 Feb 2016 15:45:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26322/liftover</link>
	<title><![CDATA[liftover]]></title>
	<description><![CDATA[<p><span>Convenient conversions between genome assemblie.&nbsp;The liftover package makes it easy to remap genomic coordinates to a different genome assembly. </span></p>
<p><span>More at https://github.com/aaronwolen/liftover<br></span></p>
<p><span>https://www.bioconductor.org/help/workflows/liftOver/</span></p><p>Address of the bookmark: <a href="https://github.com/aaronwolen/liftover" rel="nofollow">https://github.com/aaronwolen/liftover</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</guid>
	<pubDate>Wed, 23 Mar 2016 05:53:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</link>
	<title><![CDATA[RNA-Seq De novo Assembly Using Trinity]]></title>
	<description><![CDATA[<p>Trinity, developed at the <a href="http://www.broadinstitute.org">Broad Institute</a> and the <a href="http://www.cs.huji.ac.il">Hebrew University of Jerusalem</a>, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works like so:</p>
<ul>
<li>
<p><em>Inchworm</em> assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.</p>
</li>
<li>
<p><em>Chrysalis</em> clusters the Inchworm contigs into clusters and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptonal complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.</p>
</li>
<li>
<p><em>Butterfly</em> then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that corresponds to paralogous genes.</p>
</li>
</ul>
<p>More at https://github.com/trinityrnaseq/trinityrnaseq/wiki</p>
<p>......................................................................................................................................</p>
<p>Download Trinity <a href="https://github.com/trinityrnaseq/trinityrnaseq/releases">here</a>.</p>
<p>Build Trinity by typing 'make' in the base installation directory.</p>
<p>Assemble RNA-Seq data like so:</p>
<pre><code> Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G 
</code></pre>
<p>Find assembled transcripts as: 'trinity_out_dir/Trinity.fasta'</p><p>Address of the bookmark: <a href="https://github.com/trinityrnaseq/trinityrnaseq/wiki" rel="nofollow">https://github.com/trinityrnaseq/trinityrnaseq/wiki</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>

</channel>
</rss>