<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/30242?offset=120</link>
	<atom:link href="https://bioinformaticsonline.com/related/30242?offset=120" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36935/assemblytics-delta-file-to-analyze-alignments-of-an-assembly-to-another-assembly-or-a-reference-genome</guid>
	<pubDate>Thu, 14 Jun 2018 07:31:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36935/assemblytics-delta-file-to-analyze-alignments-of-an-assembly-to-another-assembly-or-a-reference-genome</link>
	<title><![CDATA[assemblytics: delta file to analyze alignments of an assembly to another assembly or a reference genome]]></title>
	<description><![CDATA[Download and install MUMmer
Align your assembly to a reference genome using nucmer (from MUMmer package)
$ nucmer -maxmatch -l 100 -c 500 REFERENCE.fa ASSEMBLY.fa -prefix OUT
Consult the MUMmer manual if you encounter problems

Optional: Gzip the delta file to speed up upload (usually 2-4X faster)
$ gzip OUT.delta
Then use the OUT.delta.gz file for upload.
Upload the .delta or .delta.gz file to Assemblytics
Important: Use only contigs rather than scaffolds from the assembly. This will prevent false positives when the number of Ns in the scaffolded sequence does not match perfectly to the distance in the reference.
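
If only a scaffolded FASTA is available, one way to recover contigs is to split each scaffold at its runs of Ns. The following is a minimal Python sketch and is not part of Assemblytics or MUMmer; the function name split_scaffold and the min_gap threshold are assumptions made for illustration.

```python
import re

def split_scaffold(name, sequence, min_gap=1):
    """Split one scaffold into contigs at runs of at least min_gap Ns."""
    gap = re.compile("[Nn]{%d,}" % min_gap)
    # Empty pieces appear when a scaffold starts or ends with Ns; drop them.
    pieces = [piece for piece in gap.split(sequence) if piece]
    return [("%s_contig%d" % (name, i), piece)
            for i, piece in enumerate(pieces, start=1)]

# Toy scaffold for illustration only.
contigs = split_scaffold("scaffold1", "ACGTNNNNNGGCCNNTTAA")
for contig_name, contig_seq in contigs:
    print(contig_name, contig_seq)
```

Raising min_gap keeps short N runs (for example single ambiguous bases) inside a contig instead of splitting on them.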

The required unique sequence length acts as an anchor: it determines whether an aligned sequence is unique enough to call variants from safely. It plays the same role that the mapping quality filter plays in read alignment.

http://assemblytics.com/<p>Address of the bookmark: <a href="http://assemblytics.com/" rel="nofollow">http://assemblytics.com/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29004/r-chie</guid>
	<pubDate>Thu, 01 Sep 2016 11:47:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29004/r-chie</link>
	<title><![CDATA[R-chie]]></title>
	<description><![CDATA[<p><strong>R-chie</strong><span>&nbsp;lets you draw arc diagrams of RNA secondary structures, compare and overlay two structures, rank and display base pairs in colour, and visualize the corresponding multiple sequence alignments and co-variation information.</span><br><strong>R4RNA</strong><span>&nbsp;is the R package powering R-chie, available for&nbsp;</span><a href="http://www.e-rna.org/r-chie/download.cgi">download</a><span>&nbsp;and local use for more customized figures and scripting.</span></p>
<p>http://www.e-rna.org/r-chie/plot.cgi?eg=single</p><p>Address of the bookmark: <a href="http://www.e-rna.org/r-chie/plot.cgi?eg=single" rel="nofollow">http://www.e-rna.org/r-chie/plot.cgi?eg=single</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/29108/assembly-tutorial-ppt</guid>
	<pubDate>Wed, 07 Sep 2016 03:12:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/29108/assembly-tutorial-ppt</link>
	<title><![CDATA[Assembly tutorial PPT]]></title>
	<description><![CDATA[<p>Saved Cornell University assembly workshop PPT.</p><p>Reference:&nbsp;</p><p>http://cbsu.tc.cornell.edu/lab/doc/assembly_workshop_20150420_lecture1.pdf</p>]]></description>
	<dc:creator>Jit</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/29108" length="1617402" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38063/referee-genome-assembly-quality-scores</guid>
	<pubDate>Sun, 04 Nov 2018 16:44:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38063/referee-genome-assembly-quality-scores</link>
	<title><![CDATA[Referee: Genome assembly quality scores]]></title>
	<description><![CDATA[<p>Modern genome sequencing technologies provide a succinct measure of quality at each position in every read. However, all of this information is lost in the assembly process. Referee summarizes the quality information from the reads that map to a site in an assembled genome to calculate a quality score for each position in the genome assembly.</p>
<p>We accomplish this by first calculating genotype likelihoods for every site. For a given site in a diploid genome, there are 10 possible genotypes (AA, AC, AG, AT, CC, CG, CT, GG, GT, TT). Referee takes as input the genotype likelihoods calculated for all 10 genotypes given the called reference base at each position.</p>
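<p>The count of 10 follows from treating a diploid genotype as an unordered pair of bases, so AC and CA are the same genotype. A minimal Python sketch of that enumeration (illustrative only, not code from Referee):</p>

```python
from itertools import combinations_with_replacement

BASES = "ACGT"

# Unordered pairs of alleles: AC and CA are the same diploid genotype.
genotypes = ["".join(pair) for pair in combinations_with_replacement(BASES, 2)]

print(len(genotypes))
print(genotypes)
```

<p>With 4 bases this gives 4 homozygous and 6 heterozygous genotypes, in the same order as the list above.</p>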
<h3>Referee is a program to calculate a quality score for every position in a genome assembly. This allows for easy filtering of low quality sites for any downstream analysis.</h3>
<p>https://github.com/gwct/referee</p><p>Address of the bookmark: <a href="https://gwct.github.io/referee/#" rel="nofollow">https://gwct.github.io/referee/#</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29144/fermi</guid>
	<pubDate>Fri, 09 Sep 2016 05:37:13 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29144/fermi</link>
	<title><![CDATA[FERMI]]></title>
	<description><![CDATA[<p><span>Fermi is a de novo assembler with a particular focus on assembling Illumina&nbsp;</span><span>short sequence reads from a mammal-sized genome. In addition to the role of a&nbsp;</span><span>typical assembler, fermi also aims to preserve heterozygotes which are often&nbsp;</span><span>collapsed by other assemblers. Its ultimate goal is to find a minimal set of&nbsp;</span><span>unitigs to represent all the information in raw reads.</span><br><br><span>Fermi follows the overlap-layout-consensus paradigm and uses the FM-DNA-index&nbsp;</span><span>(FMD-index) as the key data structure. It is inspired by the string graph&nbsp;</span><span>assembler (Simpson and Durbin, 2010 and 2012) and has a similar workflow.</span><br><br><span>As a typical de novo assembler, fermi tends to produce contigs with slightly&nbsp;</span><span>longer N50. However, the major weakness of fermi is the high misassembly rate.&nbsp;</span><span>Although fermi provides a tool to fix misassemblies by using paired-end reads&nbsp;</span><span>to achieve an accuracy comparable to other assemblers, this is not a favorable&nbsp;</span><span>solution.</span><br><br><span>Fermi is designed to be used on a multi-core Linux machine with large shared&nbsp;</span><span>memory. The easiest way to run fermi is to use the run-fermi.pl script. It&nbsp;</span><span>generates a Makefile. The actual assembly is done by invoking make. Interrupted&nbsp;</span><span>assembly processes can be resumed. Here is an example:</span><br><br><span>run-fermi.pl -dAPe ./fermi -p NA12878 -t16 -f18 reads*.fq.gz &gt; NA12878.mak</span><br><span>make -f NA12878.mak -j16</span></p><p>Address of the bookmark: <a href="https://github.com/lh3/fermi" rel="nofollow">https://github.com/lh3/fermi</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29500/genomescope-open-source-web-tool-to-rapidly-estimate-the-overall-characteristics-of-a-genome-including-genome-size-heterozygosity-rate-and-repeat-content-from-unprocessed-short-reads</guid>
	<pubDate>Fri, 21 Oct 2016 05:46:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29500/genomescope-open-source-web-tool-to-rapidly-estimate-the-overall-characteristics-of-a-genome-including-genome-size-heterozygosity-rate-and-repeat-content-from-unprocessed-short-reads</link>
	<title><![CDATA[GenomeScope: open-source web tool to rapidly estimate the overall characteristics of a genome, including genome size, heterozygosity rate, and repeat content from unprocessed short reads]]></title>
	<description><![CDATA[<div>
<div>
<div>
<div id="content-block-markup">
<div>
<div id="abstract-1">
<p id="p-2">Summary: GenomeScope is an open-source web tool to rapidly estimate the overall characteristics of a genome, including genome size, heterozygosity rate, and repeat content from unprocessed short reads. These features are essential for studying genome evolution, and help to choose parameters for downstream analysis. We demonstrate its accuracy on 324 simulated and 16 real datasets with a wide range in genome sizes, heterozygosity levels, and error rates. Availability and Implementation: http://qb.cshl.edu/genomescope/, https://github.com/schatzlab/genomescope.git</p>
</div>
<span></span></div>
<span></span></div>
</div>
</div>
</div><p>Address of the bookmark: <a href="http://qb.cshl.edu/genomescope/" rel="nofollow">http://qb.cshl.edu/genomescope/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</guid>
	<pubDate>Tue, 22 Nov 2016 04:51:39 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</link>
	<title><![CDATA[Maq: Mapping and Assembly with Quality]]></title>
	<description><![CDATA[<p><strong>Maq</strong>&nbsp;stands for&nbsp;<em>Mapping and Assembly with Quality</em>. It builds assemblies by mapping short reads to reference sequences. Maq is a project hosted by&nbsp;<a href="http://sourceforge.net/">SourceForge.net</a>. The project page is available at&nbsp;<a href="http://sourceforge.net/projects/maq/">http://sourceforge.net/projects/maq/</a>. Maq was previously known as mapass2.</p>
<h2>Run Maq Now</h2>
<p>Follow these steps to try Maq. All you need is a reference sequence file in the FASTA format.</p>
<ol>
<li>Prepare a reference sequence (ref.fasta), preferably a bacterial genome.</li>
<li>Download maq, maq-data and maqview from the&nbsp;<a href="http://sourceforge.net/project/showfiles.php?group_id=191815">download page</a>.</li>
<li>Copy maq, maq.pl and maq_eval.pl to a directory on your $PATH, or keep all three in the same directory.</li>
<li>Simulate diploid reference and read sequences, map reads, call variants and evaluate the results in one go:
<pre>maq.pl demo ref.fasta calib-30.dat
</pre>
where&nbsp;<em>calib-30.dat</em>&nbsp;is contained in maq-data.</li>
<li>View the alignment:
<pre>cd maqdemo/easyrun;
maqindex -i -c consensus.cns all.map;
maqview -c consensus.cns all.map</pre>
</li>
</ol>
<p><strong>Even for advanced maq users, running <code>maq.pl demo</code> is recommended. You may find something helpful.</strong></p><p>Address of the bookmark: <a href="http://maq.sourceforge.net" rel="nofollow">http://maq.sourceforge.net</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26322/liftover</guid>
	<pubDate>Mon, 08 Feb 2016 15:45:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26322/liftover</link>
	<title><![CDATA[liftover]]></title>
	<description><![CDATA[<p><span>Convenient conversions between genome assemblies.&nbsp;The liftover package makes it easy to remap genomic coordinates to a different genome assembly.</span></p>
<p><span>More at https://github.com/aaronwolen/liftover<br></span></p>
<p><span>https://www.bioconductor.org/help/workflows/liftOver/</span></p><p>Address of the bookmark: <a href="https://github.com/aaronwolen/liftover" rel="nofollow">https://github.com/aaronwolen/liftover</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</guid>
	<pubDate>Wed, 23 Mar 2016 05:53:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</link>
	<title><![CDATA[RNA-Seq De novo Assembly Using Trinity]]></title>
	<description><![CDATA[<p>Trinity, developed at the <a href="http://www.broadinstitute.org">Broad Institute</a> and the <a href="http://www.cs.huji.ac.il">Hebrew University of Jerusalem</a>, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works like so:</p>
<ul>
<li>
<p><em>Inchworm</em> assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.</p>
</li>
<li>
<p><em>Chrysalis</em> groups the Inchworm contigs into clusters and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.</p>
</li>
<li>
<p><em>Butterfly</em> then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.</p>
</li>
</ul>
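<p>To make the de Bruijn graph idea concrete, here is a minimal Python sketch that builds one from a couple of toy reads. It only illustrates the data structure; it is not Trinity's implementation, and the function name <code>de_bruijn_graph</code> and the choice of k are assumptions made for this example.</p>

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph

reads = ["ACGTAC", "CGTACG"]  # toy reads, not real RNA-seq data
graph = de_bruijn_graph(reads, k=4)
for node in sorted(graph):
    print(node, sorted(graph[node]))
```

<p>Overlapping reads contribute the same edges, so shared sequence collapses into a single path, while alternative isoforms would show up as branches in the graph.</p>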
<p>More at https://github.com/trinityrnaseq/trinityrnaseq/wiki</p>
<hr>
<p>Download Trinity <a href="https://github.com/trinityrnaseq/trinityrnaseq/releases">here</a>.</p>
<p>Build Trinity by typing 'make' in the base installation directory.</p>
<p>Assemble RNA-Seq data like so:</p>
<pre><code> Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G 
</code></pre>
<p>The assembled transcripts are written to 'trinity_out_dir/Trinity.fasta'.</p><p>Address of the bookmark: <a href="https://github.com/trinityrnaseq/trinityrnaseq/wiki" rel="nofollow">https://github.com/trinityrnaseq/trinityrnaseq/wiki</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</guid>
	<pubDate>Mon, 19 Dec 2016 14:20:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</link>
	<title><![CDATA[pyScaf]]></title>
	<description><![CDATA[<p>pyScaf orders contigs from genome assemblies utilising several types of information:</p>
<ul>
<li>paired-end (PE) and/or mate-pair libraries (<a href="https://github.com/lpryszcz/pyScaf#ngs-based-scaffolding">NGS-based mode</a>)</li>
<li>long reads (<a href="https://github.com/lpryszcz/pyScaf#scaffolding-based-on-long-reads">long read-based mode</a>)</li>
<li>synteny to the genome of some related species (<a href="https://github.com/lpryszcz/pyScaf#reference-based-scaffolding">reference-based mode</a>)</li>
</ul>
<p>Scaffolding&nbsp;</p>
<p>In reference-based mode, pyScaf uses synteny to the genome of a closely related species to order contigs and estimate distances between adjacent contigs.</p>
<p>Contigs are aligned globally (end-to-end) onto reference chromosomes, ignoring:</p>
<ul>
<li>matches not satisfying cut-offs (<code>--identity</code>&nbsp;and&nbsp;<code>--overlap</code>)</li>
<li>suboptimal matches (only best match of each query to reference is kept)</li>
<li>matches that overlap other matches on the reference (these are removed).</li>
</ul>
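<p>The three filters can be sketched in Python as follows. This is a hedged illustration, not pyScaf's code: the <code>Match</code> record, the <code>filter_matches</code> function, the greedy best-score-first handling of overlaps, and the cut-off values in the example are all assumptions made for this sketch.</p>

```python
from dataclasses import dataclass

@dataclass
class Match:
    query: str
    ref: str
    ref_start: int   # half-open interval on the reference
    ref_end: int
    identity: float  # fraction of identical bases in the alignment
    overlap: float   # fraction of the query covered by the alignment
    score: float

def filter_matches(matches, min_identity, min_overlap):
    # 1. Drop matches that fail the identity or overlap cut-off.
    passing = [m for m in matches
               if m.identity >= min_identity and m.overlap >= min_overlap]
    # 2. Keep only the best-scoring match for each query.
    best = {}
    for m in passing:
        if m.query not in best or m.score > best[m.query].score:
            best[m.query] = m
    # 3. Walk matches from best to worst score and skip any that overlap
    #    an already accepted match on the same reference sequence.
    kept = []
    for m in sorted(best.values(), key=lambda m: m.score, reverse=True):
        clash = any(m.ref == k.ref and m.ref_end > k.ref_start
                    and k.ref_end > m.ref_start for k in kept)
        if not clash:
            kept.append(m)
    return sorted(kept, key=lambda m: (m.ref, m.ref_start))

# Hypothetical matches for illustration only.
matches = [
    Match("c1", "chr1", 0, 1000, 0.95, 0.90, 900),
    Match("c1", "chr2", 0, 500, 0.80, 0.70, 300),     # suboptimal match for c1
    Match("c2", "chr1", 800, 1800, 0.90, 0.85, 700),  # overlaps c1 on chr1
    Match("c3", "chr1", 2000, 3000, 0.20, 0.90, 150), # fails the identity cut-off
    Match("c4", "chr1", 1000, 1900, 0.88, 0.80, 600),
]
kept = filter_matches(matches, min_identity=0.33, min_overlap=0.66)
print([m.query for m in kept])
```

<p>The surviving matches, sorted by reference position, give the contig order along each reference chromosome.</p>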
<p>In preliminary tests, pyScaf performed superbly on simulated heterozygous genomes based on&nbsp;<em>C. parapsilosis</em>&nbsp;(13 Mb; CANPA) and&nbsp;<em>A. thaliana</em>&nbsp;(119 Mb; ARATH) chromosomes, always reconstructing all chromosomes correctly for CANPA and nearly always for ARATH (<a href="https://www.dropbox.com/sh/bb7lwggo40xrwtc/AAAZ7pByVQQQ-WhUXZVeJaZVa/pyScaf?dl=0">Figures in dropbox</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=2036953672">CANPA table</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=1920757821">ARATH table</a>).<br>Runs took ~0.5 min for CANPA on&nbsp;<code>4 CPUs</code>&nbsp;and ~2 min for ARATH on&nbsp;<code>16 CPUs</code>.</p>
<p><span>Important remarks:</span></p>
<ul>
<li>Reduce your assembly beforehand (fasta2homozygous.py), as any redundancy will likely break synteny.</li>
<li>pyScaf works better with contigs than scaffolds, as scaffolds are often affected by mis-assemblies (no&nbsp;<em>de novo assembler</em>&nbsp;/ scaffolder is perfect...), which breaks synteny.</li>
<li>pyScaf works very well if divergence between reference genome and assembled contigs is below 20% at nucleotide level.</li>
<li>pyScaf deals with large rearrangements, i.e. deletions, insertions, inversions, and translocations.&nbsp;<span>Note, however, that this is an experimental implementation!</span></li>
<li>Consider closing gaps after scaffolding.</li>
</ul><p>Address of the bookmark: <a href="https://github.com/lpryszcz/pyScaf" rel="nofollow">https://github.com/lpryszcz/pyScaf</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>

</channel>
</rss>