<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36516?offset=300</link>
	<atom:link href="https://bioinformaticsonline.com/related/36516?offset=300" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39671/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</guid>
	<pubDate>Sat, 06 Jul 2019 03:48:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39671/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[Flye: Fast and accurate de novo assembler for single molecule sequencing reads]]></title>
	<description><![CDATA[<p><span>Flye is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package represents a complete pipeline: it takes raw PB / ONT reads as input and outputs polished contigs. Flye also includes a special mode for metagenome assembly.</span></p><p>Address of the bookmark: <a href="https://github.com/fenderglass/Flye" rel="nofollow">https://github.com/fenderglass/Flye</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41734/supernova-generates-phased-whole-genome-de-novo-assemblies-from-a-chromium-prepared-library</guid>
	<pubDate>Sun, 31 May 2020 01:59:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41734/supernova-generates-phased-whole-genome-de-novo-assemblies-from-a-chromium-prepared-library</link>
	<title><![CDATA[Supernova: generates phased, whole-genome de novo assemblies from a Chromium-prepared library.]]></title>
	<description><![CDATA[<p>Supernova generates phased, whole-genome&nbsp;<em>de novo</em>&nbsp;assemblies from a Chromium-prepared library.</p>
<p>Please see&nbsp;<a href="https://support.10xgenomics.com/de-novo-assembly/guidance/doc/achieving-success-with-de-novo-assembly">Achieving Success with De Novo Assembly</a>&nbsp;and&nbsp;<a href="https://support.10xgenomics.com/de-novo-assembly/software/overview/system-requirements">System Requirements</a>&nbsp;<em>before</em>&nbsp;creating your Chromium libraries for assembly.</p>
<p>Supernova should be run using 38-56x coverage of the genome.<br>&bull; Somewhat higher coverage is&nbsp;<em>sometimes</em>&nbsp;advantageous.<br>&bull; Supernova will exit if it finds that coverage is far from the recommended range.<br>&bull; Note that at most 2.14 billion reads are allowed.<br>&bull; Please note that we have not extensively tested genomes larger than human, and any genome above approximately 4 GB should be considered experimental and is not supported.</p><p>Address of the bookmark: <a href="https://support.10xgenomics.com/de-novo-assembly/software/pipelines/latest/using/running" rel="nofollow">https://support.10xgenomics.com/de-novo-assembly/software/pipelines/latest/using/running</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36865/perga-a-paired-end-read-guided-de-novo-assembler-for-extending-contigs-using-svm-and-look-ahead-approach</guid>
	<pubDate>Tue, 05 Jun 2018 09:57:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36865/perga-a-paired-end-read-guided-de-novo-assembler-for-extending-contigs-using-svm-and-look-ahead-approach</link>
	<title><![CDATA[PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach]]></title>
	<description><![CDATA[PERGA - Paired End Reads Guided Assembler

PERGA is a novel sequence reads guided de novo assembly approach which adopts greedy-like prediction strategy for assembling reads to contigs and scaffolds. Instead of using single-end reads to construct contig, PERGA uses paired-end reads and different read overlap sizes from O ≥ Omax to Omin to resolve the gaps and branches. Moreover, by constructing a decision model using machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. PERGA will try to extend the contigs by all feasible nucleotides and determine if these multiple extensions due to sequencing errors or repeats by using looking ahead technology, and it also try to separate the different repeats of nearby genomic regions to make the assembly result more longer and accurate.

The simulated E.coli paired-end reads data are generated using GemSim (KE McElroy, F Luciani, T Thomas. Gemsim: General, Error-Model Based Simulator of Next-Generation Sequencing Data. BMC Genomics 2012, 13:74), with coverage 50x, 60x, 100x, read lengths 100-bp, and can be downloaded from https://github.com/zhuxiao/data_PERGA.<p>Address of the bookmark: <a href="https://github.com/hitbio/PERGA" rel="nofollow">https://github.com/hitbio/PERGA</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/4590/tigers-genome-sequenced</guid>
	<pubDate>Tue, 17 Sep 2013 16:48:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/4590/tigers-genome-sequenced</link>
	<title><![CDATA[Tigers genome sequenced]]></title>
	<description><![CDATA[<p>Fifteen scientists led by Dr Jong Bhak of Genome Research Foundation, South Korea, decoded as many as 3 billion nucleotides (organic molecules that form the basic building blocks of nucleic acids, such as DNA). They identified 20,000 genes related to various functions of the tiger.&nbsp;</p><p>The biggest and perhaps most fearsome of the world's big cats, the tiger, shares 95.6 percent of its DNA with humans' cute and furry companions, domestic cats.</p><p>The new research showed that big cats have genetic mutations that enabled them to be carnivores. The team also identified mutations that allow snow leopards to thrive at high altitudes.</p><p>Reference:</p><p><a href="http://www.nbcnews.com/science/your-cat-ferocious-tigers-share-lot-95-6-percent-their-4B11182690">http://www.nbcnews.com/science/your-cat-ferocious-tigers-share-lot-95-6-percent-their-4B11182690</a></p><p><a href="http://timesofindia.indiatimes.com/home/environment/flora-fauna/Gene-mapping-of-tiger-completed/articleshow/22671681.cms">http://timesofindia.indiatimes.com/home/environment/flora-fauna/Gene-mapping-of-tiger-completed/articleshow/22671681.cms</a></p><p>Paper:</p><p><a href="http://www.nature.com/ncomms/2013/130917/ncomms3433/full/ncomms3433.html">http://www.nature.com/ncomms/2013/130917/ncomms3433/full/ncomms3433.html</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34443/opera-an-optimal-genome-scaffolding-program</guid>
	<pubDate>Mon, 27 Nov 2017 10:18:20 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34443/opera-an-optimal-genome-scaffolding-program</link>
	<title><![CDATA[Opera: An optimal genome scaffolding program]]></title>
	<description><![CDATA[<p><span>Opera (Optimal Paired-End Read Assembler) is a sequence assembly program (</span><a href="http://en.wikipedia.org/wiki/Sequence_assembly" target="_blank">http://en.wikipedia.org/wiki/Sequence_assembly&nbsp;<img src="https://a.fsdn.com/con/img/icons/external_asset.png" alt="image" style="border: 0px;"></a><span>). It uses information from paired-end or long reads to optimally order and orient contigs assembled from shotgun-sequencing reads.</span><br><br><span>An updated version called OPERA-LG has been re-engineered with features for the assembly of large and complex genomes.</span><br><br><span>Song Gao, Denis Bertrand, Burton K. H. Chia and Niranjan Nagarajan. OPERA-LG: efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees. Genome Biology, May 2016, doi: 10.1186/s13059-016-0951-y.</span><br><br><span>Song Gao, Wing-Kin Sung, Niranjan Nagarajan. Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. Journal of Computational Biology, Sept. 2011, doi:10.1089/cmb.2011.0170.</span></p>
<p><span>https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0951-y</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/operasf/" rel="nofollow">https://sourceforge.net/projects/operasf/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34418/spades-hybrid-genome-assembly</guid>
	<pubDate>Mon, 27 Nov 2017 08:05:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34418/spades-hybrid-genome-assembly</link>
	<title><![CDATA[SPAdes hybrid genome assembly]]></title>
	<description><![CDATA[<p>When you have both Illumina and Nanopore data, then SPAdes remains a good option for hybrid assembly - SPAdes was used to produce the&nbsp;<a href="https://gigascience.biomedcentral.com/articles/10.1186/s13742-015-0101-6">B fragilis assembly</a>&nbsp;by Mick Watson&rsquo;s group.</p><p>Again, running spades.py will show you the options:</p><div><pre><code>spades.py
</code></pre></div><p>This produces:</p><div><pre><code>SPAdes genome assembler v3.10.1

Usage: /usr/local/SPAdes-3.10.1-Linux/bin/spades.py [options] -o &lt;output_dir&gt;

Basic options:
-o      &lt;output_dir&gt;    directory to store all the resulting files (required)
--sc                    this flag is required for MDA (single-cell) data
--meta                  this flag is required for metagenomic sample data
--rna                   this flag is required for RNA-Seq data
--plasmid               runs plasmidSPAdes pipeline for plasmid detection
--iontorrent            this flag is required for IonTorrent data
--test                  runs SPAdes on toy dataset
-h/--help               prints this usage message
-v/--version            prints version

Input data:
--12    &lt;filename&gt;      file with interlaced forward and reverse paired-end reads
-1      &lt;filename&gt;      file with forward paired-end reads
-2      &lt;filename&gt;      file with reverse paired-end reads
-s      &lt;filename&gt;      file with unpaired reads
--pe&lt;#&gt;-12      &lt;filename&gt;      file with interlaced reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-1       &lt;filename&gt;      file with forward reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-2       &lt;filename&gt;      file with reverse reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-s       &lt;filename&gt;      file with unpaired reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-&lt;or&gt;    orientation of reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9; &lt;or&gt; = fr, rf, ff)
--s&lt;#&gt;          &lt;filename&gt;      file with unpaired reads for single reads library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-12      &lt;filename&gt;      file with interlaced reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-1       &lt;filename&gt;      file with forward reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-2       &lt;filename&gt;      file with reverse reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-s       &lt;filename&gt;      file with unpaired reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-&lt;or&gt;    orientation of reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9; &lt;or&gt; = fr, rf, ff)
--hqmp&lt;#&gt;-12    &lt;filename&gt;      file with interlaced reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-1     &lt;filename&gt;      file with forward reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-2     &lt;filename&gt;      file with reverse reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-s     &lt;filename&gt;      file with unpaired reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-&lt;or&gt;  orientation of reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9; &lt;or&gt; = fr, rf, ff)
--nxmate&lt;#&gt;-1   &lt;filename&gt;      file with forward reads for Lucigen NxMate library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--nxmate&lt;#&gt;-2   &lt;filename&gt;      file with reverse reads for Lucigen NxMate library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--sanger        &lt;filename&gt;      file with Sanger reads
--pacbio        &lt;filename&gt;      file with PacBio reads
--nanopore      &lt;filename&gt;      file with Nanopore reads
--tslr  &lt;filename&gt;      file with TSLR-contigs
--trusted-contigs       &lt;filename&gt;      file with trusted contigs
--untrusted-contigs     &lt;filename&gt;      file with untrusted contigs

Pipeline options:
--only-error-correction runs only read error correction (without assembling)
--only-assembler        runs only assembling (without read error correction)
--careful               tries to reduce number of mismatches and short indels
--continue              continue run from the last available check-point
--restart-from  &lt;cp&gt;    restart run with updated options and from the specified check-point ('ec', 'as', 'k&lt;int&gt;', 'mc')
--disable-gzip-output   forces error correction not to compress the corrected reads
--disable-rr            disables repeat resolution stage of assembling

Advanced options:
--dataset       &lt;filename&gt;      file with dataset description in YAML format
-t/--threads    &lt;int&gt;           number of threads
                                [default: 16]
-m/--memory     &lt;int&gt;           RAM limit for SPAdes in Gb (terminates if exceeded)
                                [default: 250]
--tmp-dir       &lt;dirname&gt;       directory for temporary files
                                [default: &lt;output_dir&gt;/tmp]
-k              &lt;int,int,...&gt;   comma-separated list of k-mer sizes (must be odd and
                                less than 128) [default: 'auto']
--cov-cutoff    &lt;float&gt;         coverage cutoff value (a positive float number, or 'auto', or 'off') [default: 'off']
--phred-offset  &lt;33 or 64&gt;      PHRED quality offset in the input reads (33 or 64)
                                [default: auto-detect]
</code></pre></div><p>As you can see this is also a &ldquo;pipeline&rdquo; of tools that can be switched on or off. SPAdes takes quite a long time, so for the purposes of this practical, something like this may suffice:</p><div><pre><code>spades.py -t 4 <span>\</span>
          -m 32 <span>\</span>
          -k 31,51,71 <span>\</span>
          --only-assembler <span>\</span>
          -1 miseq.1.fastq -2 miseq.2.fastq <span>\</span>
          --nanopore minion.fastq <span>\</span>
          -o hybrid_assembly
</code></pre></div><p>In turn, these parameters mean</p><ul>
<li>use 4 threads</li>
<li>max memory is 32Gb</li>
<li>use 3 kmer values to build the de bruijn graph(s) - 31, 51 and 71</li>
<li>only run the assembler, not the correction algorithm (for speed)</li>
<li>read 1 and read 2 of the MiSeq data</li>
<li>the nanopore data</li>
<li>put the output in folder &ldquo;hybrid_assembly&rdquo;</li>
</ul>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34567/jobtree-based-python-wrapper-to-run-the-genome-simulation-tool-suite-evolver</guid>
	<pubDate>Fri, 08 Dec 2017 16:26:32 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34567/jobtree-based-python-wrapper-to-run-the-genome-simulation-tool-suite-evolver</link>
	<title><![CDATA[jobTree based python wrapper to run the genome simulation tool suite Evolver]]></title>
	<description><![CDATA[<p><span>evolverSimControl</span><span>&nbsp;(</span><span>eSC</span><span>) can be used to simulate multi-chromosome genome evolution on an arbitrary phylogeny (</span><a href="http://evolution.genetics.washington.edu/phylip/newicktree.html">Newick format</a><span>). In addition to simply running evolver,&nbsp;</span><span>eSC</span><span>&nbsp;also automatically creates statistical summaries of the simulation as it runs including text and image files. Also included are convenience scripts to: check on a running simulation and see detailed status and logging information; extract fasta sequence files from the leaf nodes of a completed simulation; extract pairwise multiple alignment files (</span><a href="http://genome.ucsc.edu/FAQ/FAQformat.html#format5">.maf</a><span>) from leaf and branch nodes from a completed simulation and with the help of&nbsp;</span><a href="https://github.com/dentearl/mafTools/">mafJoin</a><span>, join them together into a single maf covering the entire simulation.</span></p><p>Address of the bookmark: <a href="https://github.com/dentearl/evolverSimControl" rel="nofollow">https://github.com/dentearl/evolverSimControl</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34620/mash-fast-genome-and-metagenome-distance-estimation-using-minhash</guid>
	<pubDate>Tue, 12 Dec 2017 17:30:12 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34620/mash-fast-genome-and-metagenome-distance-estimation-using-minhash</link>
	<title><![CDATA[Mash: fast genome and metagenome distance estimation using MinHash]]></title>
	<description><![CDATA[<p>Mash is normally distributed as a dependency-free binary for Linux or OSX (see&nbsp;<a href="https://github.com/marbl/Mash/releases">https://github.com/marbl/Mash/releases</a>). This source distribution is intended for other operating systems or for development. Mash requires c++11 to build, which is available in and GCC &gt;= 4.8 and OSX &gt;= 10.7.</p>
<p>See&nbsp;<a href="http://mash.readthedocs.org/">http://mash.readthedocs.org</a>&nbsp;for more information.</p><p>Address of the bookmark: <a href="https://github.com/marbl/Mash/releases" rel="nofollow">https://github.com/marbl/Mash/releases</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35131/giggle-a-search-engine-for-large-scale-integrated-genome-analysis</guid>
	<pubDate>Wed, 10 Jan 2018 03:10:45 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35131/giggle-a-search-engine-for-large-scale-integrated-genome-analysis</link>
	<title><![CDATA[GIGGLE: a search engine for large-scale integrated genome analysis]]></title>
	<description><![CDATA[<p><span>GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (</span><a href="https://github.com/ryanlayer/giggle">https://github.com/ryanlayer/giggle</a><span>) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.</span></p>
<p>https://www.nature.com/articles/nmeth.4556</p><p>Address of the bookmark: <a href="https://github.com/ryanlayer/giggle" rel="nofollow">https://github.com/ryanlayer/giggle</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35432/mummer4-a-fast-and-versatile-genome-alignment-system</guid>
	<pubDate>Sat, 03 Feb 2018 04:59:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35432/mummer4-a-fast-and-versatile-genome-alignment-system</link>
	<title><![CDATA[MUMmer4: A fast and versatile genome alignment system]]></title>
	<description><![CDATA[<p><span>MUMmer4, a substantially improved version of MUMmer that addresses genome size constraints by changing the 32-bit suffix tree data structure at the core of MUMmer to a 48-bit suffix array, and that offers improved speed through parallel processing of input query sequences. With a theoretical limit on the input size of 141Tbp, MUMmer4 can now work with input sequences of any biologically realistic length. We show that as a result of these enhancements, the&nbsp;</span><span>nucmer</span><span>&nbsp;program in MUMmer4 is easily able to handle alignments of large genomes;&nbsp;</span></p><p>Address of the bookmark: <a href="https://mummer4.github.io/" rel="nofollow">https://mummer4.github.io/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>