<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/30701?offset=100</link>
	<atom:link href="https://bioinformaticsonline.com/related/30701?offset=100" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28997/braker-pipeline-for-fully-automated-prediction-of-protein-coding-genes-with-genemark-eset-and-augustus-in-novel-eukaryotic-genomes</guid>
	<pubDate>Thu, 01 Sep 2016 08:02:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28997/braker-pipeline-for-fully-automated-prediction-of-protein-coding-genes-with-genemark-eset-and-augustus-in-novel-eukaryotic-genomes</link>
	<title><![CDATA[BRAKER: pipeline for fully automated prediction of protein coding genes with GeneMark-ES/ET and AUGUSTUS in novel eukaryotic genomes]]></title>
	<description><![CDATA[<p><span>Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a work flow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction.</span></p>
<p>http://www.ncbi.nlm.nih.gov/pubmed/26559507</p><p>Address of the bookmark: <a href="http://bioinf.uni-greifswald.de/bioinf/braker/" rel="nofollow">http://bioinf.uni-greifswald.de/bioinf/braker/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29029/ngs-tutorial</guid>
	<pubDate>Mon, 05 Sep 2016 09:50:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29029/ngs-tutorial</link>
	<title><![CDATA[NGS Tutorial]]></title>
	<description><![CDATA[<p><span>These tutorials are written for the hundreds of bioinformaticians trying to cope with large volumes of next-generation sequencing (NGS) data. NGS technologies brought a dramatic shift in the world of sequencing. Merely five years ago, genome sequencing of higher eukaryotes was a very expensive endeavor. To get a genome of interest sequenced, hundreds of scientists had to raise funds together by writing a joint white paper and petitioning various government agencies. The tasks of sequencing and assembly were handled by dedicated sequencing facilities, of which only a few existed around the globe. Naturally, capacity at those facilities was significantly constrained by the high volume of requests.</span></p><p>Address of the bookmark: <a href="http://www.homolog.us/Tutorials/index.php" rel="nofollow">http://www.homolog.us/Tutorials/index.php</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</guid>
	<pubDate>Mon, 10 Oct 2016 08:56:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</link>
	<title><![CDATA[PHYMMBL]]></title>
	<description><![CDATA[<p><span>Metagenomics sequencing projects collect samples of DNA from uncharacterized environments that may contain hundreds or even thousands of species. One of the main challenges in analyzing a metagenome is phylogenetic classification of raw sequence reads into groups representing the same or similar species. Such classification is a useful prerequisite for genome assembly and for analysis of the biological diversity present in a sample. The newest sequencing technologies have simultaneously made metagenomics easier, by making the sequencing process faster, and more difficult, by producing shorter read lengths than previous technologies. Methods for classifying sequences as short as 100 base pairs (bp) have until now been relatively inaccurate, requiring metagenomics projects to use older, long-read technologies.&nbsp;</span><strong>Phymm</strong><span>, a new classification approach for metagenomics data which uses interpolated Markov models (IMMs) to taxonomically classify DNA sequences, can accurately classify reads as short as 100 bp. Its accuracy for short reads represents a significant leap forward over previous composition-based classification methods.&nbsp;</span><strong>PhymmBL</strong><span>&nbsp;(rhymes with "thimble"), the hybrid classifier included in this distribution which combines analysis from both Phymm and&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/BLAST">BLAST</a><span>, produces even higher accuracy.</span></p><p>Address of the bookmark: <a href="http://www.cbcb.umd.edu/software/phymm/" rel="nofollow">http://www.cbcb.umd.edu/software/phymm/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30153/e-mem-efficient-computation-of-maximal-exact-matches</guid>
	<pubDate>Thu, 15 Dec 2016 09:30:43 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30153/e-mem-efficient-computation-of-maximal-exact-matches</link>
	<title><![CDATA[E-MEM: Efficient computation of Maximal Exact Matches]]></title>
	<description><![CDATA[<p>E-MEM is a C++/OpenMP program designed to efficiently compute MEMs between large genomes. See the README file for instructions on how to use E-MEM.&nbsp;<br><br>E-MEM source code</p>
<p>The source code can be downloaded&nbsp;<a href="http://www.csd.uwo.ca/~ilie/E-MEM/e-mem.zip">here</a>.&nbsp;<br><br>If you use E-MEM, please cite:</p>
<ul>
<li>N. Khiste, L. Ilie, E-MEM: Efficient computation of Maximal Exact Matches for very large genomes,&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/31/4/509.short">Bioinformatics</a>&nbsp;<strong>31</strong>(4) (2015) 509-514.</li>
</ul>
<p>For any questions, please contact Lucian Ilie:&nbsp;<a href="mailto:ilie@uwo.ca">ilie@uwo.ca</a>&nbsp;</p><p>Address of the bookmark: <a href="http://www.csd.uwo.ca/~ilie/E-MEM/" rel="nofollow">http://www.csd.uwo.ca/~ilie/E-MEM/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30304/mcscan</guid>
	<pubDate>Thu, 22 Dec 2016 03:53:58 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30304/mcscan</link>
	<title><![CDATA[MCscan]]></title>
	<description><![CDATA[<p><span>MCscan is a computer program that can simultaneously scan multiple genomes to identify homologous chromosomal regions and subsequently align these regions using genes as anchors. This is the toolset for generating the synteny correspondences in the&nbsp;</span><a href="http://chibba.agtec.uga.edu/duplication">Plant Genome Duplication Database</a><span>. It is intended as an easy-to-use and quick way to identify conserved gene arrays both within the same genome and across different genomes.</span></p>
<p><span>More at&nbsp;http://chibba.agtec.uga.edu/duplication/mcscan/</span></p><p>Address of the bookmark: <a href="http://chibba.agtec.uga.edu/duplication/mcscan/" rel="nofollow">http://chibba.agtec.uga.edu/duplication/mcscan/</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/42559/sample-bandage-input-file-for-visual-analysis</guid>
	<pubDate>Wed, 06 Jan 2021 03:51:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/file/view/42559/sample-bandage-input-file-for-visual-analysis</link>
	<title><![CDATA[Sample Bandage input file for visual analysis]]></title>
	<description><![CDATA[<p>Sample Bandage input file for visual analysis ...</p>]]></description>
	<dc:creator>Jit</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/42559" length="112199" type="text/plain" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</guid>
	<pubDate>Wed, 15 Mar 2017 14:31:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</link>
	<title><![CDATA[Software and tools to detect structural variation with long reads]]></title>
	<description><![CDATA[<p>Uncovering the connection between genetics and heritable diseases requires an approach that looks at all the variant bases and types in a genome. While a PacBio&nbsp;<em>de novo</em>&nbsp;assembly resolves the most novel structural variants (SVs), 8-10X PacBio coverage of single genomes or trios reveals triple the SVs detectable by short-read data.</p><p>With&nbsp;<span style="text-decoration: underline;"><a href="http://www.pacb.com/smrt-science/">Single Molecule, Real-Time (SMRT) Sequencing</a></span>, you can access structural variations having a broad range of sizes, types, and GC content with the ability to:</p><ul>
<li>Uncover missing heritability linked to structural variation</li>
<li>Unambiguously identify genomic context and variant breakpoints at the sequence level to unravel the genetic etiology of disease</li>
<li>Resolve structural variation across the complete size spectrum with basepair resolution</li>
</ul><p>The following SV tools can help you achieve this goal.</p><p><strong>Sniffles:</strong>&nbsp;Structural variation caller using third-generation sequencing</p><p>Sniffles is a structural variation caller using third-generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs using evidence from split-read alignments, high-mismatch regions, and coverage analysis. Please note that the current version of Sniffles requires sorted output from BWA-MEM (use the -M and -x parameters) or NGM-LR with the optional SAM attributes enabled.&nbsp;</p><p>More at&nbsp;https://github.com/fritzsedlazeck/Sniffles</p><p><strong style="font-size: 12.8px;"><br />MultiBreak-SV:</strong> It identifies structural variants from next-generation paired-end data, third-generation long-read data, or data from a combination of sequencing platforms.</p><p>There are two pieces of software in this release: (1) a pre-processor that takes machine-format (.m5) BLASR files, and (2) MultiBreak-SV. For installation and usage instructions, see doc/MultiBreakSV-Manual.txt.</p><p>More at&nbsp;https://github.com/raphael-group/multibreak-sv</p><p><strong style="font-size: 12.8px;"><br />Parliament:</strong>&nbsp;A Structural Variation Tool. 
Why ask a single SV-detection approach to find every variant when you can have a parliament of tools deciding?</p><p>Publication about the algorithm and &ldquo;&hellip;the first long-read characterization of structural variation in a diploid human personal genome&hellip;&rdquo; (HS1011) -&nbsp;<a href="http://www.biomedcentral.com/1471-2164/16/286">&ldquo;Assessing structural variation in a personal genome&mdash;towards a human reference diploid genome&rdquo;</a></p><p>More at&nbsp;https://sourceforge.net/projects/parliamentsv/</p><p>https://www.dnanexus.com/papers/Parliament_Info_Sheet.pdf</p><p><br /><strong>PBHoney:</strong>&nbsp;the structural variation discovery tool&nbsp;<br /><br />PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.</p><p>Read the paper:&nbsp;<a href="http://www.biomedcentral.com/1471-2105/15/180/abstract" target="_blank">http://www.biomedcentral.com/1471-2105/15/180/abstract</a></p><p>More at&nbsp;https://sourceforge.net/projects/pb-jelly/</p><p><strong><br />SMRT-SV:</strong> Structural variant and indel caller for PacBio reads</p><p>Structural variant (SV) and indel caller for PacBio reads based on methods from&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>.</p><p>SMRT-SV provides an official software package for tools described in&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>&nbsp;and adds several key features, including the following:</p><ul>
<li>Unified variant calling user interface with built-in cluster compute support</li>
<li>Small indel calling (2-49 bp)</li>
<li>Improved inversion calling (<code>screenInversions</code>)</li>
<li>Quality metric for SV calls based on number of local assemblies supporting each call</li>
<li>Higher sensitivity for SV calls using tiled local assemblies across the entire genome instead of "signature" regions</li>
<li>Genotyping of SVs with Illumina paired-end reads from WGS samples</li>
</ul><p>More at&nbsp;https://github.com/EichlerLab/pacbio_variant_caller</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27090/canu-assembling-large-genomes-with-single-molecule-sequencing-and-locality-sensitive-hashing</guid>
	<pubDate>Tue, 26 Apr 2016 11:38:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27090/canu-assembling-large-genomes-with-single-molecule-sequencing-and-locality-sensitive-hashing</link>
	<title><![CDATA[CANU: Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing.]]></title>
	<description><![CDATA[<p>Canu is a fork of the&nbsp;<a href="http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page" title="Celera Assembler">Celera Assembler</a>&nbsp;designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION). The software is currently at alpha level; feel free to use it and report any issues you encounter.</p>
<p>Canu is a hierarchical assembly pipeline that runs in four steps:</p>
<ul>
<li>Detect overlaps in high-noise sequences using&nbsp;<a href="https://github.com/marbl/MHAP" title="MHAP">MHAP</a></li>
<li>Generate corrected sequence consensus</li>
<li>Trim corrected sequences</li>
<li>Assemble trimmed corrected sequences</li>
</ul>
<p>Read the&nbsp;<a href="http://canu.readthedocs.org/" title="docs">documentation</a></p>
<p>New releases: https://github.com/marbl/canu/releases</p><p>Address of the bookmark: <a href="https://github.com/marbl/canu" rel="nofollow">https://github.com/marbl/canu</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30625/pandaseq</guid>
	<pubDate>Mon, 23 Jan 2017 04:54:32 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30625/pandaseq</link>
	<title><![CDATA[PANDASEQ]]></title>
	<description><![CDATA[<p>PANDASEQ assembles paired-end Illumina reads into sequences, trying to correct for errors and uncalled bases. The assembler reads two files in FASTQ format with quality information. If amplification primers were used (e.g., to isolate a variable region of the 16S gene, or the constant regions around zinc finger binding residues), they can be removed from the sequence during assembly. The final sequence will correct any uncalled bases in the overlapping region using the complementary strand. When mismatches occur in the overlapping region, the base with the better quality score is chosen.<br>The algorithm is as follows:<br><br>1. Find the positions where the forward and reverse primers match best above the threshold and discard the ends of the sequence, including the primer.<br>2. Pick an overlap to maximise the probability of the forward and reverse reads having come from a single piece of DNA.<br>3. Identify the masking of the end of the read with the quality score B or # as done by CASAVA and adjust the probabilities in this region.<br>4. Construct an assembled sequence between the primers and calculate the quality.<br>5. Check for various constraints, including quality, length, uncalled bases, and user-supplied modules.</p>
<p>http://neufeldserver.uwaterloo.ca/~apmasell/pandaseq_man1.html</p><p>Address of the bookmark: <a href="http://neufeldserver.uwaterloo.ca/~apmasell/pandaseq_man1.html" rel="nofollow">http://neufeldserver.uwaterloo.ca/~apmasell/pandaseq_man1.html</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29992/spines</guid>
	<pubDate>Mon, 28 Nov 2016 05:33:26 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29992/spines</link>
	<title><![CDATA[Spines]]></title>
	<description><![CDATA[<p><a href="https://www.broadinstitute.org/ftp/distribution/software/spines/"><em>Spines</em></a>&nbsp;is a collection of software tools, developed and used by the Vertebrate Genome Biology Group at the Broad Institute. It provides basic data structures for efficient data manipulation (mostly genomic sequences, alignments, variation etc.), as well as specialized tool sets for various analyses. It also features three sequence alignment packages:&nbsp;<em>Satsuma,</em>&nbsp;a highly parallelized program for high-sensitivity, genome-wide synteny;&nbsp;<em>Papaya,</em>&nbsp;an all-purpose alignment tool for less diverged sequences; and&nbsp;<em>SLAP,</em>&nbsp;a context-sensitive local aligner for diverged sequences with large gaps.</p>
<p>Access&nbsp;<em>Spines</em>&nbsp;<a href="https://www.broadinstitute.org/ftp/distribution/software/spines/">here</a>.</p><p>Address of the bookmark: <a href="https://www.broadinstitute.org/genome-sequencing-and-analysis/spines" rel="nofollow">https://www.broadinstitute.org/genome-sequencing-and-analysis/spines</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>