<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/26426?offset=80</link>
	<atom:link href="https://bioinformaticsonline.com/related/26426?offset=80" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31064/cgaln</guid>
	<pubDate>Wed, 22 Feb 2017 05:14:15 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31064/cgaln</link>
	<title><![CDATA[Cgaln]]></title>
	<description><![CDATA[<p>Cgaln (Coarse grained alignment) is a program designed to align a pair of whole genomic sequences, from bacterial genomes up to entire vertebrate chromosomes, on a standard desktop computer. Cgaln performs an alignment job in two steps, at the block level and then at the nucleotide level. The former "coarse-grained" alignment can explore genomic rearrangements and reduce the regions to be analyzed in the next step. The latter is devoted to detailed alignment within the limited regions found in the first stage. The output of Cgaln is 'glocal' in the sense that rearrangements are taken into consideration while each alignable region is extended as long as possible. Thus, Cgaln is not only fast and memory-efficient, but can also filter noisy outputs without missing the most important homologous segment pairs.</p>
<p>http://www.iam.u-tokyo.ac.jp/chromosomeinformatics/rnakato/cgaln/</p><p>Address of the bookmark: <a href="http://www.iam.u-tokyo.ac.jp/chromosomeinformatics/rnakato/cgaln/" rel="nofollow">http://www.iam.u-tokyo.ac.jp/chromosomeinformatics/rnakato/cgaln/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30829/mercator</guid>
	<pubDate>Mon, 06 Feb 2017 04:20:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30829/mercator</link>
	<title><![CDATA[Mercator]]></title>
	<description><![CDATA[<p><span>Our basic strategy in building homology maps is to use exons that are orthologous in multiple genomes as map "anchors." Given K genomes, the steps in the map construction are as follows:</span></p>
<ul>
<li>For each genome, obtain a set of exon annotations. These annotations can be a combination of both exon predictions (e.g. Genscan) and annotations that have been experimentally verified (e.g. RefSeq). Ideally, we would like to have these annotations be as sensitive as possible. Specificity is not a concern, as incorrect annotations are not likely to have significant alignments with other gene annotations.</li>
<li>Compare all exons against all exons in other genomes and record significant alignments between exons. Currently, we use&nbsp;<a href="https://www.biostat.wisc.edu/~cdewey/mercator/#refBLAT">BLAT</a>&nbsp;to do this all-vs-all comparison with alignments being performed in protein space.</li>
<li>Construct a graph with each vertex corresponding to an exon and edges between vertices whose corresponding exons have significant alignments.</li>
<li>Identify cliques in this graph. These cliques are potential anchors to be used in the map.</li>
<li>Starting with the largest cliques (those that have exons in all or most of the genomes), join neighboring (adjacent in genomic coordinates, in each genome) cliques to form&nbsp;runs. Smaller cliques that are inconsistent with runs formed by larger cliques are filtered out. After the smallest cliques have been considered, cliques that are not part of a run are discarded.</li>
<li>The extents of each run in each genome are output as orthologous segments. The cliques from each run are used to output the exact genomic coordinates of anchors within each orthologous segment. These anchors can be used by genomic alignment programs (such as&nbsp;<a href="https://www.biostat.wisc.edu/~cdewey/mercator/#refMAVID">MAVID</a>) to do a detailed alignment of each orthologous segment.</li>
</ul>
<p>https://www.biostat.wisc.edu/~cdewey/mercator/</p><p>Address of the bookmark: <a href="https://www.biostat.wisc.edu/~cdewey/mercator/" rel="nofollow">https://www.biostat.wisc.edu/~cdewey/mercator/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31089/conpade-genome-assembly-ploidy-estimation-from-next-generation-sequencing-data</guid>
	<pubDate>Fri, 24 Feb 2017 04:55:41 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31089/conpade-genome-assembly-ploidy-estimation-from-next-generation-sequencing-data</link>
	<title><![CDATA[ConPADE: Genome Assembly Ploidy Estimation from Next-Generation Sequencing Data]]></title>
	<description><![CDATA[<p><span>ConPADE (Contig Ploidy and Allele Dosage Estimation) is a probabilistic method that estimates the ploidy of any given contig/scaffold based on its allele proportions. In the process, the authors report findings regarding errors in sequencing. The method can be used for whole genome shotgun (WGS) sequencing data. The authors also show the applicability of the method to variant calling and allele dosage estimation. Results for simulated and real datasets are discussed and provide evidence that ConPADE performs well as long as enough sequencing coverage is available, or the true contig ploidy is low.&nbsp;</span></p>
<p><span>https://github.com/microsoftgenomics</span></p><p>Address of the bookmark: <a href="https://github.com/microsoftgenomics" rel="nofollow">https://github.com/microsoftgenomics</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31139/pbsuite-software-for-long-read-sequencing-data-from-pacbio</guid>
	<pubDate>Mon, 27 Feb 2017 09:54:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31139/pbsuite-software-for-long-read-sequencing-data-from-pacbio</link>
	<title><![CDATA[PBSuite: Software for Long-Read Sequencing Data from PacBio]]></title>
	<description><![CDATA[<p><span>PBJelly - the genome upgrading tool.&nbsp;</span><br><span>PBHoney - the structural variation discovery tool&nbsp;</span><br><br><span>Both are contained within the PBSuite code found in downloads.</span><br><br><span>----- PBJelly -----</span><br><span>Read The Paper&nbsp;</span><br><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0047768" target="_blank">http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0047768</a><br><br><span>PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in fasta format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes.&nbsp;</span><br><br><span>----- PBHoney -----</span><br><span>Read The Paper</span><br><a href="http://www.biomedcentral.com/1471-2105/15/180/abstract" target="_blank">http://www.biomedcentral.com/1471-2105/15/180/abstract</a><br><br><span>PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/pb-jelly/" rel="nofollow">https://sourceforge.net/projects/pb-jelly/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31295/mycc-accurate-binning-of-metagenomic-contigs-via-automated-clustering-sequences-using-information-of-genomic-signatures-and-marker-genes</guid>
	<pubDate>Fri, 03 Mar 2017 08:34:23 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31295/mycc-accurate-binning-of-metagenomic-contigs-via-automated-clustering-sequences-using-information-of-genomic-signatures-and-marker-genes</link>
	<title><![CDATA[MyCC: Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes]]></title>
	<description><![CDATA[<p><span>MyCC is an automated binning tool that combines genomic signatures, marker genes and optional contig coverages within one or multiple samples, in order to visualize the metagenomes and to identify the reconstructed genomic fragments.</span></p>
<p><span>More at&nbsp;http://www.nature.com/articles/srep24175</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/sb2nhri/files/MyCC/" rel="nofollow">https://sourceforge.net/projects/sb2nhri/files/MyCC/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31345/prokka-tool-for-the-rapid-annotation-of-prokaryotic-genomes</guid>
	<pubDate>Mon, 06 Mar 2017 03:49:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31345/prokka-tool-for-the-rapid-annotation-of-prokaryotic-genomes</link>
	<title><![CDATA[Prokka: tool for the rapid annotation of prokaryotic genomes]]></title>
	<description><![CDATA[<p>Prokka is a software tool for the rapid annotation of prokaryotic genomes. A typical 4 Mbp genome can be fully annotated in less than 10 minutes on a quad-core computer, and Prokka scales well to 32-core SMP systems. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and, ultimately, for submission to GenBank/DDBJ/ENA.</p>
<p>Address of the bookmark: <a href="http://www.vicbioinformatics.com/software.prokka.shtml" rel="nofollow">http://www.vicbioinformatics.com/software.prokka.shtml</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31375/cocacola-binning-metagenomic-contigs-using-sequence-composition-read-coverage-co-alignment-and-paired-end-read-linkage</guid>
	<pubDate>Tue, 07 Mar 2017 08:50:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31375/cocacola-binning-metagenomic-contigs-using-sequence-composition-read-coverage-co-alignment-and-paired-end-read-linkage</link>
	<title><![CDATA[COCACOLA (binning metagenomic contigs using sequence COmposition, read CoverAge, CO-alignment, and paired-end read LinkAge)]]></title>
	<description><![CDATA[<p>COCACOLA is a general framework that combines different types of information (sequence COmposition, CoverAge across multiple samples, CO-alignment to reference genomes and paired-end read LinkAge) to automatically bin contigs into OTUs. Furthermore, COCACOLA can incorporate customized prior knowledge to improve binning accuracy.</p>
<p>News: A Python version of COCACOLA is now available!</p><p>Address of the bookmark: <a href="https://github.com/younglululu/COCACOLA" rel="nofollow">https://github.com/younglululu/COCACOLA</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</guid>
	<pubDate>Wed, 15 Mar 2017 14:31:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</link>
	<title><![CDATA[Software and Tools to Detect Structural Variation with Long Reads!!]]></title>
	<description><![CDATA[<p>Uncovering the connection between genetics and heritable diseases requires an approach that looks at all the variant bases and types in a genome. While a PacBio&nbsp;<em>de novo</em>&nbsp;assembly resolves the most novel structural variants (SVs), 8-10X PacBio coverage of single genomes or trios reveals triple the SVs detectable by short-read data.</p><p>With&nbsp;<span style="text-decoration: underline;"><a href="http://www.pacb.com/smrt-science/">Single Molecule, Real-Time (SMRT) Sequencing</a></span>, you can access structural variations having a broad range of sizes, types, and GC content with the ability to:</p><ul>
<li>Uncover missing heritability linked to structural variation</li>
<li>Unambiguously identify genomic context and variant breakpoints at the sequence level to unravel the genetic etiology of disease</li>
<li>Resolve structural variation across the complete size spectrum with basepair resolution</li>
</ul><p>The following SV tools can help you achieve these goals.</p><p><strong>Sniffles:</strong>&nbsp;Structural variation caller using third-generation sequencing</p><p>Sniffles is a structural variation caller using third-generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs using evidence from split-read alignments, high-mismatch regions, and coverage analysis. Please note that the current version of Sniffles requires sorted output from BWA-MEM (use the -M and -x parameters) or NGM-LR with the optional SAM attributes enabled!&nbsp;</p><p>More at&nbsp;https://github.com/fritzsedlazeck/Sniffles</p><p><strong style="font-size: 12.8px;"><br />MultiBreak-SV:</strong> It identifies structural variants from next-generation paired-end data, third-generation long-read data, or data from a combination of sequencing platforms.</p><p>There are two pieces of software in this release: (1) a pre-processor that takes machine-format (.m5) BLASR files, and (2) MultiBreak-SV. For installation and usage instructions, see doc/MultiBreakSV-Manual.txt.</p><p>More at&nbsp;https://github.com/raphael-group/multibreak-sv</p><p><strong style="font-size: 12.8px;"><br />Parliament:</strong>&nbsp;A Structural Variation Tool. 
Why ask a single sv-detection approach to find every variant when you can have a parliament of tools deciding?</p><p>Publication about the algorithm and &ldquo;&hellip;the first long-read characterization of structural variation in a diploid human personal genome&hellip;&rdquo; (HS1011) -&nbsp;<a href="http://www.biomedcentral.com/1471-2164/16/286">&ldquo;Assessing structural variation in a personal genome&mdash;towards a human reference diploid genome&rdquo;</a></p><p>More at&nbsp;https://sourceforge.net/projects/parliamentsv/</p><p>https://www.dnanexus.com/papers/Parliament_Info_Sheet.pdf</p><p><br /><strong>PBHoney:</strong>&nbsp;the structural variation discovery tool&nbsp;<br /><br />PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.</p><p>Read The Paper&nbsp;<a href="http://www.biomedcentral.com/1471-2105/15/180/abstract" target="_blank">http://www.biomedcentral.com/1471-2105/15/180/abstract</a></p><p>More at&nbsp;https://sourceforge.net/projects/pb-jelly/</p><p><strong><br />SMRT-SV:</strong> Structural variant and indel caller for PacBio reads</p><p>Structural variant (SV) and indel caller for PacBio reads based on methods from&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>.</p><p>SMRT-SV provides an official software package for tools described in&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>&nbsp;and adds several key features including the following.</p><ul>
<li>Unified variant calling user interface with built-in cluster compute support</li>
<li>Small indel calling (2-49 bp)</li>
<li>Improved inversion calling (<code>screenInversions</code>)</li>
<li>Quality metric for SV calls based on number of local assemblies supporting each call</li>
<li>Higher sensitivity for SV calls using tiled local assemblies across the entire genome instead of "signature" regions</li>
<li>Genotyping of SVs with Illumina paired-end reads from WGS samples</li>
</ul><p>More at&nbsp;https://github.com/EichlerLab/pacbio_variant_caller</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32154/decostar-detection-of-co-evolution</guid>
	<pubDate>Fri, 14 Apr 2017 06:27:25 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32154/decostar-detection-of-co-evolution</link>
	<title><![CDATA[DeCoSTAR - Detection of Co-evolution]]></title>
	<description><![CDATA[<p><span>DeCoSTAR is software that aims to reconstruct ancestral gene or genome organizations, in the form of sets of neighborhood relations (adjacencies) between pairs of ancestral genes or gene domains.</span><br><span>Ancestral genes or domains are deduced from reconciled gene trees in a context of birth, speciation, duplication, loss and transfer, which are either given as input or computed with the&nbsp;</span><a href="http://mbb.univ-montp2.fr/MBB/download_sources/16__TERA">ecceTERA package</a><span>, into which DeCoSTAR is integrated. DeCoSTAR constructs parsimonious scenarios of gains and breakages of adjacencies, and contains in particular all the features of the previous software DeCo, DeCoLT, ArtDeCo and DeClone. It provides statistical support for ancestral adjacencies and can handle badly assembled genomes.&nbsp;</span><br><span>DeCoSTAR is able to reconstruct the histories of domains inside genes, including gene fusion and fission events, as well as ancestral genome structures for dozens of whole genomes from all kingdoms of life in a few minutes.</span></p><p>Address of the bookmark: <a href="http://pbil.univ-lyon1.fr/software/DeCoSTAR/" rel="nofollow">http://pbil.univ-lyon1.fr/software/DeCoSTAR/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32399/mapping-ngs</guid>
	<pubDate>Tue, 02 May 2017 07:58:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32399/mapping-ngs</link>
	<title><![CDATA[Mapping NGS]]></title>
	<description><![CDATA[<p>NGS data are just a bunch of sequences: you have no idea which region of the genome each sequence comes from or which gene it represents.<br>To find out, you have to align the sequences to a reference sequence. In most cases the reference is the full genome sequence, but sometimes a library of EST sequences is used.<br>Either way, aligning your sequence reads to the reference sequence is called mapping.</p>
<p>The most widely used mappers are&nbsp;<a href="http://bio-bwa.sourceforge.net/" target="_blank">BWA</a>&nbsp;and&nbsp;<a href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml" target="_blank">Bowtie</a>&nbsp;for DNA-Seq data and&nbsp;<a href="http://tophat.cbcb.umd.edu/" target="_blank">Tophat</a>,&nbsp;<a href="https://github.com/alexdobin/STAR" target="_blank">STAR</a>&nbsp;or&nbsp;<a href="http://www.ccb.jhu.edu/software/hisat/index.shtml" target="_blank">HISAT</a>&nbsp;for RNA-Seq data. Mappers differ in the options they accept and in how fast and how accurate they are. Bowtie is faster than BWA, but loses some sensitivity (it does not map as many reads to the correct position in the genome).</p><p>Address of the bookmark: <a href="http://wiki.bits.vib.be/index.php/Mapping_of_NGS_data" rel="nofollow">http://wiki.bits.vib.be/index.php/Mapping_of_NGS_data</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>