<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/19560?offset=410</link>
	<atom:link href="https://bioinformaticsonline.com/related/19560?offset=410" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29272/decipher</guid>
	<pubDate>Fri, 30 Sep 2016 09:33:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29272/decipher</link>
	<title><![CDATA[DECIPHER]]></title>
	<description><![CDATA[<p>DECIPHER is a software toolset that can be used to maintain, analyze, and decipher large amounts of DNA sequence data. To install DECIPHER, see the <a href="http://DECIPHER.cee.wisc.edu/Download.html">Downloads</a> page.<br><br> To begin using DECIPHER, read the "Getting Started DECIPHERing" tutorial. Refer to the PDF documents below for instructions on how to use DECIPHER for various tasks.</p><p>Address of the bookmark: <a href="http://decipher.cee.wisc.edu/Documentation.html" rel="nofollow">http://decipher.cee.wisc.edu/Documentation.html</a></p>]]></description>
	<dc:creator>Anjana</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29284/genebreak-a-tool-to-systematically-identify-genes-recurrently-affected-by-the-genomic-location-of-chromosomal-cna-associated-breaks-by-a-genome-wide-approach</guid>
	<pubDate>Sat, 01 Oct 2016 15:15:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29284/genebreak-a-tool-to-systematically-identify-genes-recurrently-affected-by-the-genomic-location-of-chromosomal-cna-associated-breaks-by-a-genome-wide-approach</link>
	<title><![CDATA[GeneBreak: a tool to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach]]></title>
	<description><![CDATA[<p>Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. ‘GeneBreak’ was developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, ‘GeneBreak’ collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated, with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, ‘GeneBreak’, is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).</p>
<p> </p><p>Address of the bookmark: <a href="http://www.bioconductor.org/packages/release/bioc/html/GeneBreak.html" rel="nofollow">http://www.bioconductor.org/packages/release/bioc/html/GeneBreak.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</guid>
	<pubDate>Mon, 10 Oct 2016 08:56:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</link>
	<title><![CDATA[PHYMMBL]]></title>
	<description><![CDATA[<p><span>Metagenomics sequencing projects collect samples of DNA from uncharacterized environments that may contain hundreds or even thousands of species. One of the main challenges in analyzing a metagenome is phylogenetic classification of raw sequence reads into groups representing the same or similar species. Such classification is a useful prerequisite for genome assembly and for analysis of the biological diversity present in a sample. The newest sequencing technologies have simultaneously made metagenomics easier, by making the sequencing process faster, and more difficult, by producing shorter read lengths than previous technologies. Methods for classifying sequences as short as 100 base pairs (bp) have until now been relatively inaccurate, requiring metagenomics projects to use older, long-read technologies.&nbsp;</span><strong>Phymm</strong><span>, a new classification approach for metagenomics data which uses interpolated Markov models (IMMs) to taxonomically classify DNA sequences, can accurately classify reads as short as 100 bp. Its accuracy for short reads represents a significant leap forward over previous composition-based classification methods.&nbsp;</span><strong>PhymmBL</strong><span>&nbsp;(rhymes with "thimble"), the hybrid classifier included in this distribution which combines analysis from both Phymm and&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/BLAST">BLAST</a><span>, produces even higher accuracy.</span></p><p>Address of the bookmark: <a href="http://www.cbcb.umd.edu/software/phymm/" rel="nofollow">http://www.cbcb.umd.edu/software/phymm/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29586/eforgev12</guid>
	<pubDate>Fri, 28 Oct 2016 09:06:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29586/eforgev12</link>
	<title><![CDATA[eFORGE.v1.2]]></title>
	<description><![CDATA[<p><span>The eFORGE tool provides a method to view the tissue-specific regulatory component of a set of EWAS DMPs. eFORGE analysis takes a set of DMPs, such as the hits above the genome-wide significance threshold in an EWAS, and analyses whether there is enrichment for overlap with putative functional elements compared to matched background DMPs. It assesses enrichment on a per-cell-type basis, since functional elements are differentially active in different cell types, and hence can expose tissue-specific signals of enrichment for the given test DMP set. This can reveal the sites of action underlying the EWAS signal, and provide confirmation of the validity of the EWAS where a tissue-specific mechanism is known or expected for the phenotype. Conversely, unknown tissue involvements can also be revealed.</span></p><p>Address of the bookmark: <a href="http://eforge.cs.ucl.ac.uk/eFORGE.v1.2/?documentation" rel="nofollow">http://eforge.cs.ucl.ac.uk/eFORGE.v1.2/?documentation</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</guid>
	<pubDate>Tue, 22 Nov 2016 04:51:39 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</link>
	<title><![CDATA[Maq: Mapping and Assembly with Quality]]></title>
	<description><![CDATA[<p><strong>Maq</strong>&nbsp;stands for&nbsp;<em>Mapping and Assembly with Quality</em>. It builds an assembly by mapping short reads to reference sequences. Maq is a project hosted by&nbsp;<a href="http://sourceforge.net/">SourceForge.net</a>. The project page is available at <a href="http://sourceforge.net/projects/maq/">http://sourceforge.net/projects/maq/</a>. Maq was previously known as mapass2.</p>
<h2>Run Maq Now</h2>
<p>Follow these steps to try Maq. All you need is a reference sequence file in the FASTA format.</p>
<ol>
<li>Prepare a reference sequence (ref.fasta), preferably a bacterial genome.</li>
<li>Download maq, maq-data and maqview at the&nbsp;<a href="http://sourceforge.net/project/showfiles.php?group_id=191815">download page</a>.</li>
<li>Copy maq, maq.pl and maq_eval.pl to the $PATH or to the same directory.</li>
<li>Simulate diploid reference and read sequences, map reads, call variants and evaluate the results in one go:
<pre>maq.pl demo ref.fasta calib-30.dat
</pre>
where&nbsp;<em>calib-30.dat</em>&nbsp;is contained in maq-data.</li>
<li>View the alignment:
<pre>cd maqdemo/easyrun;
maqindex -i -c consensus.cns all.map;
maqview -c consensus.cns all.map</pre>
</li>
</ol>
<p><strong>Even for advanced maq users, running `maq.pl demo' is recommended. You may find something helpful.</strong></p><p>Address of the bookmark: <a href="http://maq.sourceforge.net" rel="nofollow">http://maq.sourceforge.net</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30090/standardized-velvet-assembly-report</guid>
	<pubDate>Fri, 09 Dec 2016 03:59:59 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30090/standardized-velvet-assembly-report</link>
	<title><![CDATA[Standardized velvet assembly report]]></title>
	<description><![CDATA[<p>Requirements:</p>
<ul>
<li>velvet (velveth velvetg should be in your PATH)</li>
<li>R (with Sweave)</li>
<li>pdflatex (usually part of TeTeX)</li>
<li>ggplot2 (from the R prompt, type install.packages(c("ggplot2","proto","xtable")))</li>
<li>Perl</li>
</ul>
<p>Optional:</p>
<ul>
<li>BLAT or BLAST (to generate alignments against a reference genome). If using BLAT, add faToTwoBit,gfClient,gfServer to your PATH. If using BLAST, add blastall and formatdb.</li>
</ul>
<p>Edit permute.sh to your liking, paying particular attention to the kmer, cvCut, expCov, and other flags.</p>
<p>To Run:</p>
<ol>
<li><code>perl fastaAllSize mysequences.fa &gt; mysequences.stat</code> or <code>gunzip -c mysequences.fa.gz | fastaAllSize &gt; mysequences.stat</code>&nbsp;Substitute fastqAllSize for fastq files.</li>
<li><code>./permute.sh mysequences</code>&nbsp;(leave out the .fa)</li>
</ol>
<p>https://github.com/leipzig/standardized-velvet-assembly-report</p><p>Address of the bookmark: <a href="https://github.com/leipzig/standardized-velvet-assembly-report" rel="nofollow">https://github.com/leipzig/standardized-velvet-assembly-report</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30124/understanding-greedy-algorithms</guid>
	<pubDate>Mon, 12 Dec 2016 04:37:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30124/understanding-greedy-algorithms</link>
	<title><![CDATA[Understanding Greedy Algorithms]]></title>
	<description><![CDATA[<p>Learning greedy algorithms for biologists.&nbsp;</p>
<p>https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/</p>
<p>These pages are also useful on the same topic:</p>
<p>http://learninglover.com/examples.php?id=59</p>
<p>http://www.cs.rpi.edu/~magdon/ps/conference/super_biokdd.pdf</p>
<p>https://ocw.mit.edu/courses/biology/7-91j-foundations-of-computational-and-systems-biology-spring-2014/lecture-slides/MIT7_91JS14_Lecture6.pdf</p>
<p>http://schatzlab.cshl.edu/teaching/AssemblyClass/01.%20Assembly%20Intro.pdf</p>
<p>http://lsl.sinica.edu.tw/Services/Class/files/20150612449.pdf</p>
<p>http://www.cs.jhu.edu/~langmea/resources/lecture_notes/assembly_scs.pdf</p>
<p>https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-43.pdf</p><p>Address of the bookmark: <a href="https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/" rel="nofollow">https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30149/mypro-a-seamless-pipeline-for-automated-prokaryotic-genome-assembly-and-annotation</guid>
	<pubDate>Thu, 15 Dec 2016 05:47:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30149/mypro-a-seamless-pipeline-for-automated-prokaryotic-genome-assembly-and-annotation</link>
	<title><![CDATA[MyPro: A seamless pipeline for automated prokaryotic genome assembly and annotation]]></title>
	<description><![CDATA[<p>MyPro is an improved genomics software pipeline for prokaryotic genomes. MyPro is user-friendly and requires minimal programming skills. High-quality prokaryotic genome assembly and annotation can be obtained with ease. It performed better than de novo assemblers and contig integration software, producing more contiguous assemblies with higher N50 values and fewer contigs.</p>
<p>More at https://sourceforge.net/projects/sb2nhri/files/MyPro/</p><p>Address of the bookmark: <a href="http://www.sciencedirect.com/science/article/pii/S0167701215001207" rel="nofollow">http://www.sciencedirect.com/science/article/pii/S0167701215001207</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30212/pear</guid>
	<pubDate>Mon, 19 Dec 2016 09:28:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30212/pear</link>
	<title><![CDATA[PEAR]]></title>
	<description><![CDATA[<p><strong>PEAR</strong>&nbsp;is an ultrafast, memory-efficient and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory.</p>
<p>PEAR evaluates all possible paired-end read overlaps without requiring the target fragment size as input. In addition, it implements a statistical test for minimizing false-positive results. Together with a highly optimized implementation, it can merge millions of paired-end reads within a couple of minutes on a standard desktop computer.</p><p>Address of the bookmark: <a href="http://sco.h-its.org/exelixis/web/software/pear/doc.html" rel="nofollow">http://sco.h-its.org/exelixis/web/software/pear/doc.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</guid>
	<pubDate>Mon, 19 Dec 2016 14:20:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</link>
	<title><![CDATA[pyScaf]]></title>
	<description><![CDATA[<p>pyScaf orders contigs from genome assemblies utilising several types of information:</p>
<ul>
<li>paired-end (PE) and/or mate-pair libraries (<a href="https://github.com/lpryszcz/pyScaf#ngs-based-scaffolding">NGS-based mode</a>)</li>
<li>long reads (<a href="https://github.com/lpryszcz/pyScaf#scaffolding-based-on-long-reads">long read-based mode</a>)</li>
<li>synteny to the genome of some related species (<a href="https://github.com/lpryszcz/pyScaf#reference-based-scaffolding">reference-based mode</a>)</li>
</ul>
<p>Scaffolding&nbsp;</p>
<p>In reference-based mode, pyScaf uses synteny to the genome of a closely related species to order contigs and estimate distances between adjacent contigs.</p>
<p>Contigs are aligned globally (end-to-end) onto reference chromosomes, ignoring:</p>
<ul>
<li>matches not satisfying cut-offs (<code>--identity</code>&nbsp;and&nbsp;<code>--overlap</code>)</li>
<li>suboptimal matches (only the best match of each query to the reference is kept)</li>
<li>matches that overlap on the reference.</li>
</ul>
<p>In preliminary tests, pyScaf performed superbly on simulated heterozygous genomes based on&nbsp;<em>C. parapsilosis</em>&nbsp;(13 Mb; CANPA) and&nbsp;<em>A. thaliana</em>&nbsp;(119 Mb; ARATH) chromosomes, correctly reconstructing all chromosomes in all cases for CANPA and in nearly all cases for ARATH (<a href="https://www.dropbox.com/sh/bb7lwggo40xrwtc/AAAZ7pByVQQQ-WhUXZVeJaZVa/pyScaf?dl=0">Figures in dropbox</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=2036953672">CANPA table</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=1920757821">ARATH table</a>).<br>Runs took ~0.5 min for CANPA on&nbsp;<code>4 CPUs</code>&nbsp;and ~2 min for ARATH on&nbsp;<code>16 CPUs</code>.</p>
<p><span>Important remarks:</span></p>
<ul>
<li>Reduce your assembly beforehand (fasta2homozygous.py), as any redundancy will likely break the synteny.</li>
<li>pyScaf works better with contigs than scaffolds, as scaffolds are often affected by mis-assemblies (no&nbsp;<em>de novo assembler</em>&nbsp;/ scaffolder is perfect...), which breaks synteny.</li>
<li>pyScaf works very well if divergence between reference genome and assembled contigs is below 20% at nucleotide level.</li>
<li>pyScaf deals with large rearrangements, i.e. deletions, insertions, inversions, translocations.&nbsp;<span>Note, however, that this is an experimental implementation!</span></li>
<li>Consider closing gaps after scaffolding.</li>
</ul><p>Address of the bookmark: <a href="https://github.com/lpryszcz/pyScaf" rel="nofollow">https://github.com/lpryszcz/pyScaf</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>

</channel>
</rss>