<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/27090?offset=180</link>
	<atom:link href="https://bioinformaticsonline.com/related/27090?offset=180" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27080/mrfast-micro-read-fast-alignment-search-tool</guid>
	<pubDate>Tue, 26 Apr 2016 03:50:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27080/mrfast-micro-read-fast-alignment-search-tool</link>
	<title><![CDATA[mrFAST:  Micro Read Fast Alignment Search Tool]]></title>
	<description><![CDATA[<p><span>mrFAST is a read mapper that is designed to map short reads to a reference genome with a special emphasis on the discovery of structural variation and segmental duplications. mrFAST maps short reads with respect to a user-defined error threshold, including indels up to 4+4 bp. This manual describes how to choose the parameters and tune mrFAST with respect to the library settings. mrFAST is designed to find&nbsp;</span><strong><span style="text-decoration: underline;">'all'</span></strong><span>&nbsp; mappings for a given set of reads; however, it can return one "best" map location if the relevant parameter is invoked.</span></p>
<p><span>More at&nbsp;http://mrfast.sourceforge.net/manual.html</span></p><p>Address of the bookmark: <a href="http://mrfast.sourceforge.net/manual.html" rel="nofollow">http://mrfast.sourceforge.net/manual.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27331/andi</guid>
	<pubDate>Fri, 13 May 2016 05:16:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27331/andi</link>
	<title><![CDATA[Andi]]></title>
	<description><![CDATA[<p>This is the <code>andi</code> program for estimating the evolutionary distance between closely related genomes. These distances can be used to rapidly infer phylogenies for big sets of genomes. Because <code>andi</code> does not compute full alignments, it is so efficient that it scales even up to thousands of bacterial genomes.</p>
<p>This readme covers all necessary instructions for the impatient to get <code>andi</code> up and running. For extensive instructions please consult the <a href="https://github.com/EvolBioInf/andi/blob/master/andi-manual.pdf">manual</a>.</p>
<p>More at https://github.com/evolbioinf/andi/</p><p>Address of the bookmark: <a href="http://bioinformatics.oxfordjournals.org/content/early/2015/01/13/bioinformatics.btu815.full" rel="nofollow">http://bioinformatics.oxfordjournals.org/content/early/2015/01/13/bioinformatics.btu815.full</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/27799/bbmapbbtools-package-multipurpose-tool-designed-for-converting-reads-or-other-nucleotide-data-between-different-formats</guid>
	<pubDate>Mon, 13 Jun 2016 05:47:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/27799/bbmapbbtools-package-multipurpose-tool-designed-for-converting-reads-or-other-nucleotide-data-between-different-formats</link>
	<title><![CDATA[BBMap/BBTools package: Multipurpose tool designed for converting reads or other nucleotide data between different formats.]]></title>
	<description><![CDATA[<div id="post_message_148585"><a href="https://sourceforge.net/projects/bbmap/" target="_blank">Reformat</a> is a member of the <a href="https://sourceforge.net/projects/bbmap/" target="_blank">BBMap/BBTools package</a>. It is a multipurpose tool designed for converting reads or other nucleotide data between different formats. It supports, and can inter-convert:<br /> <br /> fastq<br /> fasta<br /> fasta+qual<br /> sam<br /> scarf (an old Illumina format)<br /> bam (if samtools is installed)<br /> gzip<br /> zip<br /> ascii-33 (sanger)<br /> ascii-64 (old Illumina)<br /> paired files<br /> interleaved files<br /> <br /> It is multithreaded and can process data at over 500 megabytes per second, and can accept streams from standard in and write to standard out, allowing it to be easily dropped into the middle of a pipeline for format conversion. Reformat autodetects formats based on file extensions and content, making it very easy to use; and the autodetection can be overridden, allowing flexibility for people who don't like to follow naming conventions, or out-of-spec fastq files with quality values like -17 or 120.<br /> <br /> The program has been gradually expanded, and can now perform various other functions. None of these will break pairing, if the input is paired.<br /> <br /> Quality trimming (either or both ends)<br /> Quality filtering<br /> Fixed-length trimming<br /> Generation of histograms (base composition, quality, etc)<br /> Subsampling (to a fraction of input reads, or an exact number of reads or bases)<br /> Changing fasta line-wrapping length<br /> Reverse-complementing (all reads or only read 2)<br /> Adding /1 and /2 suffix to read names<br /> GC-content filtering<br /> Length-filtering<br /> Testing for corrupted interleaved files<br /> <br /> Reformat is compatible with any platform that supports Java 1.7 or higher. It also has a bash shellscript for simpler invocation. 
Typical usage examples:<br /> <br /> Reformat fastq into fasta:<br /> <strong>reformat.sh in=x.fq out=y.fa</strong><br /> <br /> Interleave paired reads:<br /> <strong>reformat.sh in1=x1.fq in2=x2.fq out=y.fq</strong><br /> <br /> Note - you can actually use a shortcut if paired read files have the same name with a 1 and a 2. This is equivalent to the above command:<br /> <strong>reformat.sh in=x#.fq out=y.fq</strong><br /> <br /> De-interleave reads:<br /> <strong>reformat.sh in=x.fq out1=y1.fq out2=y2.fq</strong><br /> <br /> Verify that interleaving appears correct, assuming Illumina naming conventions:<br /> <strong>reformat.sh in=x.fq vint</strong><br /> <br /> Convert ASCII-33 to ASCII-64:<br /> <strong>reformat.sh in=x.fq out=y.fq qin=33 qout=64</strong><br /> <br /> Quality-trim paired reads to Q10 on the left and right ends and discard reads shorter than 50bp after trimming:<br /> <strong>reformat.sh in1=x1.fq in2=x2.fq out1=y1.fq out2=y2.fq outsingle=singletons.fq qtrim=rl trimq=10 minlength=50</strong><br /> <br /> Subsample 10% of the first 20000 pairs in an interleaved file:<br /> <strong>reformat.sh in=x.fq out=y.fq reads=20000 samplerate=0.1 int=t</strong><br /> (in this case "int=t" overrides interleaving autodetection, to ensure reads are treated as pairs)<br /> <br /> Pipe in a gzipped sam file and pipe out fasta:<br /> <strong>reformat.sh in=stdin.sam.gz out=stdout.fa</strong><br /> <br /> Reverse-complement reads:<br /> <strong>reformat.sh in=x.fq out=y.fq rcomp</strong><br /> <br /> For reformatting a file with very long sequences, Reformat will need more memory; just add the additional flag "-Xmx2g". For example, to change the line-wrapping length on the human genome (which has individual sequences over 200Mbp long) to 70 characters:<br /> <strong>reformat.sh -Xmx2g in=HG19.fa.gz out=HG19_wrapped.fa.gz fastawrap=70</strong><br /> <br /> For additional functions, please run the shellscript with no arguments, or just read it with a text editor. 
If you have any questions, please post them in this thread.<br /> <br /> For people using a non-bash terminal, you may need to type "bash reformat.sh" instead of just "reformat.sh".<br /> For users of Windows or other platforms that do not support bash shellscripts, replace "reformat.sh" with "java -ea -Xmx200m -cp /path/to/bbmap/current/ jgi.ReformatReads"<br /> for example,<br /> <strong>java -ea -Xmx200m -cp C:\bbmap\current\ jgi.ReformatReads in=x.fq out=y.fa</strong><br /> <br /> Reformat can be downloaded with BBTools here:<br /> <a href="https://sourceforge.net/projects/bbmap/" target="_blank">https://sourceforge.net/projects/bbmap/</a></div>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27967/linux-command-line-exercises-for-ngs-data-processing</guid>
	<pubDate>Wed, 22 Jun 2016 07:59:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27967/linux-command-line-exercises-for-ngs-data-processing</link>
	<title><![CDATA[Linux command line exercises for NGS data processing]]></title>
	<description><![CDATA[<p>The purpose of this tutorial is to introduce students to the frequently used tools for NGS analysis as well as giving experience in writing one-liners. Copy the required files to your current directory, change directory (<code>cd</code>) to the <code>linuxTutorial</code> folder, and do all the processing inside:</p>
<pre><span>[uzi@quince-srv2 ~/]$</span> cp -r /home/opt/MScBioinformatics/linuxTutorial .
<span>[uzi@quince-srv2 ~/]$</span> cd linuxTutorial
<span>[uzi@quince-srv2 ~/linuxTutorial]$</span>
</pre>
<p>I have deliberately chosen <code>Awk</code> in the exercises as it is a language in itself and is used more often to manipulate NGS data than other command-line tools such as <code>grep</code>, <code>sed</code>, <code>perl</code> etc. Furthermore, having a command of <code>awk</code> will make it easier to understand advanced tutorials such as <a href="http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/Illumina_workflow.html">Illumina Amplicons Processing Workflow</a>. <br><br> In <code>Linux</code>, we use a shell, a program that takes your commands from the keyboard and gives them to the operating system. Most Linux systems utilize Bourne Again SHell (<code>bash</code>), but there are several additional shell programs on a typical Linux system such as <code>ksh</code>, <code>tcsh</code>, and <code>zsh</code>. To see which shell you are using, type</p>
<pre><span>[uzi@quince-srv2 ~/linuxTutorial]$</span> echo $SHELL

<span>/bin/bash
</span></pre><p>Address of the bookmark: <a href="http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/linux.html" rel="nofollow">http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/linux.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29103/genome-strip</guid>
	<pubDate>Tue, 06 Sep 2016 03:58:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29103/genome-strip</link>
	<title><![CDATA[Genome STRiP]]></title>
	<description><![CDATA[<p><strong>Genome STRiP</strong><span>&nbsp;(Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.</span><br><br><span>Genome STRiP looks both across and within a set of sequenced genomes to detect variation. The methods are adaptive and support heterogeneous data sets, including variations in sequencing depth, read lengths and mixtures of paired and single-end reads. A minimum of 20 to 30 genomes is required to get acceptable results, but the method gains power across genomes, and processing more genomes provides better results.</span><br><br><span>To run discovery or genotyping on a single sequenced genome or a small set of genomes, you need to call your data against a background population, such as a set of genomes from the 1000 Genomes Project.&nbsp; The background population does not need to be matched to the target individuals.</span></p><p>Address of the bookmark: <a href="http://software.broadinstitute.org/software/genomestrip/" rel="nofollow">http://software.broadinstitute.org/software/genomestrip/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</guid>
	<pubDate>Tue, 14 Jan 2020 06:47:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</link>
	<title><![CDATA[Shasta long read assembler]]></title>
	<description><![CDATA[<p>The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by&nbsp;<a href="https://nanoporetech.com/">Oxford Nanopore</a>&nbsp;flow cells.</p>
<p>Computational methods used by the Shasta assembler include:</p>
<ul>
<li>Using a&nbsp;<a href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length</a>&nbsp;representation of the read sequence. This makes the assembly process more resilient to errors in homopolymer repeat counts, which are the most common type of errors in Oxford Nanopore reads.</li>
<li>Using in some phases of the computation a representation of the read sequence based on&nbsp;<em>markers</em>, a fixed subset of short k-mers (k &asymp; 10).</li>
</ul>
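<p>The run-length idea in the first bullet can be illustrated with a minimal Python sketch. This is not Shasta's actual implementation; the function names <code>run_length_encode</code> and <code>run_length_decode</code> and the example reads are made up for illustration. Each homopolymer run is collapsed to a single base, and the repeat counts are kept separately.</p>

```python
from itertools import groupby


def run_length_encode(read):
    """Collapse each homopolymer run into one base plus a repeat count."""
    bases = []
    counts = []
    for base, run in groupby(read):
        bases.append(base)
        counts.append(len(list(run)))
    return "".join(bases), counts


def run_length_decode(bases, counts):
    """Rebuild the original read from the collapsed bases and counts."""
    return "".join(base * count for base, count in zip(bases, counts))


# Hypothetical reads that differ only in the length of one homopolymer run.
# They collapse to the same base string; only the repeat counts differ.
bases_a, counts_a = run_length_encode("GATTTTACA")
bases_b, counts_b = run_length_encode("GATTTACA")
```

<p>Because both example reads collapse to the same base string, a comparison done on the collapsed sequence is unaffected by an error in the repeat count, which is the resilience the bullet describes.</p>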
<p>More at&nbsp;<a href="https://chanzuckerberg.github.io/shasta/index.html">https://chanzuckerberg.github.io/shasta/index.html</a></p><p>Address of the bookmark: <a href="https://github.com/chanzuckerberg/shasta" rel="nofollow">https://github.com/chanzuckerberg/shasta</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28855/vcfr</guid>
	<pubDate>Fri, 19 Aug 2016 07:38:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28855/vcfr</link>
	<title><![CDATA[vcfR]]></title>
	<description><![CDATA[<p><span>Most variant calling pipelines result in files containing large quantities of variant information. The&nbsp;</span><a href="http://samtools.github.io/hts-specs/" title="VCF format at hts-specs">variant call format (vcf)</a><span>&nbsp;is an increasingly popular format for this data. The format of these files and their content is discussed in the vignette &lsquo;vcf data.&rsquo; These files are typically intended to be post-processed (i.e., filtered) as an attempt to remove false positives or otherwise problematic sites. The R package vcfR provides tools to facilitate this filtering as well as to visualize the effects of choices made during this process.</span></p><p>Address of the bookmark: <a href="https://cran.r-project.org/web/packages/vcfR/vignettes/visualization_1.html" rel="nofollow">https://cran.r-project.org/web/packages/vcfR/vignettes/visualization_1.html</a></p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28915/useful-bioinformatics-tools</guid>
	<pubDate>Mon, 29 Aug 2016 04:08:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28915/useful-bioinformatics-tools</link>
	<title><![CDATA[Useful Bioinformatics Tools]]></title>
	<description><![CDATA[<p>A collection of a few handy tools for bioinformaticians</p>
<p>http://molbiol-tools.ca/Convert.htm</p><p>Address of the bookmark: <a href="http://molbiol-tools.ca/Convert.htm" rel="nofollow">http://molbiol-tools.ca/Convert.htm</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29123/artemis-comparison-tool-act</guid>
	<pubDate>Wed, 07 Sep 2016 03:54:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29123/artemis-comparison-tool-act</link>
	<title><![CDATA[Artemis Comparison Tool (ACT)]]></title>
	<description><![CDATA[<p><span>ACT is a Java application for displaying pairwise comparisons between two or more DNA sequences. It can be used to identify and analyse regions of similarity and difference between genomes and to explore conservation of synteny, in the context of the entire sequences and their annotation.&nbsp;It can read complete EMBL,&nbsp;GENBANK and GFF entries or sequences in FASTA or raw format.&nbsp;</span></p><p>Address of the bookmark: <a href="http://www.sanger.ac.uk/science/tools/artemis-comparison-tool-act" rel="nofollow">http://www.sanger.ac.uk/science/tools/artemis-comparison-tool-act</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29280/nemo-%E2%80%93-a-stochastic-individual-base-genetically-explicit-simulation-platform</guid>
	<pubDate>Sat, 01 Oct 2016 14:45:02 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29280/nemo-%E2%80%93-a-stochastic-individual-base-genetically-explicit-simulation-platform</link>
	<title><![CDATA[Nemo – A stochastic, individual-based, genetically explicit simulation platform]]></title>
	<description><![CDATA[<ul>
<li>
<p>A&nbsp;<strong>recombination map</strong>&nbsp;has been added for all multi-locus traits. The map positions (chromosomal) for neutral markers (e.g. SNPs) and loci under selection (QTLs, deleterious mutations, DMIs) can now be specified explicitly, or set at random. The map can hold an unlimited number of loci of different types jointly, at any recombination scale (cM or lower). The effects of linkage can thus be finely explored.</p>
</li>
<li>
<p>A new trait coding for (Bateson-)<strong>Dobzhansky-Muller incompatibility loci</strong>. Multiple haploid or diploid pairs of incompatible loci can be spread throughout the genome and affect individual fitness.</p>
</li>
<li>
<p><strong>Multi-type selection</strong>:&nbsp;<a href="http://nemo2.sourceforge.net/classIndividual.html" title="This class contains traits along with other individual information (sex, pedigree, etc. ).">Individual</a>&nbsp;fitness can be jointly determined by different types of loci under selection, such as QTLs coding for quantitative traits under spatially variable selection, universally deleterious mutations, and Dobzhansky-Muller incompatibility loci.</p>
</li>
<li>
<p><strong>An unlimited number of quantitative traits</strong>&nbsp;under different forms of selection can be modelled, based on universally pleiotropic loci with several bi- or multi-allelic models.</p>
</li>
<li>
<p><strong>Spatial and temporal variation of selection</strong>&nbsp;on quantitative traits is possible, modelling shifts of environmental conditions over time.</p>
</li>
<li>
<p>The dispersal matrix describing the movement of individuals among sub-populations can be replaced by a connectivity matrix and a reduced dispersal matrix describing migration only among the connected sub-populations. This offers a substantial gain in computing time and system memory when simulating very large grids.</p>
</li>
<li>
<p>Input parameters' arguments may be specified in separate files. This is particularly convenient when specifying large matrices.</p>
</li>
<li>
<p>Many adjustments have been made for refined control of the input of parameters and data output. See updates in the manual.</p>
</li>
</ul><p>Address of the bookmark: <a href="http://nemo2.sourceforge.net/index.html" rel="nofollow">http://nemo2.sourceforge.net/index.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>