<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/26325?offset=130</link>
	<atom:link href="https://bioinformaticsonline.com/related/26325?offset=130" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27818/gaemr</guid>
	<pubDate>Tue, 14 Jun 2016 06:18:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27818/gaemr</link>
	<title><![CDATA[GAEMR]]></title>
	<description><![CDATA[<p>The&nbsp;<span>G</span>enome&nbsp;<span>A</span>ssembly&nbsp;<span>E</span>valuation&nbsp;<span>M</span>etrics and&nbsp;<span>R</span>eporting (GAEMR) package is an assembly analysis framework composed of a number of integrated modules. These modules can be executed as a single program to generate a complete analysis report, or executed individually to generate specific charts and tables. GAEMR standardizes input by converting a variety of read types to Binary Alignment Map (BAM) format, allowing a single input format to be entered into GAEMR&rsquo;s analysis pipeline, hence enabling the generation of standard reports.</p>
<p>GAEMR&rsquo;s analysis philosophy is centered on contiguity, correctness, and completeness: how many pieces an assembly is composed of, how accurately those pieces represent the genome sequenced, and how much of that genome is represented by those pieces. By performing over twenty different analyses based on these principles, GAEMR gives a clear picture of the condition of a genome assembly.&nbsp;</p><p>Address of the bookmark: <a href="https://www.broadinstitute.org/software/gaemr/" rel="nofollow">https://www.broadinstitute.org/software/gaemr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29130/gage-genome-assembly-gold-standard-evaluation</guid>
	<pubDate>Wed, 07 Sep 2016 07:35:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29130/gage-genome-assembly-gold-standard-evaluation</link>
	<title><![CDATA[GAGE : Genome Assembly Gold-standard Evaluation]]></title>
	<description><![CDATA[<p><span>GAGE is an evaluation of the very latest large-scale genome assembly algorithms. We have organized this "bake-off" as an attempt to produce a realistic assessment of genome assembly software in a rapidly changing field of next-generation sequencing. The main results of GAGE have now been published in the journal Genome Research:&nbsp;</span><a href="http://genome.cshlp.org/content/early/2012/01/12/gr.131383.111">GAGE: A critical evaluation of genome assemblies and assembly algorithms</a><span>.</span></p>
<p><span>http://genome.cshlp.org/content/early/2012/01/12/gr.131383.111</span></p><p>Address of the bookmark: <a href="http://gage.cbcb.umd.edu/index.html" rel="nofollow">http://gage.cbcb.umd.edu/index.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30012/swalo</guid>
	<pubDate>Wed, 30 Nov 2016 05:06:05 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30012/swalo</link>
	<title><![CDATA[SWALO]]></title>
	<description><![CDATA[<p>SWALO (scaffolding with assembly likelihood optimization) is a method for scaffolding based on likelihood of genome assemblies computed using generative models for sequencing.</p>
<p><a href="https://atifrahman.github.io/SWALO/swalo-0.9.7-beta.tar.gz"><strong>Download</strong></a></p>
<p><strong>Git repository of SWALO is at <a href="https://github.com/atifrahman/SWALO">https://github.com/atifrahman/SWALO</a>.</strong></p><p>Address of the bookmark: <a href="https://atifrahman.github.io/SWALO/" rel="nofollow">https://atifrahman.github.io/SWALO/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30093/velvet-tutorial</guid>
	<pubDate>Fri, 09 Dec 2016 04:19:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30093/velvet-tutorial</link>
	<title><![CDATA[Velvet tutorial]]></title>
	<description><![CDATA[<p><span>The objective of this activity is to help you understand how to run&nbsp;</span><a href="http://evomics.org/resources/software/genomics-software/assembly/velvet/" title="Velvet">Velvet</a><span>&nbsp;in general, how to accurately estimate the insert size of a paired-end library through the use of&nbsp;</span><a href="http://evomics.org/resources/software/genomics-software/assembly/bowtie/" title="Bowtie">Bowtie</a><span>, the primary parameters of velvet, and the process involved in producing a&nbsp;</span><em>de novo</em><span>&nbsp;assembly from Illumina reads.</span></p>
<p>http://evomics.org/learning/assembly-and-alignment/velvet/</p><p>Address of the bookmark: <a href="http://evomics.org/learning/assembly-and-alignment/velvet/" rel="nofollow">http://evomics.org/learning/assembly-and-alignment/velvet/</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30214/megamerge-a-tool-to-merge-assembled-contigs-long-reads-from-metagenomic-sequencing-runs</guid>
	<pubDate>Mon, 19 Dec 2016 09:42:15 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30214/megamerge-a-tool-to-merge-assembled-contigs-long-reads-from-metagenomic-sequencing-runs</link>
	<title><![CDATA[MeGAMerge: A tool to merge assembled contigs, long reads from metagenomic sequencing runs]]></title>
	<description><![CDATA[<p>MeGAMerge</p>
<p>MeGAMerge (A tool to merge assembled contigs, long reads from metagenomic sequencing runs)</p>
<p>Description</p>
<p>MeGAMerge is a Perl-based wrapper/tool that accepts any number of sequence (FASTA) files containing assembled contigs of any length in Multi-FASTA format and produces an improved contig set using OLC-based assembly. All overlap parameters (Minimum Overlap Length, Identity, etc.) are user-declarable at runtime. It is written to run on Linux.</p>
<p>Requirements:</p>
<p>You will need to have the following tools installed and in $PATH, or added to $binpath in the tool:</p>
<p>Newbler (specifically runAssembly)<br>Minimus2 (part of AMOS, also requires MUMmer)</p><p>Address of the bookmark: <a href="https://github.com/LANL-Bioinformatics/MeGAMerge" rel="nofollow">https://github.com/LANL-Bioinformatics/MeGAMerge</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/30440/genome-assembly-tools-and-software-part2</guid>
	<pubDate>Tue, 27 Dec 2016 16:14:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/30440/genome-assembly-tools-and-software-part2</link>
	<title><![CDATA[Genome Assembly Tools and Software - PART2 !!]]></title>
	<description><![CDATA[<p>Genome assemblers generally take a file of short sequence reads and a file of quality values as input. Since the quality-value file for high-throughput short reads is usually highly memory-intensive, only a few assemblers use it, so choose the one best suited for your assembly. To save memory and make the data easier to query, high-throughput short-read data is always initially formatted into a specific data structure. The data structures currently used for this purpose fall mainly into two categories: string-based models and graph-based models.</p><p>We therefore list many genome assembly tools here. Some are mainly reported for the assembly of genomes in general, while others are designed to handle complex genomes.</p><ul>
<li><a href="http://smithlabresearch.org/software/rmap/" title="RMAP 2.1 &ndash; Short-read Mapping">RMAP 2.1 &ndash; Short-read Mapping<br /></a><a href="http://smithlabresearch.org/software/rmap/" target="_blank">RMAP</a>&nbsp;aims to accurately map reads from next-generation sequencing technology. RMAP can map reads with or without error probability information (quality scores) and supports mapping of paired-end or bisulfite-treated reads. There are no limitations on read widths or number of mismatches. RMAP can now map more than 8 million reads in an hour at full sensitivity to 2 mismatches.<br /><br /></li>
<li><a href="https://sourceforge.net/p/mira-assembler/wiki/Home/" title="MIRA 4.0.2 &ndash; Whole Genome Shotgun and EST Sequence Assembler">MIRA 4.0.2 &ndash; Whole Genome Shotgun and EST Sequence Assembler<br /></a><a href="http://sourceforge.net/p/mira-assembler/wiki/Home/" target="_blank">MIRA</a>&nbsp;(Mimicking Intelligent Read Assembly) is a whole genome shotgun and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent and PacBio data (the latter at the moment only CCS and error-corrected CLR reads). It can be seen as a Swiss army knife of sequence assembly, developed and used over the past 12 years to get assembly jobs done efficiently &ndash; and especially accurately. That is, without actually putting too much manual work into finishing the assembly.<br /><br /></li>
<li><a href="http://www.brown.edu/Research/Istrail_Lab/hapcompass.php" title="HapCompass 0.7.7 &ndash; A Cycle-Basis Algorithm for Accurate Haplotype Assembly">HapCompass 0.7.7 &ndash; A Cycle-Basis Algorithm for Accurate Haplotype Assembly<br /></a><a href="http://www.brown.edu/Research/Istrail_Lab/hapcompass.php" target="_blank">HapCompass</a>&nbsp;for polyploid genomes can currently be used to create accurate pairwise SNP phasings. Given a set of aligned sequence reads in a SAM file and a set of variant calls in VCF format, HapCompass will assemble reads into haplotypes.<br /><br /></li>
<li><a href="http://www.csc.kth.se/~vezzi/software/" title="GAM-NGS 1.1b &ndash; Genome Assemblies Merger for Next Generation Sequencing">GAM-NGS 1.1b &ndash; Genome Assemblies Merger for Next Generation Sequencing<br /></a><a href="http://www.csc.kth.se/~vezzi/software/" target="_blank">GAM-NGS</a>&nbsp;merges two or more assemblies and returns an improved assembly (more contiguous and more correct). GAM-NGS shows its full potential with multi-library Illumina-based projects.<br /><br /></li>
<li><a href="http://omics.informatics.indiana.edu/GeneStitch/" title="GeneStitch 1.2.1 &ndash; Network Matching Algorithm to Gene Assembly">GeneStitch 1.2.1 &ndash; Network Matching Algorithm to Gene Assembly<br /></a><a href="http://omics.informatics.indiana.edu/GeneStitch/" target="_blank">GeneStitch</a>&nbsp;is a tool to assemble genes using a network matching algorithm. Given an already-assembled dataset, it is capable of assembling contigs together to form more complete genes with the help of a reference gene set. Currently the assembly software that GeneStitch supports is SOAPdenovo.<br /><br /></li>
<li><a href="http://bioen-compbio.bioen.illinois.edu/RACA/" title="RACA 0.9.1.1 &ndash; Reference-Assisted Chromosome Assembly">RACA 0.9.1.1 &ndash; Reference-Assisted Chromosome Assembly<br /></a><a href="http://bioen-compbio.bioen.illinois.edu/RACA/" target="_blank">RACA</a>&nbsp;is an algorithm to reliably order and orient sequence scaffolds generated by NGS and assemblers into longer chromosomal fragments using comparative genome information and paired-end reads.<br /><br /></li>
<li><a href="https://software.broadinstitute.org/software/discovar/blog/" title="DISCOVAR 51750 &ndash; Genome Shotgun Assembler and Variant Caller">DISCOVAR 51750 &ndash; Genome Shotgun Assembler and Variant Caller<br /></a><a href="http://www.broadinstitute.org/software/discovar/blog/" target="_blank">DISCOVAR</a>&nbsp;is a whole genome shotgun assembler and variant caller that can generate high quality assemblies and variant calls from the latest 250 base Illumina PCR-free fragment reads.<br /><br /></li>
<li><a href="http://www.seqan.de/projects/seqcons/" title="SeqCons 1.0 &ndash; de novo and reference-guided Sequence Assembly">SeqCons 1.0 &ndash; de novo and reference-guided Sequence Assembly<br /></a><a href="http://www.seqan.de/projects/seqcons/" target="_blank">&nbsp;SeqCons</a>&nbsp;(Sequence consensus) is an open source consensus computation program for Linux and Windows. The algorithm can be used for de novo and reference-guided sequence assembly.<br /><br /></li>
<li><a href="http://www.personal.psu.edu/jhm10/Vera/SoftwareC.html" title="SimAssemblyStage1/2 0.2 &ndash; Assembly Alignment of Contigs">SimAssemblyStage1/2 0.2 &ndash; Assembly Alignment of Contigs<br /></a><a href="http://www.personal.psu.edu/jhm10/Vera/SoftwareC.html" target="_blank">SimAssemblyStage1</a>: Perfectly aligns TranscriptSimulator reads to their nucleotide templates using read title information, creating an ideal simulated assembly of super contigs.<br /><br /></li>
<li><a href="http://www.csc.kth.se/~vezzi/software/" title="GapFiller &ndash; Closing the Gap within Paired Reads">GapFiller &ndash; Closing the Gap within Paired Reads<br /></a><a href="http://www.csc.kth.se/~vezzi/software/" target="_blank">GapFiller</a>&nbsp;is not a standard de novo assembler. It aims &ldquo;only&rdquo; at closing the gap between pairs of reads, as a first step for a large number of downstream analyses.<br /><br /></li>
<li><a href="http://www.sanger.ac.uk/science/tools/pagit" title="PAGIT 1.01 &ndash; Post Assembly Genome Improvement Toolkit">PAGIT 1.01 &ndash; Post Assembly Genome Improvement Toolkit<br /></a><a href="http://www.sanger.ac.uk/resources/software/pagit/" target="_blank">PAGIT</a>&nbsp;(Post Assembly Genome Improvement Toolkit) is a toolkit for automatically generating high-quality sequence by ordering contigs, closing gaps, correcting sequence errors and transferring annotation.<br /><br /></li>
<li><a href="https://www.bsse.ethz.ch/cbg/software.html" title="ShoRAH 0.8.2 &ndash; Short Reads Assembly into Haplotypes">ShoRAH 0.8.2 &ndash; Short Reads Assembly into Haplotypes<br /></a><a href="http://www.bsse.ethz.ch/cbg/software/shorah" target="_blank">ShoRAH</a>&nbsp;is a software package that allows for inference about the structure of a population from a set of short sequence reads as obtained from ultra-deep sequencing of a mixed sample. The package contains programs that support mapping of reads to a reference genome, correcting sequencing errors by locally clustering reads in small windows of the alignment, reconstructing a minimal set of global haplotypes that explain the reads, and estimating the frequencies of the inferred haplotypes.<br /><br /></li>
<li><a href="http://www.genomics.cn/en/navigation/show_navigation?nid=2732" title="RePS 2.0 &ndash; WGS Sequence Assembler">RePS 2.0 &ndash; WGS Sequence Assembler<br /></a><a href="http://www.genomics.cn/en/navigation/show_navigation?nid=2732" target="_blank">RePS</a>&nbsp;(Repeat-masked Phrap with scaffolding) is a WGS sequence assembler that explicitly identifies exact k-mer repeats from the shotgun data and removes them prior to the assembly. The established software Phrap is used to compute meaningful error probabilities for each base. Clone-end-pairing information is used to construct scaffolds that order and orient the contigs. The updated version of RePS incorporates some of the ideas introduced by Phusion on clustering<br /><br /></li>
<li><a href="http://bibiserv2.cebitec.uni-bielefeld.de/sessionTimeout.jsf" title="treecat &ndash; Phylogenetic Comparative Assembly">treecat &ndash; Phylogenetic Comparative Assembly<br /></a><a href="http://bibiserv2.cebitec.uni-bielefeld.de/cgcat?id=cgcat_treecat" target="_blank">treecat</a>&nbsp;(phylogenetic tree based contig arrangement tool) takes several genomes and their relationships in a phylogenetic tree into account to estimate a possible ordering of the contigs.<br /><br /></li>
<li><a href="http://alumni.cs.ucr.edu/~liw/isolasso.html" title="IsoLasso 2.6.1 &ndash; A LASSO Regression Approach to RNA-Seq Based Transcriptome Assembly">IsoLasso 2.6.1 &ndash; A LASSO Regression Approach to RNA-Seq Based Transcriptome Assembly<br /></a><a href="http://alumni.cs.ucr.edu/~liw/isolasso.html" target="_blank">IsoLasso</a>&nbsp;is an algorithm to assemble transcripts and estimate their expression levels from RNA-Seq reads.<br /><br /></li>
<li><a href="http://alumni.cs.ucr.edu/~liw/cem.html" title="CEM 0.9.1 &ndash; Transcriptome Assembly and Isoform Expression Level Estimation from Biased RNA-Seq Reads">CEM 0.9.1 &ndash; Transcriptome Assembly and Isoform Expression Level Estimation from Biased RNA-Seq Reads<br /></a><a href="http://alumni.cs.ucr.edu/~liw/cem.html" target="_blank">CEM</a>&nbsp;is an algorithm to assemble transcripts and estimate their expression levels from RNA-Seq reads.<br /><br /></li>
<li><a href="http://alan.cs.gsu.edu/NGS/?q=malta" title="MaLTA &ndash; Transcriptome Assembly and Quantification from Ion Torrent RNA-Seq data">MaLTA &ndash; Transcriptome Assembly and Quantification from Ion Torrent RNA-Seq data<br /></a><a href="http://alan.cs.gsu.edu/NGS/?q=malta" target="_blank">MaLTA</a>&nbsp;is a method for simultaneous transcriptome assembly and quantification from Ion Torrent RNA-Seq data.<br /><br /></li>
<li><a href="http://amos.sourceforge.net/wiki/index.php/AMOS" title="AMOS 3.1.0 &ndash; Whole Genome Shotgun Assembler">AMOS 3.1.0 &ndash; Whole Genome Shotgun Assembler<br /></a><a href="http://amos.sourceforge.net/wiki/index.php/AMOS" target="_blank">AMOS</a>&nbsp;(<strong>A</strong>&nbsp;<strong>M</strong>odular,&nbsp;<strong>O</strong>pen-<strong>S</strong>ource)&nbsp;consortium is committed to the development of open-source whole genome assembly software. The project acronym (AMOS) represents our primary goal: to produce A Modular, Open-Source whole genome assembler. Open-source so that everyone is welcome to contribute and help build outstanding assembly tools, and modular in nature so that new contributions can be easily inserted into an existing assembly pipeline. This modular design will foster the development of new assembly algorithms and allow the AMOS project to continually grow and improve in hopes of eventually becoming a widely accepted and deployed assembly infrastructure. In this sense, AMOS is both a design philosophy and a software system.<br /><br /></li>
<li><a href="http://amos.sourceforge.net/wiki/index.php/AutoEditor" title="AutoEditor 1.20 &ndash; Automated Correction of Genome Sequence Errors">AutoEditor 1.20 &ndash; Automated Correction of Genome Sequence Errors<br /></a><a href="http://amos.sourceforge.net/wiki/index.php/AutoEditor" target="_blank">AutoEditor</a>&nbsp;is a tool for correcting sequencing and basecaller errors using sequence assembly and chromatogram data. On average AutoEditor corrects 80% of erroneous base calls, with an accuracy of 99.99%. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery.<br /><br /></li>
<li><a href="http://www.csd.uwo.ca/~ilie/SAGE/" title="SAGE &ndash; String Graph Assembly of GEnomes">SAGE &ndash; String Graph Assembly of GEnomes<br /></a><a href="http://www.csd.uwo.ca/~ilie/SAGE/" target="_blank">SAGE</a>&nbsp;is a new string-overlap graph-based de novo genome assembler.<br /><br /></li>
<li><a href="http://omega.omicsbio.org/" title="Omega 1.0.2 &ndash; Overlap-graph de novo Assembler for Metagenomics">Omega 1.0.2 &ndash; Overlap-graph de novo Assembler for Metagenomics<br /></a><a href="http://omega.omicsbio.org/" target="_blank">Omega</a>&nbsp;is software for assembling and scaffolding Illumina sequencing data of microbial communities.<br /><br /></li>
<li><a href="http://www.compgenome.org/TCGA-Assembler/" title="TCGA-Assembler 1.0.3 &ndash; Open-Source Software for Retrieving and Processing TCGA Data">TCGA-Assembler 1.0.3 &ndash; Open-Source Software for Retrieving and Processing TCGA Data<br /></a><a href="http://www.compgenome.org/TCGA-Assembler/" target="_blank">TCGA-Assembler</a>&nbsp;is an open-source, freely available tool that automatically downloads, assembles, and processes public The Cancer Genome Atlas (TCGA) data, to facilitate downstream data analysis by relieving investigators from the burdens of data preparation.<br /><br /></li>
<li><a href="http://sammate.sourceforge.net/" title="SAMMate 2.7.4 / assemblySAM 1.1 &ndash;  Processing Short Read Alignments in SAM/BAM format / RNA-Seq Assembly and Analysis">SAMMate 2.7.4 / assemblySAM 1.1 &ndash; Processing Short Read Alignments in SAM/BAM format / RNA-Seq Assembly and Analysis<br /></a>
<p><a href="http://sammate.sourceforge.net/" target="_blank">SAMMate</a>&nbsp;is an open source GUI software suite to process RNA-Seq data. It is composed of two modules: assemblySAM and SAMMate.</p>
<p>assemblySAM employs a novel method to localize and assemble RNA-seq reads into RNA transcript sequences.<br /><br /></p>
</li>
<li><a href="http://www.cs.tau.ac.il/~bchor/StringGraph/" title="StringGraph beta &ndash; String Graph Construction Using Incremental Hashing">StringGraph beta &ndash; String Graph Construction Using Incremental Hashing<br /></a><a href="http://www.cs.tau.ac.il/~bchor/StringGraph/" target="_blank">StringGraph</a>&nbsp;is a novel, hash based method for constructing the string graph.<br /><br /></li>
<li><a href="http://mindthegap.genouest.org/" title="MindTheGap 1.0.0 &ndash; Detection and Assembly of Insertion Variants">MindTheGap 1.0.0 &ndash; Detection and Assembly of Insertion Variants<br /></a><a href="http://mindthegap.genouest.org/" target="_blank">MindTheGap</a>&nbsp;is software that performs detection and assembly of DNA insertion variants in NGS read datasets with respect to a reference genome.<br /><br /></li>
<li><a href="http://cbcb.umd.edu/software/metAMOS" title="MetAMOS 1.5rc3 &ndash; Metagenomic Assembly pipeline for AMOS">MetAMOS 1.5rc3 &ndash; Metagenomic Assembly pipeline for AMOS<br /></a><a href="http://cbcb.umd.edu/software/metAMOS" target="_blank">MetAMOS</a>&nbsp;is an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations.<br /><br /></li>
<li><a href="http://impact.crhc.illinois.edu/projects.aspx#tiger" title="TIGER &ndash; DNA Sequence Assembly">TIGER &ndash; DNA Sequence Assembly<br /></a><a href="http://impact.crhc.illinois.edu/projects.aspx#tiger" target="_blank">Tiger</a>&nbsp;is a novel de novo assembly framework &nbsp;which adapts to available computing resources by iteratively decomposing the assembly problem into sub-problems.<br /><br /></li>
<li><a href="https://github.com/baoe/AlignGraph" title="AlignGraph &ndash; Secondary de novo Genome Assembly guided by closely related References">AlignGraph &ndash; Secondary de novo Genome Assembly guided by closely related References<br /></a><a href="https://github.com/baoe/AlignGraph" target="_blank">AlignGraph</a>&nbsp;is software that extends and joins contigs or scaffolds by reassembling them with help provided by a reference genome of a closely related organism.<br /><br /></li>
<li><a href="http://compbio.cs.toronto.edu/hapsembler/scarpa.html" title="scarpa 0.241 &ndash; Scaffolding Reads with Practical Algorithms">scarpa 0.241 &ndash; Scaffolding Reads with Practical Algorithms<br /></a><a href="http://compbio.cs.toronto.edu/hapsembler/scarpa.html" target="_blank">Scarpa</a>&nbsp;is a stand-alone scaffolding tool for NGS data. It can be used together with virtually any genome assembler and any NGS read mapper that supports SAM format. Other features include support for multiple libraries and an option to estimate insert size distributions from data.<br /><br /></li>
<li><a href="http://genetics.cs.ucla.edu/vga/" title="VGA v1 &ndash; Viral Genome Assembler">VGA v1 &ndash; Viral Genome Assembler<br /></a><a href="http://genetics.cs.ucla.edu/vga/" target="_blank">VGA</a>&nbsp;is a method for accurate assembly of a heterogeneous viral population consisting of individual viral genomes (also known as quasi-species).<br /><br /></li>
<li><a href="https://cbcl.ics.uci.edu//doku.php/software#genomix" title="Genomix 0.2.11 &ndash; Parallel Genome Assembly using Hyracks">Genomix 0.2.11 &ndash; Parallel Genome Assembly using Hyracks<br /></a><a href="https://cbcl.ics.uci.edu//doku.php/software#genomix" target="_blank">Genomix</a>&nbsp;is a parallel genome assembly system built from the ground up with scalability in mind. It can assemble large and high-coverage genomes from fastq files in a short time and produces assemblies similar to Velvet or Ray in quality.<br /><br /></li>
<li><a href="http://shendurelab.github.io/LACHESIS/" title="LACHESIS &ndash; Genome Assembly with Contact Probability Maps">LACHESIS &ndash; Genome Assembly with Contact Probability Maps<br /></a><a href="http://shendurelab.github.io/LACHESIS/" target="_blank">LACHESIS</a>&nbsp;is a method that exploits contact probability map data (e.g. from Hi-C) for chromosome-scale de novo genome assembly.<br /><br /></li>
<li><a href="http://www.cmbb.arizona.edu/?page_id=312" title="KGBassembler 1.2 &ndash; Karyotype-based Genome Assembler for Brassicaceae Species">KGBassembler 1.2 &ndash; Karyotype-based Genome Assembler for Brassicaceae Species<br /></a><a href="http://www.cmbb.arizona.edu/?page_id=312" target="_blank">KGBassembler</a>&nbsp;(Brassicaceae genome assembler) is a C++ based tool for assembling contigs and/or scaffolds to full chromosomes based on the karyotype maps of Brassicaceae species and without the need of genetic and physical maps.<br /><br /></li>
<li><a href="https://sourceforge.net/projects/autoassemblyd/" title="AutoAssemblyD 0.1 &ndash; Graphical User Interface system for several Genome Assembler">AutoAssemblyD 0.1 &ndash; Graphical User Interface system for several Genome Assembler<br />The&nbsp;</a><a href="http://sourceforge.net/projects/autoassemblyd/" target="_blank">AssemblyD</a>&nbsp;is software that performs local and remote genome assembly with several assemblers, based on an XML template that can replace the long command lines required by most assemblers.<a href="http://www.mybiosoftware.com/autoassemblyd-0-1-graphical-user-interface-system-for-several-genome-assembler.html" title="AutoAssemblyD 0.1 &ndash; Graphical User Interface system for several Genome Assembler"><br /><br /></a></li>
<li><a href="http://bio.cs.put.poznan.pl/programs/519227629dfb89a7fa000001" title="SR-ASM &ndash; DNA Assembly of the Short Sequences coming from 454 sequencer">SR-ASM &ndash; DNA Assembly of the Short Sequences coming from 454 sequencer<br /></a><a href="http://bio.cs.put.poznan.pl/programs/519227629dfb89a7fa000001" target="_blank">SR-ASM</a>&nbsp;(Short Reads ASseMbly) algorithm is designed for DNA assembly of the short sequences coming from 454 sequencers.<a href="http://www.mybiosoftware.com/sr-asm-dna-assembly-short-sequences-coming-454-sequencer.html" title="SR-ASM &ndash; DNA Assembly of the Short Sequences coming from 454 sequencer"><br /><br /></a></li>
<li><a href="http://www.bx.psu.edu/miller_lab/" title="YASRA 2.33 &ndash; Yet Another Short Read Assembler">YASRA 2.33 &ndash; Yet Another Short Read Assembler<br /></a><a href="http://www.bx.psu.edu/miller_lab/" target="_blank">YASRA</a>&nbsp;performs comparative assembly of short reads using a reference genome, which can differ substantially from the genome being sequenced.<a href="http://www.mybiosoftware.com/yasra-2-32-short-read-assembler.html" title="YASRA 2.33 &ndash; Yet Another Short Read Assembler"><br /><br /></a></li>
<li><a href="http://derisilab.ucsf.edu/software/price/index.html" title="PRICE 1.2 &ndash; de novo Genome Assembler">PRICE 1.2 &ndash; de novo Genome Assembler<br /></a><a href="http://derisilab.ucsf.edu/software/price/index.html" target="_blank">PRICE</a>&nbsp;(Paired-Read Iterative Contig Extension) is a de novo genome assembler implemented in C++. Its name describes the strategy that it implements for genome assembly: PRICE uses paired-read information to iteratively increase the size of existing contigs. Initially, those contigs can be individual reads from a subset of the paired-read dataset, non-paired reads from sequencing technologies that provide non-paired data, or contigs that were output from a prior run of PRICE or any other&nbsp;<a href="http://www.mybiosoftware.com/price-0-18-de-novo-genome-assembler.html" title="PRICE 1.2 &ndash; de novo Genome Assembler"><br /><br /></a></li>
<li><a href="https://sc932.github.com/ALE/" title="ALE 20130717 &ndash; Assembly Likelihood Estimator">ALE 20130717 &ndash; Assembly Likelihood Estimator<br /></a><a href="http://sc932.github.com/ALE/" target="_blank">ALE</a>&nbsp;is a probabilistic framework for determining the likelihood of an assembly given the data (raw reads) used to assemble it. It allows for the rapid discovery of errors and comparisons between similar assemblies.<a href="http://www.mybiosoftware.com/ale-assembly-likelihood-estimator.html" title="ALE 20130717 &ndash; Assembly Likelihood Estimator"><br /><br /></a></li>
<li><a href="https://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE" title="SSPACE 3.0 &ndash; Scaffolding pre-assembled Contigs using Paired-read data">SSPACE 3.0 &ndash; Scaffolding pre-assembled Contigs using Paired-read data<br /></a><a href="http://www.baseclear.com/lab-products/bioinformatics-tools/sspace-standard/" target="_blank">SSPACE</a>&nbsp;(SSAKE-based Scaffolding of Pre-Assembled Contigs after Extension) is a stand-alone program for scaffolding pre-assembled contigs using paired-read data. It is unique in offering the possibility to manually control the scaffolding process. By using the distance information of paired-end and/or matepair data, SSPACE is able to assess the order, distance and orientation of your contigs and combine them into scaffolds. Currently we offer this as a command-line tool in Perl. The input data is given by pre-assembled contig sequences (FASTA) and NGS paired-read data (FASTA or FASTQ). The final scaffolds are provided in FASTA format.<a href="http://www.mybiosoftware.com/sspace-1-2-scaffolding-pre-assembled-contigs-paired-read-data.html" title="SSPACE 3.0 &ndash; Scaffolding pre-assembled Contigs using Paired-read data"><br /><br /></a></li>
<li><a href="http://www.sanger.ac.uk/science/tools/image" title="IMAGE 2.4.1 &ndash; Iterative Mapping and Assembly for Gap Elimination">IMAGE 2.4.1 &ndash; Iterative Mapping and Assembly for Gap Elimination<br /></a><a href="http://www.sanger.ac.uk/resources/software/pagit/#IMAGE" target="_blank">IMAGE</a>&nbsp;(Iterative Mapping and Assembly for Gap Elimination) is software designed to close gaps in any draft assembly using Illumina paired-end reads. IMAGE is best described in several stages: aligning Illumina reads at contig ends; locally assembling reads into new contigs; extending or merging reference contigs; and iterating the whole process to extend and merge more contigs.<a href="http://www.mybiosoftware.com/image-2-3-iterative-mapping-assembly-gap-elimination.html" title="IMAGE 2.4.1 &ndash; Iterative Mapping and Assembly for Gap Elimination"><br /><br /></a></li>
<li><a href="https://www.hgsc.bcm.edu/software/atlas-gapfill" title="ATLAS GapFill 2.2 &ndash; Deals with the Repetitive Gap Assembly problem">ATLAS GapFill 2.2 &ndash; Deals with the Repetitive Gap Assembly problem<br /></a><a href="https://www.hgsc.bcm.edu/software/atlas-gapfill" target="_blank">ATLAS GapFill</a>&nbsp;deals with the repetitive gap assembly problem by using the unique gap-flanking sequences to group reads and convert the problem to a local assembly task. Localizing the assembly reduces the numbers of repeats in the assembly, allows more data to be incorporated, and allows for gaps to be filled.<a href="http://www.mybiosoftware.com/atlas-gapfill-2-2-deals-repetitive-gap-assembly-problem.html" title="ATLAS GapFill 2.2 &ndash; Deals with the Repetitive Gap Assembly problem"><br /><br /></a></li>
<li><a href="https://www.hgsc.bcm.edu/software/atlas-whole-genome-assembly-suite" title="Atlas 2005 &ndash; Whole Genome Assembly Suite">Atlas 2005 &ndash; Whole Genome Assembly Suite<br /></a><a href="https://www.hgsc.bcm.edu/software/atlas-whole-genome-assembly-suite" target="_blank">Atlas</a>&nbsp;is a collection of software tools to facilitate the assembly of large genomes from whole genome shotgun reads, or a combination of whole genome shotgun reads and BAC or other localized reads.<a href="http://www.mybiosoftware.com/atlas-2005-genome-assembly-suite.html" title="Atlas 2005 &ndash; Whole Genome Assembly Suite"><br /><br /></a></li>
<li><a href="http://bio.math.berkeley.edu/cgal/" title="CGAL 0.9.6b &ndash; Computing Genome Assembly Likelihoods">CGAL 0.9.6b &ndash; Computing Genome Assembly Likelihoods<br /></a><a href="http://bio.math.berkeley.edu/cgal/" target="_blank">CGAL</a>&nbsp;is a tool for computing genome assembly likelihoods. It computes the likelihood of reads with respect to the assembly and a statistical model which can be used as a metric for evaluating assemblies.<a href="http://www.mybiosoftware.com/cgal-0-9-6-computing-genome-assembly-likelihoods.html" title="CGAL 0.9.6b &ndash; Computing Genome Assembly Likelihoods"><br /><br /></a></li>
<li><a href="https://github.com/lh3/fermi" title="Fermi 1.1 &ndash; WGS de novo Assembler based on the FMD-index for large Genomes">Fermi 1.1 &ndash; WGS de novo Assembler based on the FMD-index for large Genomes<br /></a><a href="https://github.com/lh3/fermi" target="_blank">Fermi</a>&nbsp;is a de novo assembler for Illumina reads from whole-genome shotgun sequencing. It also provides tools for error correction, sequence-to-read alignment and comparison between read sets. It uses the FMD-index, a novel compressed data structure, as the key data&nbsp;<a href="http://www.mybiosoftware.com/fermi-1-1-wgs-de-novo-assembler-based-on-the-fmd-index-for-large-genomes.html" title="Fermi 1.1 &ndash; WGS de novo Assembler based on the FMD-index for large Genomes"><br /><br /></a></li>
<li><a href="http://pasha.sourceforge.net/homepage.htm#latest" title="PASHA 1.0.10 &ndash; Parallelized Short Read Assembly">PASHA 1.0.10 &ndash; Parallelized Short Read Assembly<br /></a><a href="http://pasha.sourceforge.net/" target="_blank">PASHA</a>&nbsp;is a parallel short read assembler for large genomes using de Bruijn graphs. Taking advantage of both shared-memory multi-core CPUs and distributed-memory compute clusters, PASHA has demonstrated its potential to perform high-quality de-novo assembly of large genomes in reasonable time with modest computing resources. Our evaluation using three small real paired-end datasets shows that PASHA is able to produce better assemblies with comparable genome coverage and mis-assembly rates compared to three leading assemblers: Velvet, ABySS and SOAPdenovo. Moreover, PASHA achieves the fastest speed for all three datasets on a single CPU.<a href="http://www.mybiosoftware.com/pasha-1-0-5-parallelized-short-read-assembly.html" title="PASHA 1.0.10 &ndash; Parallelized Short Read Assembly"><br /><br /></a></li>
<li><a href="http://xgenovo.dna.bio.keio.ac.jp/" title="XGenovo &ndash; Extended Genovo Metagenomic Assembler by Incorporating Paired-End Information">XGenovo &ndash; Extended Genovo Metagenomic Assembler by Incorporating Paired-End Information<br /></a><a href="http://xgenovo.dna.bio.keio.ac.jp/" target="_blank">XGenovo</a>&nbsp;(Extended Genovo) is an extended genovo metagenomic assembler by incorporating paired-end information<a href="http://www.mybiosoftware.com/xgenovo-extended-genovo-metagenomic-assembler-by-incorporating-paired-end-information.html" title="XGenovo &ndash; Extended Genovo Metagenomic Assembler by Incorporating Paired-End Information"><br /><br /></a></li>
<li><a href="http://metavelvet.dna.bio.keio.ac.jp/" title="MetaVelvet 1.2.01 / MetaVelvet-SL &ndash; An Extension of Velvet Assembler to de novo Metagenomic Assembly / utilizing Supervised Learning">MetaVelvet 1.2.01 / MetaVelvet-SL &ndash; An Extension of Velvet Assembler to de novo Metagenomic Assembly / utilizing Supervised Learning<br /></a><a href="http://metavelvet.dna.bio.keio.ac.jp/" target="_blank">MetaVelvet</a>&nbsp;is an extension of Velvet assembler to de novo metagenome assembly from short sequence reads<a href="http://www.mybiosoftware.com/metavelvet-1-2-01-metavelvet-sl-an-extension-of-velvet-assembler-to-de-novo-metagenomic-assembly-utilizing-supervised-learning.html" title="MetaVelvet 1.2.01 / MetaVelvet-SL &ndash; An Extension of Velvet Assembler to de novo Metagenomic Assembly / utilizing Supervised Learning"><br /><br /></a></li>
<li><a href="http://www.genomic.ch/edena.php" title="Edena v3.131028 &ndash; De Novo Short Reads Assembler">Edena v3.131028 &ndash; De Novo Short Reads Assembler<br /></a><a href="http://www.genomic.ch/edena.php" target="_blank">Edena</a>&nbsp;is an assembler dedicated to process the millions of very short reads produced by the Illumina Genome Analyzer<a href="http://www.mybiosoftware.com/edena-v3-dev110920-de-novo-short-reads-assembler.html" title="Edena v3.131028 &ndash; De Novo Short Reads Assembler"><br /><br /></a></li>
<li><a href="https://github.com/gramarga/ConPADE" title="ConPADE 1.00 &ndash; Contig Ploidy and Allele Dosage Estimation">ConPADE 1.00 &ndash; Contig Ploidy and Allele Dosage Estimation<br /></a><a href="http://research.microsoft.com/en-us/downloads/62815951-4b89-47a5-9e3d-7054182dafbb/default.aspx" target="_blank">ConPADE</a>&nbsp;is a tool used to estimate contig ploidy and allele dosage in polyploid genome assemblies.<a href="http://www.mybiosoftware.com/conpade-1-00-contig-ploidy-and-allele-dosage-estimation.html" title="ConPADE 1.00 &ndash; Contig Ploidy and Allele Dosage Estimation"><br /><br /></a></li>
<li><a href="https://sourceforge.net/projects/eloper/" title="ELOPER 1.2 &ndash; Elongation of Paired-end Reads for de novo Assembly">ELOPER 1.2 &ndash; Elongation of Paired-end Reads for de novo Assembly<br /></a><a href="http://sourceforge.net/projects/eloper/" target="_blank">ELOPER</a>&nbsp;is a pre-processing tool for paired-end sequences that produces a better read library for assembly programs.<a href="http://www.mybiosoftware.com/eloper-1-2-elongation-of-paired-end-reads-for-de-novo-assembly.html" title="ELOPER 1.2 &ndash; Elongation of Paired-end Reads for de novo Assembly"><br /><br /></a></li>
<li><a href="http://www.ebi.ac.uk/~zerbino/oases/" title="Oases 0.2.08 &ndash; De novo Transcriptome Assembler for very short reads">Oases 0.2.08 &ndash; De novo Transcriptome Assembler for very short reads<br /></a><a href="http://www.ebi.ac.uk/~zerbino/oases/" target="_blank">Oases</a>&nbsp;is designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms. It achieves this by using an array of hash lengths, a dynamic filtering of noise, a robust resolution of alternative splicing events, and the efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo&nbsp;<a href="http://www.mybiosoftware.com/oases-0-2-06-de-novo-transcriptome-assembler-short-reads.html" title="Oases 0.2.08 &ndash; De novo Transcriptome Assembler for very short reads"><br /><br /></a></li>
<li><a href="http://www.physics.rutgers.edu/~anirvans/SOPRA/" title="SOPRA 1.4.6 &ndash; Statistical Optimization of Paired Read Assembly">SOPRA 1.4.6 &ndash; Statistical Optimization of Paired Read Assembly<br /></a><a href="http://www.physics.rutgers.edu/~anirvans/SOPRA/" target="_blank">SOPRA</a>&nbsp;is an assembler for mate pair/paired-end reads from high throughput sequencing platforms, e.g. Illumina and SOLiD.<a href="http://www.mybiosoftware.com/sopra-1-4-6-statistical-optimization-paired-read-assembly.html" title="SOPRA 1.4.6 &ndash; Statistical Optimization of Paired Read Assembly"><br /><br /></a></li>
<li><a href="http://rnc.r.dendai.ac.jp/hapAssembly.html" title="hapAssembly &ndash; Haplotype Assembly from Whole-Genome Sequence Data">hapAssembly &ndash; Haplotype Assembly from Whole-Genome Sequence Data<br /></a><a href="http://rnc.r.dendai.ac.jp/hapAssembly.html" target="_blank">hapAssembly</a>&nbsp;&nbsp;beats the previous best for the important Haplotype Assembly Problem. It is&nbsp;an approach to finding optimal solutions for the haplotype assembly problem under the minimum-error-correction (MEC) model.<a href="http://www.mybiosoftware.com/hapassembly-haplotype-assembly-whole-genome-sequence-data.html" title="hapAssembly &ndash; Haplotype Assembly from Whole-Genome Sequence Data"><br /><br /></a></li>
<li><a href="https://code.google.com/archive/p/pbsim/" title="PBSIM 1.0.3 &ndash; PacBio Reads Simulator">PBSIM 1.0.3 &ndash; PacBio Reads Simulator<br /></a>PacBio sequencers produced two types of characteristic reads: CCS (short and low error rate) and CLR (long and high error rate), both of which could be useful for de novo assembly of genomes.&nbsp;<a href="https://code.google.com/p/pbsim/" target="_blank">PBSIM</a>&nbsp;simulates those PacBio reads by using either a model-based or sampling-based simulation.<a href="http://www.mybiosoftware.com/pbsim-1-0-3-pacbio-reads-simulator.html" title="PBSIM 1.0.3 &ndash; PacBio Reads Simulator"><br /><br /></a></li>
<li><a href="http://marte.ic.unicamp.br:8747/" title="SIS &ndash; Generate Draft Genome Sequence Scaffolds for Prokaryotes">SIS &ndash; Generate Draft Genome Sequence Scaffolds for Prokaryotes<br /></a><a href="http://marte.ic.unicamp.br:8747/" target="_blank">SIS</a>&nbsp;(Scaffolds from Inversion Signatures) is a new easy-to-use tool to generate contig scaffolds.<a href="http://www.mybiosoftware.com/sis-generate-draft-genome-sequence-scaffolds-prokaryotes.html" title="SIS &ndash; Generate Draft Genome Sequence Scaffolds for Prokaryotes"><br /><br /></a></li>
<li><a href="https://www.cs.helsinki.fi/group/scaffold/normalizedN50/" title="NN50-calculator 0.5 &ndash; Evaluate the Correctness of Genome Assemblies">NN50-calculator 0.5 &ndash; Evaluate the Correctness of Genome Assemblies<br /></a><a href="http://www.cs.helsinki.fi/group/scaffold/normalizedN50/" target="_blank">NN50-calculator</a>&nbsp;(Normalized N50 calculator) is a tool for evaluating the correctness of genome assemblies.<a href="http://www.mybiosoftware.com/nn50-calculator-0-5-evaluate-correctness-genome-assemblies.html" title="NN50-calculator 0.5 &ndash; Evaluate the Correctness of Genome Assemblies"><br /><br /></a></li>
<li><a href="http://josephryan.github.io/baa.pl/" title="Baa.pl 0.20 &ndash; use BLAT to ASSESS an ASSEMBLY">Baa.pl 0.20 &ndash; use BLAT to ASSESS an ASSEMBLY<br /></a><a href="http://josephryan.github.io/baa.pl/" target="_blank">Baa.pl</a>&nbsp;is a simple script that parses the output of a BLAT run of a transcriptome vs. a genome assembly.<a href="http://www.mybiosoftware.com/baa-pl-0-10-blat-assess-assembly.html" title="Baa.pl 0.20 &ndash; use BLAT to ASSESS an ASSEMBLY"><br /><br /></a></li>
<li><a href="http://compbio.cs.toronto.edu/hapsembler/index.html" title="hapsembler 2.21 &ndash; Haplotype-specific Genome Assembly Toolkit">hapsembler 2.21 &ndash; Haplotype-specific Genome Assembly Toolkit<br /></a><a href="http://compbio.cs.toronto.edu/hapsembler/index.html" target="_blank">Hapsembler</a>&nbsp;is a haplotype-specific genome assembly toolkit that is designed for genomes that are rich in SNPs and other types of polymorphism. Hapsembler can be used to assemble reads from a variety of platforms including Illumina and Roche/454.<a href="http://www.mybiosoftware.com/hapsembler-2-1-haplotype-specific-genome-assembly-toolkit.html" title="hapsembler 2.21 &ndash; Haplotype-specific Genome Assembly Toolkit"><br /><br /></a></li>
<li><a href="http://alan.cs.gsu.edu/NGS/?q=content/vispa" title="ViSpA 02 &ndash; Viral Spectrum Assembler">ViSpA 02 &ndash; Viral Spectrum Assembler<br /></a><a href="http://alan.cs.gsu.edu/NGS/?q=content/vispa" target="_blank">ViSpA</a>&nbsp;(Viral Spectrum Assembling) implements novel methods for viral assembly and frequency estimation. The software combines simple error correction, assembly of viral variants based on maximum-bandwidth paths in weighted read graphs, and frequency estimation via Expectation Maximization on all reads.<a href="http://www.mybiosoftware.com/vispa-01-viral-spectrum-assembler.html" title="ViSpA 02 &ndash; Viral Spectrum Assembler"><br /><br /></a></li>
<li><a href="http://www.vicbioinformatics.com/software.velvetoptimiser.shtml" title="VelvetOptimiser 2.2.5 &ndash; Automatically Optimise Velvet Assembler Parameters">VelvetOptimiser 2.2.5 &ndash; Automatically Optimise Velvet Assembler Parameters<br /></a><a href="http://www.vicbioinformatics.com/software.velvetoptimiser.shtml" target="_blank">VelvetOptimiser</a>&nbsp;is a multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler.<a href="http://www.mybiosoftware.com/velvetoptimiser-2-2-5-automatically-optimise-velvet-assembler-parameters.html" title="VelvetOptimiser 2.2.5 &ndash; Automatically Optimise Velvet Assembler Parameters"><br /><br /></a></li>
<li><a href="http://www.vicbioinformatics.com/software.assemblet.shtml" title="Assemblet 0.1 &ndash; Antigenic Variation Assembler">Assemblet 0.1 &ndash; Antigenic Variation Assembler<br /></a><a href="http://www.vicbioinformatics.com/software.assemblet.shtml" target="_blank">Assemblet</a>&nbsp;is a short read assembler for assembling antigenic variant sequences in bacteria.<a href="http://www.mybiosoftware.com/assemblet-0-1-antigenic-variation-assembler.html" title="Assemblet 0.1 &ndash; Antigenic Variation Assembler"><br /><br /></a></li>
<li><a href="http://www.vicbioinformatics.com/software.velvetk.shtml" title="VelvetK 20120606 &ndash; Find a reasonable K-mer size to Assemble Genome Reads with Velvet">VelvetK 20120606 &ndash; Find a reasonable K-mer size to Assemble Genome Reads with Velvet<br /></a><a href="http://www.vicbioinformatics.com/software.velvetk.shtml" target="_blank">VelvetK</a>&nbsp;can estimate the best k-mer size to use for your Velvet de novo assembly. It needs two inputs: the estimated genome size, and all your sequence read files. The genome size can be supplied as a number (e.g. 3.5M) or as a FASTA file of a closely related genome.<a href="http://www.mybiosoftware.com/velvetk-20120606-find-reasonable-k-mer-size-assemble-genome-reads-velvet.html" title="VelvetK 20120606 &ndash; Find a reasonable K-mer size to Assemble Genome Reads with Velvet"><br /><br /></a></li>
<li><a href="http://www.vicbioinformatics.com/software.vague.shtml" title="VAGUE 1.0.5 &ndash; Velvet Assembler Graphical User Environment">VAGUE 1.0.5 &ndash; Velvet Assembler Graphical User Environment<br /></a><a href="http://www.vicbioinformatics.com/software.vague.shtml" target="_blank">VAGUE</a>&nbsp;(Velvet Assembler Graphical User Environment) is a GUI for the&nbsp;<a href="http://www.mybiosoftware.com/assembly-tools/3852">Velvet</a>&nbsp;de novo assembler.<a href="http://www.mybiosoftware.com/vague-1-0-5-velvet-assembler-graphical-user-environment.html" title="VAGUE 1.0.5 &ndash; Velvet Assembler Graphical User Environment"><br /><br /></a></li>
<li><a href="http://pritchardlab.stanford.edu/software.html" title="Transcriptome Assembler &ndash; Transcriptome Assembly used in RNA-seq of 16 Mammalian Species">Transcriptome Assembler &ndash; Transcriptome Assembly used in RNA-seq of 16 Mammalian Species<br /></a><a href="http://pritchardlab.stanford.edu/software.html" target="_blank">Transcriptome Assembler</a>&nbsp;is a software for transcriptome assembly used in RNA-seq of 16 mammalian species.<a href="http://www.mybiosoftware.com/transcriptome-assembler-transcriptome-assembly-rna-seq-16-mammalian-species.html" title="Transcriptome Assembler &ndash; Transcriptome Assembly used in RNA-seq of 16 Mammalian Species"><br /><br /></a></li>
<li><a href="http://bio.codeplex.com/wikipage?title=sequenceassembler&amp;referringTitle=sampleapps&amp;ANCHOR#sampleapps" title="BioSequenceAssembler 2.0 &ndash; Microsoft Research Sequence Assembler">BioSequenceAssembler 2.0 &ndash; Microsoft Research Sequence Assembler<br /></a><a href="http://bio.codeplex.com/wikipage?title=sequenceassembler&amp;referringTitle=sampleapps&amp;ANCHOR#sampleapps" target="_blank">BioSequenceAssembler</a>&nbsp;is intended for use by biologists and laboratory technicians who are responsible for managing next-generation genomic sequencing data for alignment, assembly, and/or BLAST identification.<a href="http://www.mybiosoftware.com/biosequenceassembler-2-0-microsoft-research-sequence-assembler.html" title="BioSequenceAssembler 2.0 &ndash; Microsoft Research Sequence Assembler"><br /><br /></a></li>
<li><a href="http://www.imperial.ac.uk/bioinformatics-data-science-group" title="BugBuilder &ndash; Microbial Genome Assembly">BugBuilder &ndash; Microbial Genome Assembly<br /></a><a href="http://www3.imperial.ac.uk/bioinfsupport/resources/software/bugbuilder" target="_blank">BugBuilder</a>&nbsp;is a pipeline for the automated assembly and annotation of microbial genomes from high-throughput sequence data. It is configurable so as not to be tied to any assembler or scaffolder, and is designed to run in a cluster environment facilitating high-throughput processing of genomes.<a href="http://www.mybiosoftware.com/bugbuilder-microbial-genome-assembly.html" title="BugBuilder &ndash; Microbial Genome Assembly"><br /></a></li>
<li><a href="http://maximuspipeline.sourceforge.net/main/">MAXIMUS 0.2 &ndash; Hybrid Reference and de novo Assembly pipeline</a><br /><a href="http://maximuspipeline.sourceforge.net/main/" target="_blank">MAXIMUS</a>&nbsp;is a genome assembly pipeline which takes the best out of multiple reference assemblies and de novo assembly. The benefits of this approach include better assembled repetitive regions, fewer gaps and higher accuracy for the resultant assembly.<a href="http://www.mybiosoftware.com/maximus-0-2-hybrid-reference-de-novo-assembly-pipeline.html" title="MAXIMUS 0.2 &ndash; Hybrid Reference and de novo Assembly pipeline"><br /><br /></a></li>
<li><a href="http://www.bcgsc.ca/about/pubann/the-issake-short-read-sequence-assembly-approach-for-profiling-t-cell-metagenomes" title="ISSAKE &ndash; Short Read Sequence Assembly">ISSAKE &ndash; Short Read Sequence Assembly<br /></a><a href="http://www.bcgsc.ca/about/pubann/the-issake-short-read-sequence-assembly-approach-for-profiling-t-cell-metagenomes" target="_blank">iSSAKE</a>&nbsp;(immuno-SSAKE) is a sequencing approach and assembly software for profiling T-cell metagenomes using short reads from the massively parallel sequencing platforms.<a href="http://www.mybiosoftware.com/issake-short-read-sequence-assembly.html" title="ISSAKE &ndash; Short Read Sequence Assembly"><br /><br /></a></li>
<li><a href="http://www.animalgenome.org/tools/beap/" title="IDBA / IDBA-UD 1.1.1 &ndash; De Bruijn Graph De Novo Assembler with Highly Uneven Sequencing Depth">IDBA / IDBA-UD 1.1.1 &ndash; De Bruijn Graph De Novo Assembler with Highly Uneven Sequencing Depth<br /></a><a href="http://i.cs.hku.hk/~alse/hkubrg/projects/idba/index.html" target="_blank">&nbsp;IDBA</a>&nbsp;is a practical iterative De Bruijn Graph De Novo Assembler for sequence assembly in bioinformatics. Most assemblers based on de Bruijn graphs build a single graph with one specific k to perform the assembly, so choosing that value of k is crucial. If k is too large, there will be a lot of gap problems in the graph. If k is too small, there will be a lot of branch problems. IDBA uses not one specific k but a range of k values to build the iterative de Bruijn graph, keeping the information from the graphs with different k values. This is intended to make it perform better than assemblers that rely on a single k.<a href="http://www.mybiosoftware.com/idba-ud-1-09-de-bruijn-graph-de-novo-assembler-highly-uneven-sequencing-depth.html" title="IDBA / IDBA-UD 1.1.1 &ndash; De Bruijn Graph De Novo Assembler with Highly Uneven Sequencing Depth"><br /><br /></a></li>
<li><a href="https://code.google.com/archive/p/est2assembly/" title="est2assembly 1.13 &ndash; Assembly and Annotation of Transcriptomes for any Species">est2assembly 1.13 &ndash; Assembly and Annotation of Transcriptomes for any Species<br />The&nbsp;</a><a href="https://code.google.com/p/est2assembly/" target="_blank">est2assembly</a>&nbsp;platform is the only platform for standardising transcriptome projects: go from raw trace files to an annotated GBrowse interface driven by the Seqfeature database. It accepts both Sanger and 454 sequencing technology for a denovo assembly, annotation and data mining of EST data.<a href="http://www.mybiosoftware.com/est2assembly-1-13-assembly-annotation-transcriptomes-species.html" title="est2assembly 1.13 &ndash; Assembly and Annotation of Transcriptomes for any Species"><br /><br /></a></li>
<li><a href="https://code.google.com/archive/p/curtain/" title="Curtain 0.2.3 beta &ndash; Assembling large Genomes from Short Read Sequences">Curtain 0.2.3 beta &ndash; Assembling large Genomes from Short Read Sequences<br /></a><a href="https://code.google.com/p/curtain/" target="_blank">Curtain</a>&nbsp;is an assembler for next-generation sequence data. It is a Java wrapper around next-generation assemblers such as Velvet, which allows the incremental introduction of read-pair information into the assembly process.<a href="http://www.mybiosoftware.com/curtain-0-2-3-beta-assembling-large-genomes-short-read-sequences.html" title="Curtain 0.2.3 beta &ndash; Assembling large Genomes from Short Read Sequences"><br /><br /></a></li>
<li><a href="http://www.comp.nus.edu.sg/~bioinfo/peasm/PE_manual.htm" title="PEAssember 1.2 &ndash; A de novo Genome Assembler">PEAssember 1.2 &ndash; A de novo Genome Assembler<br /></a><a href="http://www.comp.nus.edu.sg/~bioinfo/peasm/PE_manual.htm" target="_blank">PEAssember</a>&nbsp;is a parallel de novo genome assembler for small &ndash; mid sized genomes.<a href="http://www.mybiosoftware.com/peassember-1-2-de-novo-genome-assembler.html" title="PEAssember 1.2 &ndash; A de novo Genome Assembler"><br /><br /></a></li>
<li><a href="https://sourceforge.net/projects/contrail-bio/" title="Contrail 0.8.2 &ndash; Assembly of Large Genomes using Cloud Computing">Contrail 0.8.2 &ndash; Assembly of Large Genomes using Cloud Computing<br /></a><a href="http://contrail-bio.sourceforge.net/" target="_blank">Contrail</a>&nbsp;is a Hadoop based genome assembler for assembling large genomes in the clouds<a href="http://www.mybiosoftware.com/contrail-0-8-2-assembly-large-genomes-cloud-computing.html" title="Contrail 0.8.2 &ndash; Assembly of Large Genomes using Cloud Computing"><br /><br /></a></li>
<li><a href="http://www.mybiosoftware.com/beap-0-6-beta-blast-extension-assembly-program.html" title="BEAP 0.6 beta &ndash; Blast Extension and Assembly Program">BEAP 0.6 beta &ndash; Blast Extension and Assembly Program<br />The&nbsp;</a><a href="http://www.animalgenome.org/tools/beap/" target="_blank">BEAP</a>&nbsp;is a computer program that uses a short starting DNA fragment, often an EST or partial gene segment, as a &ldquo;primer&rdquo; to recursively blast nucleotide databases in an attempt to obtain all sequences that overlap, directly or indirectly, with the &ldquo;primer&rdquo;, thereby helping to &ldquo;extend&rdquo; the length of the original sequence for constructing a &ldquo;full length&rdquo; sequence for functional analysis, or at least to obtain neighboring regions of the segment for SNP discovery and linkage disequilibrium&nbsp;<a href="http://www.mybiosoftware.com/beap-0-6-beta-blast-extension-assembly-program.html" title="BEAP 0.6 beta &ndash; Blast Extension and Assembly Program"><br /><br /></a></li>
<li><a href="http://manuals.bioinformatics.ucr.edu/home/branch" title="BRANCH 1.8.1 &ndash; boosting RNA-Seq Assemblies with Partial or related Genomic Sequences">BRANCH 1.8.1 &ndash; boosting RNA-Seq Assemblies with Partial or related Genomic Sequences<br /></a><a href="http://manuals.bioinformatics.ucr.edu/home/branch" target="_blank">BRANCH</a>&nbsp;is software that extends de novo transfrags and identifies novel transfrags using DNA contigs or genes of closely related species. BRANCH discovers novel exons first and then extends/joins fragmented de novo transfrags, so that the resulting transfrags are more complete.<a href="http://www.mybiosoftware.com/branch-1-8-1-boosting-rna-seq-assemblies-partial-related-genomic-sequences.html" title="BRANCH 1.8.1 &ndash; boosting RNA-Seq Assemblies with Partial or related Genomic Sequences"><br /><br /></a></li>
<li><a href="http://www.cbcb.umd.edu/software/quake/">Quake 0.3.5 &ndash; Detect &amp; Correct Substitution Sequencing Errors in WGS Data Sets</a><br />
<p><a href="http://www.cbcb.umd.edu/software/quake/" target="_blank">Quake</a>&nbsp;is a package to correct substitution sequencing errors in experiments with deep coverage (e.g. &gt;15X), specifically intended for Illumina sequencing reads. Quake adopts the k-mer error correction framework, first introduced by the EULER genome assembly package. Unlike EULER and similar programs, Quake utilizes a robust mixture model of erroneous and genuine k-mer distributions to determine where errors are located. Then Quake uses read quality values and learns the nucleotide to nucleotide error rates to determine what types of errors are most likely. This leads to more corrections and greater accuracy, especially with respect to avoiding mis-corrections, which create false sequence dissimilar to anything in the original genome sequence from which the read was taken.</p>
</li>
<li><a href="http://www.ebi.ac.uk/~zerbino/velvet/" title="Velvet 1.2.10 &ndash; Sequence Assembler for Very Short Reads">Velvet 1.2.10 &ndash; Sequence Assembler for Very Short Reads<br /></a><a href="http://www.ebi.ac.uk/~zerbino/velvet/" target="_blank">Velvet</a>&nbsp;is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Velvet currently takes in short read sequences, removes errors, then produces high-quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs.<a href="http://www.mybiosoftware.com/velvet-1-1-07-sequence-assembler-short-reads.html" title="Velvet 1.2.10 &ndash; Sequence Assembler for Very Short Reads"><br /><br /></a></li>
<li><a href="http://www.complex.iastate.edu/download/Lucy2/index.html" title="Lucy 2.20 &ndash; DNA Sequence Quality &amp; Vector Trimming">Lucy 2.20 &ndash; DNA Sequence Quality &amp; Vector Trimming<br /></a><a href="http://www.complex.iastate.edu/download/Lucy2/index.html" target="_blank">Lucy</a>&nbsp;has been used for several years to clean sequence data from automated DNA sequencers prior to sequence assembly and other downstream uses. &nbsp;The quality trimming portion of lucy makes use of phred quality scores, such as those produced by many automated sequencers based on the Sanger sequencing method. &nbsp;As such, lucy&rsquo;s quality trimming may not be appropriate for sequence data produced by some of the new &ldquo;next-generation&rdquo; sequencers.<a href="http://www.mybiosoftware.com/lucy-2-19p-r8-dna-sequence-quality-vector-trimming.html" title="Lucy 2.20 &ndash; DNA Sequence Quality &amp; Vector Trimming"><br /><br /></a></li>
<li><a href="http://bioinfo.bti.cornell.edu/tool/iAssembler/">iAssembler 1.3.2 &ndash; de novo Assembly of Roche-454/Sanger Transcriptome Sequences</a><br /><a href="http://bioinfo.bti.cornell.edu/tool/iAssembler/" target="_blank">iAssembler</a>&nbsp;is a standalone package to assemble ESTs generated using Sanger and/or Roche-454 pyrosequencing technologies into contigs.<a href="http://www.mybiosoftware.com/iassembler-1-3-2-de-novo-assembly-roche-454sanger-transcriptome-sequences.html" title="iAssembler 1.3.2 &ndash; de novo Assembly of Roche-454/Sanger Transcriptome Sequences"><br /><br /></a></li>
<li><a href="http://www.broadinstitute.org/software/gaemr/" title="GAEMR 1.0.1 &ndash; Assembly Analysis Framework">GAEMR 1.0.1 &ndash; Assembly Analysis Framework<br /></a><a href="http://www.broadinstitute.org/software/gaemr/" target="_blank">GAEMR</a>&nbsp;(Genome Assembly Evaluation Metrics and Reporting) is a complete genome analysis package that helps you evaluate and report on a genome assembly&rsquo;s completeness, correctness, and contiguity.<a href="http://www.mybiosoftware.com/gaemr-1-0-1-assembly-analysis-framework.html" title="GAEMR 1.0.1 &ndash; Assembly Analysis Framework"><br /><br /></a></li>
<li><a href="https://mulcyber.toulouse.inra.fr/plugins/mediawiki/wiki/pyrocleaner/index.php/Main_Page" title="PyroCleaner 1.3 &ndash; Clean 454 Pyrosequencing Reads in order to ease the Assembly Process">PyroCleaner 1.3 &ndash; Clean 454 Pyrosequencing Reads in order to ease the Assembly Process<br />The&nbsp;</a><a href="https://mulcyber.toulouse.inra.fr/plugins/mediawiki/wiki/pyrocleaner/index.php/Main_Page" target="_blank">pyrocleaner</a>&nbsp;is intended to clean the reads included in the sff file in order to ease the assembly process. It filters sequences on criteria such as length, complexity, number of undetermined bases (which has been shown to correlate with poor quality), and multiple-copy reads. It can also clean paired-end sff files, producing one sff file with the validated paired ends and another with the sequences that can be used as shotgun reads.<a href="http://www.mybiosoftware.com/pyrocleaner-1-3-clean-454-pyrosequencing-reads-order-ease-assembly-process.html" title="PyroCleaner 1.3 &ndash; Clean 454 Pyrosequencing Reads in order to ease the Assembly Process"><br /><br /></a></li>
<li><a href="http://bioinformatics.rutgers.edu/Software/SLiQ/" title="SLiQ &ndash; Simple linear Inequalities based Mate-Pair reads Filtering and Scaffolding">SLiQ &ndash; Simple linear Inequalities based Mate-Pair reads Filtering and Scaffolding<br /></a><a href="http://bioinformatics.rutgers.edu/Software/SLiQ/" target="_blank">SLIQ&nbsp;</a>, a set of simple linear inequalities derived from the geometry of contigs on the line, can be used to predict the relative positions and orientations of contigs from individual mate pair reads and thus produce a contig digraph.<a href="http://www.mybiosoftware.com/sliq-simple-linear-inequalities-based-mate-pair-reads-filtering-scaffolding.html" title="SLiQ &ndash; Simple linear Inequalities based Mate-Pair reads Filtering and Scaffolding"><br /><br /></a></li>
<li><a href="http://bioinf.spbau.ru/en/rectangles" title="rectangles 2.0 &ndash; Rectangle Graph for Repeat Resolution in Genome Assembly">rectangles 2.0 &ndash; Rectangle Graph for Repeat Resolution in Genome Assembly<br /></a><a href="http://bioinf.spbau.ru/en/rectangles" target="_blank">rectangles</a>&nbsp;is an ultimate tool for resolving repeats in genome assemblies.<a href="http://www.mybiosoftware.com/rectangles-2-0-rectangle-graph-repeat-resolution-genome-assembly.html" title="rectangles 2.0 &ndash; Rectangle Graph for Repeat Resolution in Genome Assembly"><br /><br /></a></li>
<li><a href="http://archive.broadinstitute.org/crd/wiki/index.php/Arachne_Main_Page" title="Arachne 4.6233 &ndash; Whole-genome Shotgun Assembler">Arachne 4.6233 &ndash; Whole-genome Shotgun Assembler<br /></a><a href="http://www.broadinstitute.org/crd/wiki/index.php/Arachne_Main_Page" target="_blank">ARACHNE</a>&nbsp;is a program for assembling data from whole genome shotgun sequencing experiments. It was designed for long reads from Sanger sequencing technology, and has been used extensively to assemble many genomes, including many that are large and highly repetitive.<a href="http://www.mybiosoftware.com/arachne-3-2-whole-genome-shotgun-assembler.html" title="Arachne 4.6233 &ndash; Whole-genome Shotgun Assembler"><br /><br /></a></li>
<li><a href="http://terpconnect.umd.edu/~ALEKSEYZ/PhrapUMDV2/" title="Reconciliator 2.0 &ndash; The tool for Merging Assemblies">Reconciliator 2.0 &ndash; The tool for Merging Assemblies<br /></a><a href="http://terpconnect.umd.edu/~ALEKSEYZ/PhrapUMDV2/" target="_blank">Reconciliator</a>&nbsp;is the tool for merging assemblies.<a href="http://www.mybiosoftware.com/reconciliator-2-0-tool-merging-assemblies.html" title="Reconciliator 2.0 &ndash; The tool for Merging Assemblies"><br /><br /></a></li>
<li><a href="http://terpconnect.umd.edu/~ALEKSEYZ/PhrapUMDV2/" title="PhrapUMD 2 &ndash; Modified version of Phrap">PhrapUMD 2 &ndash; Modified version of Phrap<br /></a><a href="http://www.glue.umd.edu/~ALEKSEYZ/PhrapUMDV2" target="_blank">Phrap UMD</a>&nbsp;consists of the UMD Trimmer, the UMD Overlapper and a modified version of Phrap. It is capable of assembling data downloaded directly from the NCBI Trace Archive. The pipeline runs in 3 stages. First, the vector ends of the reads are examined and the vector is found; the reads are then trimmed for vector and quality. Next, the trimmed reads are fed into the 5-pass UMD Overlapper, which finds the overlaps, corrects base-caller errors and performs additional trimming if necessary. Finally, the trimmed and error-corrected reads and the overlaps are input into the modified version of Phrap, which only puts reads together if they overlap according to the list of overlaps produced by the UMD Overlapper.<a href="http://www.mybiosoftware.com/phrapumd-2-modified-version-phrap.html" title="PhrapUMD 2 &ndash; Modified version of Phrap"><br /><br /></a></li>
<li><a href="http://www.dna-dragon.com/" title="DNA Dragon 1.5.6 build1 &ndash; DNA Sequence Contig Assembler Software">DNA Dragon 1.5.6 build1 &ndash; DNA Sequence Contig Assembler Software<br /></a><a href="http://www.dna-dragon.com/" target="_blank">DNA Dragon</a>&nbsp;Contig Assembler assembles sequences, trace data (ABI, SCF, AB1), and Illumina and Roche 454 flowgrams into contigs. It is described as fast and accurate DNA sequence assembly software. Once the DNA sequences are assembled into contigs, trace data can be compared directly with nucleotide data. It also allows for proofreading and base editing.<a href="http://www.mybiosoftware.com/dna-dragon-1-2-7-dna-sequence-contig-assembler-software.html" title="DNA Dragon 1.5.6 build1 &ndash; DNA Sequence Contig Assembler Software"><br /></a></li>
</ul>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31353/concoct-clustering-contigs-with-coverage-and-composition</guid>
	<pubDate>Mon, 06 Mar 2017 04:08:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31353/concoct-clustering-contigs-with-coverage-and-composition</link>
	<title><![CDATA[CONCOCT: Clustering cONtigs with COverage and ComposiTion]]></title>
	<description><![CDATA[<p>A program for unsupervised binning of metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data from paired end reads.</p>
<p>Warning! This software is to be considered under development. Functionality and the user interface may still change significantly from one version to another. If you want to use this software, please stay up to date with the list of known issues: <a href="https://github.com/BinPro/CONCOCT/issues">https://github.com/BinPro/CONCOCT/issues</a></p><p>Address of the bookmark: <a href="https://github.com/BinPro/CONCOCT" rel="nofollow">https://github.com/BinPro/CONCOCT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</guid>
	<pubDate>Fri, 08 Jun 2018 09:31:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</link>
	<title><![CDATA[Jvarkit : Java utilities for Bioinformatics]]></title>
	<description><![CDATA[A collection of Java toolkits for bioinformatics work:

Jvarkit : Java utilities for Bioinformatics<p>Address of the bookmark: <a href="http://lindenb.github.io/jvarkit/" rel="nofollow">http://lindenb.github.io/jvarkit/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/1469/prime-minister%E2%80%99s-100k-genome-project</guid>
	<pubDate>Thu, 08 Aug 2013 09:40:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/1469/prime-minister%E2%80%99s-100k-genome-project</link>
	<title><![CDATA[Prime Minister’s 100k Genome Project]]></title>
	<description><![CDATA[<p>Genomics England is set to sequence the genomes of 100,000 patients in England over the next five years, a landmark project by the British government.</p><p>Genomics England will play a key role in building on the UK&rsquo;s long track record as a leader in medical science advances to push the boundaries by unlocking the power of DNA data. The UK will become the first ever country to introduce this technology in its mainstream health system &ndash; leading the global race for better tests, better drugs and above all better, more personalised care.</p><p>http://www.genomicsengland.co.uk/100k-genome-project/</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/7674/useful-publications-and-websites-for-deep-sequencing-data-analysis</guid>
	<pubDate>Sun, 29 Dec 2013 22:30:45 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/7674/useful-publications-and-websites-for-deep-sequencing-data-analysis</link>
	<title><![CDATA[Useful Publications and Websites for Deep Sequencing Data Analysis]]></title>
	<description><![CDATA[<h3>Global overview papers</h3><p>Next generation quantitative genetics in plants. Jim&eacute;nez-G&oacute;mez, Frontiers in Plant Science 2:77, 2011 <span style="text-decoration: underline;"><a href="http://www.frontiersin.org/Plant_Physiology/10.3389/fpls.2011.00077/full">Full Text</a> </span><em>[equally relevant to animal and microbial systems]</em></p><p>Sense from sequence reads: methods for alignment and assembly. Flicek &amp; Birney, Nat Methods 6(11 Suppl):S6-S12, 2009. <a href="http://www.nature.com/nmeth/journal/v6/n11s/full/nmeth.1376.html"><span style="text-decoration: underline;">Full Text</span></a></p><h3>Library construction and experimental design</h3><p>Statistical design and analysis of RNA sequencing data. Auer &amp; Doerge, Genetics 185(2):405-16, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881125"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Biases in Illumina transcriptome sequencing caused by random hexamer priming. Hansen et al., Nucleic Acids Res. 38(12): e131, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2896536"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries. Aird et al, Genome Biology 12:R18, 2011 <a href="http://genomebiology.com/2011/12/2/R18"><span style="text-decoration: underline;">Full Text</span></a></p><p>Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of GC-biased genomes. Kozarewa et al, Nature Methods 6(4):291-5, 2009 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2664327/"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Rohland &amp; Reich, Genome Research 22(5): 939&ndash;946. 
<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3337438/"><span style="text-decoration: underline;">PubMedCentral</span></a></p><h3>Data formats, data management, and alignment software tools<span style="text-decoration: underline;"> </span></h3><p>The Sequence Alignment/Map format and SAMtools. Li et al, Bioinformatics 25(16):2078-9, 2009 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723002"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>SAM format specification <a href="http://samtools.sourceforge.net/SAM1.pdf"><span style="text-decoration: underline;">file</span></a></p><p>Efficient storage of high throughput sequencing data using reference-based compression. Fritz et al, Genome Res 21(5):734-40, 2011. <a href="http://genome.cshlp.org/content/21/5/734.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Compression of DNA sequence reads in FASTQ format. Deorowicz &amp; Grabowski, Bioinformatics 27(6):860-2, 2011. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21252073"><span style="text-decoration: underline;">PubMed</span></a></p><p>Fast and accurate short read alignment with Burrows-Wheeler transform. Li &amp; Durbin, Bioinformatics 25(14):1754-60, 2009. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2705234"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Improving SNP discovery by base alignment quality. Li H, Bioinformatics 27(8):1157-8, 2011. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21320865"><span style="text-decoration: underline;">PubMed</span></a></p><p>BEDTools: a flexible suite of utilities for comparing genomic features. Quinlan and Hall, Bioinformatics 26:841-842, 2010. 
<a href="http://bioinformatics.oxfordjournals.org/content/26/6/841.full.pdf+html"><span style="text-decoration: underline;">Publisher Website</span></a></p><h3>Data quality assessment, filtering, and correction</h3><p>SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. Cox et al, BMC Bioinformatics 11:485, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2956736"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>TileQC: a system for tile-based quality control of Solexa data. Dolan &amp; Denver, BMC Bioinformatics 9:250, 2008 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2443380"><span style="text-decoration: underline;">PubMedCentral</span></a> <em>[requires a reference sequence]</em></p><p>Quake: quality-aware detection and correction of sequencing errors. Kelley et al, Genome Biol 11(11):R116, 2010. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21114842"> <span style="text-decoration: underline;">PubMed</span></a></p><p>FastQC: a quality control tool for high-throughput sequence data. <a href="http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/"><span style="text-decoration: underline;">Home Page</span></a></p><p>FASTX-toolkit: FASTQ/A short-reads pre-processing tools <a href="http://hannonlab.cshl.edu/fastx_toolkit/"><span style="text-decoration: underline;">Home Page</span></a></p><p>Reference-free validation of short read data. Schr&ouml;der et al, PLoS One 5(9):e12681, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2943903"> <span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Correction of sequencing errors in a mixed set of reads. Salmela, Bioinformatics 26(10):1284, 2010. <a href="http://bioinformatics.oxfordjournals.org/content/26/10/1284.long"><span style="text-decoration: underline;">Full Text</span></a> <em>[includes error correction of SOLiD reads in colorspace]</em></p><p>Repeat-aware modeling and correction of short read errors. 
Yang et al, BMC Bioinformatics 12(Suppl 1):S52, 2011 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044310"> <span style="text-decoration: underline;">PubMedCentral</span></a> <em>[requires a reference sequence]</em></p><p>HiTEC: accurate error correction in high-throughput sequencing data. Ilie et al, Bioinformatics 27(3):295, 2011 <a href="http://bioinformatics.oxfordjournals.org/content/27/3/295.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Error correction of high-throughput sequencing datasets with non-uniform coverage. Medvedev et al., Bioinformatics 27(13):i137-41, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117386"><span style="text-decoration: underline;">PubMedCentral</span></a></p><h3>De novo assembly</h3><p>Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Zerbino &amp; Birney, Genome Res 18(5):821-9, 2008. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2336801"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Assembly of large genomes using second-generation sequencing. Schatz et al, Genome Res 20(9):1165-73, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Gnerre et al, PNAS 108(4): 1513-18, 2011 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3029755"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Genome assembly has a major impact on gene content: a comparison of annotation in two <em>Bos taurus </em> assemblies. Florea&nbsp; et al., PLoS One 6(6):e21400, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3120881/"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. 
Carver et al, Bioinformatics 28(4):464 - 469, 2012 <span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278759/">PubMedCentral</a></span></p><p>Efficient de novo assembly of large genomes using compressed data structures. Simpson &amp; Durbin, Genome Research 22:549-556, 2012 <span style="text-decoration: underline;"><a href="http://genome.cshlp.org/content/22/3/549.full">Full Text</a></span> <em>[Describes the String Graph Assembler (SGA), which assembled a human genome in less than 6 days using 54 GB of RAM and a 123-processor compute cluster for calculation of an FM-index of the 1.2 billion reads]</em></p><p>Readjoiner: a fast and memory efficient string graph-based sequence assembler. Gonnella &amp; Kurtz, BMC Bioinformatics 13: 82, 2012 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507659"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Assemblathon 1: A competitive assessment of de novo short read assembly methods. Earl et al, Genome Research 21:2224-2241, 2011 <span style="text-decoration: underline;"><a href="http://genome.cshlp.org/content/early/2011/09/16/gr.126599.111.full.pdf+html">Full Text</a></span></p><h3>Chromatin immunoprecipitation analysis: ChIP-seq</h3><p>ChIP-seq: advantages and challenges of a maturing technology. Park, Nat Rev Genet. 10:669-80, 2009 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3191340/"><span style="text-decoration: underline;">PubMed</span></a></p><p>ChIP-seq and Beyond: new and improved methodologies to detect and characterize protein-DNA interactions. Furey, Nat Rev Genet 13: 840&ndash;852, 2012 <a href="http://www.nature.com/nrg/journal/v13/n12/full/nrg3306.html"> <span style="text-decoration: underline;">Publisher Web Site</span></a></p><p>MuMoD: a Bayesian approach to detect multiple modes of protein&ndash;DNA binding from genome-wide ChIP data. 
Narlikar, Nucleic Acids Res 41:21&ndash;32, 2013 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592440/"><span style="text-decoration: underline;">PubMed</span></a></p><h3>Transcriptome analysis</h3><h3>Assembly and comparison to genome</h3><p>Full-length transcriptome assembly from RNA-Seq data without a reference genome. Grabherr et al, Nature Biotechnology 29:644 - 652, 2011. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21572440"><span style="text-decoration: underline;">PubMed</span></a> <em>[The software is called <a href="http://trinityrnaseq.sourceforge.net/"><span style="text-decoration: underline;">Trinity</span></a>, and is available on Sourceforge.]</em></p><p>Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Peng et al, Nature Biotechnology 30:253 - 260, 2012. <span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pubmed/22327324">PubMed</a></span> <em>[Several comments on this paper question whether the reported differences are in fact evidence of editing or are simply sequencing errors - the authors stand by their conclusions, but the controversy demonstrates the importance of robust data analysis methods.] </em></p><p>Optimization of de novo transcriptome assembly from next-generation sequencing data. Surget-Groba &amp; Montoya-Burgos, Genome Res 20(10):1432-40, 2010. <a href="http://genome.cshlp.org/content/20/10/1432.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Rnnotator: an automated <em>de novo</em> transcriptome assembly pipeline from stranded RNA-Seq reads. Martin et al, BMC Genomics 11:663, 2010 <a href="http://www.biomedcentral.com/1471-2164/11/663"><span style="text-decoration: underline;">Full Text</span></a></p><p><em>De novo</em> assembly and analysis of RNA-seq data. 
Robertson et al, Nature Methods 7:909-912, 2010 <a href="http://www.nature.com/nmeth/journal/v7/n11/full/nmeth.1517.html"><span style="text-decoration: underline;">Full Text</span></a> <em>[describes Trans-ABySS, a pipeline to use the ABySS parallel assembler for de novo transcriptome analysis]</em></p><h3>Differential expression analysis</h3><p>R-SAP: a multi-threading computational pipeline for the characterization of high-throughput RNA-sequencing data. Mittal &amp; McDonald, Nucleic Acids Res, 2012 <span style="text-decoration: underline;"><a href="http://nar.oxfordjournals.org/content/early/2012/01/28/nar.gks047.long">Full Text</a></span></p><p>Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Mercer et al, Nature Biotechnology 30:99 - 104, 2012 <span style="text-decoration: underline;"><a href="http://www.nature.com/nbt/journal/v30/n1/full/nbt.2024.html"> Publisher Website</a></span></p><p>Differential gene and transcript expression analysis of RNA-Seq experiments with TopHat and Cufflinks. Trapnell et al, Nature Protocols 7:562 - 578, 2012 <span style="text-decoration: underline;"><a href="http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html"> Publisher Website</a></span></p><p>Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling. Łabaj et al, Bioinformatics 27:i383 - i391, 2011 <span style="text-decoration: underline;"><a href="http://bioinformatics.oxfordjournals.org/content/27/13/i383.full.pdf+html"> Full Text</a></span></p><p>Improving RNA-Seq expression estimates by correcting for fragment bias. Roberts et al, Genome Biol 12:R22, 2011 <span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3129672/">PubMed Central</a></span></p><p>Cloud-scale RNA-sequencing differential expression analysis with Myrna. 
Langmead et al, Genome Biol 11:R83, 2010 <a href="http://genomebiology.com/2010/11/8/R83"><span style="text-decoration: underline;">Full Text</span></a></p><p>From RNA-seq reads to differential expression results. Oshlack et al, Genome Biol 11(12):220, 2010 <a href="http://genomebiology.com/content/11/12/220"><span style="text-decoration: underline;">Full Text</span></a></p><p>DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Wang et al., Bioinformatics 26(1):136-8, 2010 <a href="http://www.ncbi.nlm.nih.gov/pubmed/19855105"><span style="text-decoration: underline;"> PubMed</span></a></p><p>DESeq: Differential expression analysis for sequence count data. Anders and Huber, Genome Biology 11:R106, 2010 <a href="http://genomebiology.com/2010/11/10/R106"><span style="text-decoration: underline;">Full Text</span></a></p><p>edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Robinson et al., Bioinformatics 26(1):139-40, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796818"> <span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Two-stage Poisson model for testing RNA-seq data. Auer and Doerge, SAGMB 10(1), article 26 <a href="http://www.bepress.com/sagmb/vol10/iss1/art26/"><span style="text-decoration: underline;">Full Text</span></a></p><p>Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments. McCormick et al., Silence 2(1):2, 2011 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3055805"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>RNA-Seq gene expression estimation with read mapping uncertainty. 
Li et al, Bioinformatics 26:493-500, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820677">PubMedCentral</a> <em>[describes the RSEM software package]</em></p><h3>Comparing genomes and assemblies; variant detection</h3><p>Versatile and open software for comparing large genomes. Kurtz et al, Genome Biol 5(2):R12, 2004. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC395750"><span style="text-decoration: underline;">PubMedCentral</span></a> <em>[describes the MUMmer software for full-genome alignment &amp; comparisons]</em></p><p>Searching for SNPs with cloud computing. Langmead et al, Genome Biol 10(11):R134, 2009 <a href="http://genomebiology.com/content/10/11/R134"><span style="text-decoration: underline;">Full Text</span></a></p><p>Calling SNPs without a reference sequence. Ratan et al, BMC Bioinformatics 11:130, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851604"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Microindel detection in short-read sequence data. Krawitz et al, Bioinformatics 26(6):722-9, 2010. <a href="http://bioinformatics.oxfordjournals.org/content/26/6/722.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>vipR: variant identification in pooled DNA using R. Altmann et al., Bioinformatics 27: i77-i84, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117388"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Geoseq: a tool for dissecting deep-sequencing datasets. Gurtowski et al, BMC Bioinformatics 11:506, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2972303/"><span style="text-decoration: underline;">PubMedCentral</span></a> <em>[Geoseq is a web service that allows searching deep sequencing datasets with a reference sequence of a gene of interest]</em></p><p>Detecting and annotating genetic variations using the HugeSeq pipeline. 
Lam et al, Nature Biotechnology 30:226 - 229, 2012 <span style="text-decoration: underline;"><a href="http://www.nature.com/nbt/journal/v30/n3/full/nbt.2134.html">Publisher Website</a></span>, <span style="text-decoration: underline;"><a href="http://hugeseq.snyderlab.org/">Home Page</a></span></p><p>Genome-wide LORE1 retrotransposon mutagenesis and high-throughput insertion detection in <em>Lotus japonicus</em>. Urbański et al, Plant J 64:731-741, 2012. <span style="text-decoration: underline;"><a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1365-313X.2011.04827.x/abstract">Publisher Website</a></span> <em>[This paper describes a 2-dimensional pooling strategy with barcoding to allow use of Illumina sequencing to screen for retrotransposon insertion mutations, and includes a software package called FSTpoolit for analysis of the resulting sequence reads.]</em></p><h3>Genotyping by sequencing</h3><p>Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Davey et al., Nat Rev Genet 12(7):499-510, 2011 <a href="http://www.ncbi.nlm.nih.gov/pubmed/21681211"><span style="text-decoration: underline;">PubMed</span></a> <em>[A review of methods available at the time]</em></p><p>A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. Elshire et al., PLoS One 6(5):e19379, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3087801"><span style="text-decoration: underline;">Full Text</span></a></p><p>Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. Poland et al., PLoS One 7(2): e32253, 2012. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3289635/"><span style="text-decoration: underline;">Full Text</span></a></p><p>Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. Peterson et al, PLoS One 7(5):e37135, 2012. 
<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3365034/"><span style="text-decoration: underline;">Full Text</span></a></p><p>Imputation of unordered markers and the impact on genomic selection accuracy. Rutkowski et al, G3 3(3):427-39, 2013. <a href="http://www.g3journal.org/content/3/3/427.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Diversity Arrays Technology (DArT) and next-generation sequencing combined: genome-wide, high-throughput, highly informative genotyping for molecular breeding of <em>Eucalyptus</em>. Sansaloni et al., BMC Proceedings 5(Suppl 7):P54, 2011 <span style="text-decoration: underline;"><a href="http://www.biomedcentral.com/1753-6561/5/S7/P54">Full Text</a></span></p><p>High-throughput genotyping by whole-genome resequencing. Huang et al., Genome Res 19(6):1068-76, 2009. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2694477"><span style="text-decoration: underline;">Full Text</span></a></p><p>Multiplexed shotgun genotyping for rapid and efficient genetic mapping. Andolfatto et al. Genome Res 21(4):610-7, 2011. <a href="http://genome.cshlp.org/content/21/4/610.long"><span style="text-decoration: underline;">Full Text</span></a></p><h3>Restriction-site Associated DNA (RAD) markers</h3><p>Rapid SNP discovery and genetic mapping using sequenced RAD markers. Baird et al, PLoS One 3(10):e3376, 2008 <span style="text-decoration: underline;"><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0003376">Full Text</a></span></p><p>Linkage mapping and comparative genomics using next-generation RAD sequencing of a non-model organism. Baxter et al., PLoS One 6(4):e19315, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3082572"><span style="text-decoration: underline;">Full Text</span></a></p><p>Genome evolution and meiotic maps by massively parallel DNA sequencing: spotted gar, an outgroup for the teleost genome duplication. Amores et al, Genetics 188(4):799-808, 2011. 
<a href="http://www.ncbi.nlm.nih.gov/pubmed/21828280"><span style="text-decoration: underline;"> PubMed</span></a></p><p>Construction and application for QTL analysis of a Restriction-site Associated DNA (RAD) linkage map in barley. Chutimanitsakun et al, BMC Genomics 4; 12:4, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3023751"><span style="text-decoration: underline;">Full Text</span></a></p><p>RAD tag sequencing as a source of SNP markers in <em>Cynara cardunculus </em>L. Scaglione et al., BMC Genomics 13:3, 2012. <span style="text-decoration: underline;"><a href="http://www.biomedcentral.com/1471-2164/13/3">Full Text</a></span></p><p>Paired-end RAD-seq for de novo assembly and marker design without available reference. Willing et al., Bioinformatics 27(16):2187-93, 2011. <a href="http://bioinformatics.oxfordjournals.org/content/27/16/2187.long"><span style="text-decoration: underline;">Publisher Website</span></a></p><p>Local de novo assembly of RAD paired-end contigs using short sequencing reads. Etter et al., PLOS ONE 6(4): e18561, 2011. <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0018561"><span style="text-decoration: underline;">Full Text</span></a></p><p>Stacks: building and genotyping loci de novo from short-read sequences. Catchen et al., G3: Genes, Genomes, Genetics, 1:171-182, 2011. <span style="text-decoration: underline;"> Full Text</span>, <a href="http://creskolab.uoregon.edu/stacks/"><span style="text-decoration: underline;">Home Page</span></a></p><p>Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads. Chong et al, Bioinformatics 28(21):2732-7, 2012. 
<a href="http://bioinformatics.oxfordjournals.org/content/28/21/2732.long"> <span style="text-decoration: underline;">Publisher Website</span></a></p><p>UK RAD Sequencing Wiki page, with bibliography and RADTools software download <a href="https://www.wiki.ed.ac.uk/display/RADSequencing/Home"><span style="text-decoration: underline;">Home Page</span></a></p><h3>Workspace environments</h3><p><span style="text-decoration: underline;">Papers</span></p><p>Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Goecks et al, Genome Biol 11(8):R86, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2945788"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Galaxy Cloudman: Delivering compute clusters. BMC Bioinformatics 11(Suppl. 12):S4, 2010 <a href="http://www.biomedcentral.com/content/pdf/1471-2105-11-S12-S4.pdf"><span style="text-decoration: underline;">Full Text</span></a></p><p><a href="http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit"><span style="text-decoration: underline;">The Genome Analysis Toolkit</span></a>: a MapReduce framework for analyzing next-generation DNA sequencing data. McKenna et al, Genome Res 20(9):1297-303, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928508"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>A framework for variation discovery and genotyping using next-generation DNA sequencing data. DePristo et al., Nat Genet 43(5):491-8, 2011. 
<a href="http://www.ncbi.nlm.nih.gov/pubmed/21478889"><span style="text-decoration: underline;"> PubMed</span></a></p><p><span style="text-decoration: underline;">Online resources</span></p><p>The <a href="http://cran.r-project.org/"><span style="text-decoration: underline;">R statistical computing</span></a> environment includes<a href="http://www.bioconductor.org/"><span style="text-decoration: underline;"> Bioconductor</span></a>, a specialized set of tools for analysis of microarray and high-throughput sequencing data. Introductory materials from on-line or short workshops are widely available online; examples are <span style="text-decoration: underline;"><a href="http://bioconductor.org/help/course-materials/2012/Evomics2012/Bioconductor-tutorial.pdf">Evomics2012 Bioconductor-tutorial.pdf</a></span>, and <a href="http://bcb.dfci.harvard.edu/%7Eaedin/courses/Bioconductor/"><span style="text-decoration: underline;">Intro to Bioconductor</span></a>. Materials from an advanced course on high-throughput genetic data analysis are at <span style="text-decoration: underline;"><a href="http://bioconductor.org/help/course-materials/2012/SeattleFeb2012/">Seattle 2012 materials</a></span>. Thomas Girke of UC-Riverside has written a very complete set of manuals describing the use of R and Bioconductor for analysis of genomic datasets, available at <a href="http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual">R and Bioconductor Manuals</a>. <br /> <a href="http://cran.r-project.org/manuals.html"><span style="text-decoration: underline;">Manuals</span></a> and contributed <a href="http://cran.r-project.org/other-docs.html"><span style="text-decoration: underline;">documentation</span></a> for R are available at the R-project.org website, and video tutorials are also available on Youtube; those posted by Tutorlol are brief, clear, and to the point. <br /> Materials from a series of mini-courses in R taught in 2010 at UCLA are available:</p><ul>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0141/10S-basicR.pdf">Intro to programming and graphics</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0143/S10_RProgII.pdf">Data manipulation and functions</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0185/Graphics_course.pdf">Graphics for exploratory data analysis</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0147/20100503_IntroStats.pdf">Introductory statistics</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0188/reg_R_1_09S_slides.pdf">Linear regression</a></li>
</ul><p><a href="http://a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/"><span style="text-decoration: underline;">A Little Book of R for Bioinformatics</span></a> is an online resource with information and exercises for practicing bioinformatics analysis of DNA sequences and other biological data in R. <br /> Many books on specific topics in R programming are also available through Amazon or other vendors.</p><h3>Cloud computing resources</h3><p>The case for cloud computing in genome informatics. Lincoln Stein, Genome Biol. <strong>11</strong>(5):207, 2010 <a href="http://www.ncbi.nlm.nih.gov/pubmed/20441614"><span style="text-decoration: underline;">PubMed</span></a></p><p>Galaxy CloudMan: delivering cloud compute clusters. Afgan et al, BMC Bioinformatics <strong>11</strong>(Suppl 12):S4, 2010 <a href="http://www.biomedcentral.com/1471-2105/11/S12/S4"><span style="text-decoration: underline;">Full Text</span></a></p><p><a href="http://cloudbiolinux.com/">CloudBioLinux</a> is an open-source project that provides a bioinformatics Linux system for cloud computing, pre-configured with a variety of software tools that are ready to use.</p><p>A <a href="https://github.com/chapmanb/cloudbiolinux/blob/master/doc/intro/gettingStarted_CloudBioLinux.pdf?raw=true"><span style="text-decoration: underline;">tutorial</span></a> on getting started with CloudBioLinux on the Amazon Web Services Elastic Compute Cloud (EC2)</p><p><a href="http://userwww.service.emory.edu/%7Eeafgan/content/ppt/EnisAfgan_BOSC_2010.pdf"><span style="text-decoration: underline;">Deploying Galaxy on the Cloud</span></a>: slides from a presentation by Enis Afgan (Emory University) at the Bioinformatics Open Source Conference in Boston, July 2010</p><p>A <a href="http://screencast.g2.bx.psu.edu/cloud/"><span style="text-decoration: underline;">screencast</span></a> that provides a step-by-step guide to starting a Galaxy cluster in the EC2 
environment</p><p>A <a href="https://bitbucket.org/galaxy/galaxy-central/wiki/cloud"><span style="text-decoration: underline;">webpage</span></a> that presents the same information in text form and is the basis for the screencast</p><p>The iPlant Collaborative, an NSF-funded project to create computational resources for plant biology research, provides access to cloud computing resources through <span style="text-decoration: underline;"><a href="http://www.iplantcollaborative.org/discover/atmosphere">Atmosphere</a></span></p><p>SeqWare Query Engine: storing and searching sequence data in the cloud. O'Connor et al, BMC Bioinformatics <strong>11</strong>(Suppl 12):S2, 2010 <a href="http://www.biomedcentral.com/1471-2105/11/S12/S2"><span style="text-decoration: underline;">Full Text</span></a></p><p>An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics. Taylor, BMC Bioinformatics <strong>11</strong>(Suppl 12):S1, 2010 <a href="http://www.biomedcentral.com/1471-2105/11/S12/S1"><span style="text-decoration: underline;">Full Text</span></a></p><h3>Links to Linux command-line tutorials and resources</h3><p>Tutorials for AWK, a powerful tool for handling data tables</p><ul>
<li>A set of <a href="http://people.bu.edu/scottm/AWK.NOTES"><span style="text-decoration: underline;">awk notes</span></a> from Boston University</li>
<li>Bruce Barnett's <a href="http://www.grymoire.com/Unix/Awk.html"><span style="text-decoration: underline;">awk tutorial</span></a></li>
<li>Greg Goebel's <a href="http://www.vectorsite.net/tsawk.html"><span style="text-decoration: underline;">awk tutorial</span></a></li>
<li><a href="http://teaching.software-carpentry.org/2013/01/16/1433/"><span style="text-decoration: underline;">Executing an awk command from R</span></a> to simplify exploratory data analysis, from Lex Nederbragt</li>
</ul><p>Tutorials for bash shell scripting</p><ul>
<li>A <a href="http://www.linuxconfig.org/bash-scripting-tutorial"><span style="text-decoration: underline;">tutorial</span></a> at linuxconfig.org</li>
<li>A <a href="http://www.hypexr.org/bash_tutorial.php"><span style="text-decoration: underline;">Getting Started With Bash</span></a> tutorial at hypexr.org</li>
<li>Mendel Cooper's <a href="http://tldp.org/LDP/abs/html/"><span style="text-decoration: underline;">Advanced Bash Shell-Scripting Guide</span></a></li>
</ul><p>Tutorials for sed, the command-line stream editor</p><ul>
<li>A <a href="http://www.panix.com/%7Eelflord/unix/sed.html"><span style="text-decoration: underline;">tutorial</span></a> at Rutgers</li>
<li>Peteris Krumins claims to have the <a href="http://www.catonmat.net/blog/worlds-best-introduction-to-sed/"><span style="text-decoration: underline;">World's Best Introduction to Sed</span></a>; take a look and judge for yourself.</li>
<li>Bruce Barnett's <a href="http://www.grymoire.com/Unix/Sed.html"><span style="text-decoration: underline;">sed tutorial</span></a>.</li>
</ul><h3>Links to other useful sites</h3><p>The <a href="http://seqanswers.com/"><span style="text-decoration: underline;">SEQanswers</span></a> online community has forums on several topics related to sequencing; the bioinformatics forum is the most active.</p><p>The SEQanswers <span style="text-decoration: underline;"><a href="http://seqanswers.com/wiki/Software">Software Wiki</a></span> is a list of software for analysis of sequencing data.</p><p><a href="http://biostar.stackexchange.com/">Biostar</a> is another online community for questions and answers on bioinformatics and computational genomics.</p><p>Information on file formats used by the University of California, Santa Cruz (UCSC) Genome Browser is on the <a href="http://genome.ucsc.edu/FAQ/FAQformat"><span style="text-decoration: underline;">FAQ list</span></a>.</p><p>A manual for the Integrated Genome Browser visualization tool is <a href="http://wiki.transvar.org/confluence/display/igbman/Home"><span style="text-decoration: underline;">here</span></a>.</p><p>Course materials for a short course entitled <a href="http://bioconductor.org/help/course-materials/2010/SeattleIntro/"><span style="text-decoration: underline;">Introduction to R and Bioconductor</span></a>, held in Seattle in December 2010</p><p><a href="http://great.stanford.edu/"><span style="text-decoration: underline;">Genomic Regions Enrichment of Annotations Tool</span></a> - a web service to test for over-representation of specific ontology categories among genes near ChIP-seq peaks</p><p><a href="http://www.animalgenome.org/bioinfo/resources/nextgensoft.html"><span style="text-decoration: underline;">Next-gen-seq software</span></a> - a list of software packages, both commercial and open-source, related to analysis of deep sequencing datasets</p><p><a href="http://www.cbcb.umd.edu/software/"><span style="text-decoration: underline;">Software</span></a> from the Center for Bioinformatics and Computational Biology, University of Maryland - many useful 
programs, all open-source</p><p><a href="http://bioinformatics.psb.ugent.be/plaza/"><span style="text-decoration: underline;">PLAZA</span></a>: a comparative genomics resource to study gene and genome evolution in plants; described by Proost et al, Plant Cell 21:3718, 2009 <a href="http://www.plantcell.org/content/21/12/3718.full"><span style="text-decoration: underline;">Full Text</span></a></p><p>The European Bioinformatics Institute provides the tools <a href="http://www.ebi.ac.uk/Tools/rcloud/"><span style="text-decoration: underline;">ArrayExpressHTS</span><span style="text-decoration: underline;"> and R-Cloud</span></a> for analysis of transcriptome data.</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>