To decide which strategy should be our “preferred” genome assembly approach based on data rather than my gut-feeling about the “best assembly” I decided to do some testing with a known “true” reference E Coli K12 MG1655
github.com - wgd is a easy to use command-line tool for KS distribution construction named wgd. The wgd suite provides commonly used KS and colinearity analysis workflows together with tools for modeling and visualization, rendering these...
bioinf.shenwei.me - FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q file include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only...
bix.ucsd.edu - BLASR (Basic Local Alignment with Successive Refinement) for mapping Single Molecule Sequencing (SMS) reads that are thousands to tens of thousands of bases long with divergence between the read and genome dominated by insertion and deletion...
Integrated solutions * CLCbio Genomics Workbench - de novo and reference assembly of Sanger, Roche FLX, Illumina, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection,...
github.com - Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION). The software is currently alpha level, feel free to use and report issues encountered.
Canu is...