github.com - Hagfish is a tool for data analysis of Next Generation Sequencing (NGS) experiments. Hagfish builds on the concept of coverage plots and aims to assist (among other things) in quality control of de novo genome assembly or...
We are a computational biology lab that develops novel methods for analysis of DNA and RNA sequences. Our research includes software for aligning and assembling RNA-seq data, whole-genome assembly, and microbiome analysis. We work closely with...
www.jcvi.org - CABOG (Celera Assembler with Best Overlap Graph) is scientific software for DNA research. CABOG has been a critical component of many genome sequencing projects. CABOG operates on small genomes such as bacterial as well as large genomes such as...
www.broadinstitute.org - The Genome Assembly Evaluation Metrics and Reporting (GAEMR) package is an assembly analysis framework composed of a number of integrated modules. These modules can be executed as a single program to generate a complete...
broadinstitute.github.io - Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the Hts-specs repository. See especially the SAM specification and the VCF...
www.ncbi.nlm.nih.gov - Background. Next-generation sequencing technologies are now producing multiple times the genome size in total reads from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome...
github.com - SGA is a de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.
More at https://github.com/jts/sga
SGA...
www.topcoder.com - Learning greedy algorithms for biologists.
https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/
This webpage is also useful for the...
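To show why greedy algorithms matter for assembly, here is a minimal sketch of greedy overlap merging: repeatedly join the two reads with the longest suffix-prefix overlap. It is not taken from the linked tutorial or from any assembler listed here; the function names `overlap` and `greedy_assemble` and the toy reads are made up for illustration, and the sketch ignores sequencing errors, reverse complements, and reads contained inside other reads.

```python
def overlap(a: str, b: str, min_len: int = 1) -> int:
    """Return the length of the longest suffix of a that is a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 1):
    """Greedily merge the pair of reads with the largest overlap until none is left.

    Returns the list of remaining sequences (contigs). Hypothetical helper,
    quadratic in the number of reads per merge, so only suitable for toy input.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap(a, b, min_len)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    print(greedy_assemble(["ATGCG", "GCGTA", "GTACC"]))
```

The greedy choice is locally optimal only: on repetitive sequence it can merge the wrong pair, which is one reason real assemblers use overlap or string graphs instead.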
Genome assemblers generally take a file of short sequence reads and a file of quality values as input. Since the quality-value file for the high throughput short reads is usually highly memory-intensive, only a few assemblers, best suited for...
neufeldserver.uwaterloo.ca - PANDASEQ assembles paired-end Illumina reads into sequences, trying to correct for errors and uncalled bases. The assembler reads two files in FASTQ format with quality information. If amplification primers were used (e.g., to isolate a variable...
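The general idea of joining a read pair can be sketched as follows: reverse-complement the reverse read, then look for the longest suffix of the forward read that matches its start. This is an illustration only, not PANDASEQ's actual method, which according to the description above also uses the FASTQ quality information. The names `reverse_complement` and `merge_pair`, the parameters `min_overlap` and `max_mismatch`, and the example sequence are all assumptions.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercase ACGTN only)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str, min_overlap: int = 10, max_mismatch: int = 0):
    """Merge a read pair into one sequence, or return None if no overlap is found.

    Tries the longest candidate overlap first and accepts the first one with
    at most max_mismatch differing bases. Quality scores are ignored here.
    """
    rc = reverse_complement(reverse)
    for k in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        mismatches = sum(1 for x, y in zip(forward[-k:], rc[:k]) if x != y)
        if mismatches <= max_mismatch:
            # Keep the forward read and append the non-overlapping tail.
            return forward + rc[k:]
    return None


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAC"  # made-up fragment for the example
    fwd = fragment[:10]
    rev = reverse_complement(fragment[6:])
    print(merge_pair(fwd, rev, min_overlap=4))
```

A low `min_overlap` makes chance matches more likely, so real tools score candidate overlaps rather than accepting the first exact one.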