github.com - GFA (Graphical Fragment Assembly) is an emerging standard format for representing sequence graphs. Although it was originally conceived as a format for sequence assembly (hence the name), and this remains its core application, it is more general,...
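As a rough illustration of what a sequence graph in GFA looks like, the sketch below reads segment (`S`) and link (`L`) records from a tiny GFA 1 text. The function name `parse_gfa1` and the two-segment example graph are made up for illustration and are not taken from the linked page. Optional tags and all other record types are ignored.

```python
def parse_gfa1(text):
    """Collect S (segment) and L (link) records from tab-separated GFA 1 text.

    Hypothetical helper for illustration only; it ignores optional tags
    and every record type other than S and L.
    """
    segments = {}  # segment name -> sequence
    links = []     # (from, from_orient, to, to_orient, overlap CIGAR)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            links.append(tuple(fields[1:6]))
    return segments, links


# Made-up example: two segments whose ends overlap by two bases ("GT").
EXAMPLE = (
    "H\tVN:Z:1.0\n"
    "S\ts1\tACGT\n"
    "S\ts2\tGTAA\n"
    "L\ts1\t+\ts2\t+\t2M\n"
)

segments, links = parse_gfa1(EXAMPLE)
print(segments)
print(links)
```

Each link names two segments, their orientations, and the overlap between them, so the same records can describe graphs that do not come from an assembler.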
csb5.github.io - LoFreq* (i.e. LoFreq version 2) is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing (e.g. mapping or...
github.com - MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, MacOS, and (as beta version, via cygwin)...
www.science.org - The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
compbio.case.edu - Seal is a comprehensive sequencing simulation and alignment tool evaluation suite. This software (implemented in Java) provides several utilities that can be used to evaluate alignment algorithms, including:
Reading a pre-existing reference...
bioinf.spbau.ru - SPAdes – St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies. This manual will help you to install and run SPAdes. SPAdes version 3.7.1 was released under GPLv2 on March 8,...
www.homolog.us - These tutorials are written for the hundreds of bioinformaticians trying to cope with large volumes of next-generation sequencing (NGS) data. NGS technologies brought a dramatic shift in the world of sequencing. Merely five years ago, genome sequencing...
www.jcvi.org - CABOG (Celera Assembler with Best Overlap Graph) is scientific software for DNA research. CABOG has been a critical component of many genome sequencing projects. CABOG operates on small genomes, such as bacterial genomes, as well as large genomes such as...
disco.omicsbio.org - DISCO is a multithreaded, multiprocess, distributed-memory overlap-layout-consensus (OLC) metagenome assembler. DISCO was developed as a scalable assembler to assemble large metagenomes from billions of Illumina sequencing reads of complex...
github.com - MARVEL consists of a set of tools that facilitate the overlapping, patching, correction and assembly of noisy (and less noisy) long reads.
The assembly process can be summarized as follows:
overlap
patch reads
overlap...