github.com - npScarf (jsa.np.npscarf) is a program that scaffolds and completes draft genome assemblies in real time with Oxford Nanopore sequencing. The pipeline can run on a computing cluster as well as on a laptop computer for microbial datasets. It...
huttenhower.sph.harvard.edu - Lateral gene transfer (LGT) is an important mechanism for genome diversification in microbial communities, including the human microbiome. While methods exist to identify LGTs from sequenced isolate genomes, identifying LGTs from community...
sanger-pathogens.github.io - A tool to circularize genome assemblies. The algorithm and benchmarks are described in the Genome Biology manuscript.
Citation: "Circlator: automated circularization of genome assemblies using long sequencing reads", Hunt et al., Genome Biology...
ibest.github.io - ARC is a pipeline which facilitates iterative, reference-guided de novo assemblies with the intent of:
Reducing time in analysis and increasing accuracy of results by only considering those reads which should assemble...
github.com - SKESA is a de Bruijn graph-based de novo assembler designed for assembling reads of microbial genomes sequenced using Illumina. Comparison with SPAdes and MegaHit shows that SKESA produces assemblies that have high sequence quality and contiguity,...
amos.sourceforge.net - Genome sequencing remains an inexact science, and genome sequences can contain significant errors if they are not carefully examined. Hawkeye is our new visual analytics tool for genome assemblies, designed to aid in identifying and correcting...
crossmap.sourceforge.net - CrossMap is a program for converting genome coordinates between different assemblies (such as hg18 (NCBI36) <=> hg19 (GRCh37)). It supports commonly used file formats...
github.com - Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data. AfterQC can simply go through all fastq files in a folder and then output three folders: good, bad and QC folders, which contain good...
github.com - Other tools focus on getting data out of the fastq or fast5 files, which is slow and computationally intensive. The benefit of this approach is that it works on a single, small, .txt summary file. So it's a lot quicker than most other things out...
github.com - BFC is a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data. It is specifically designed for high-coverage whole-genome human data, though it also performs well for small genomes.
The BFC algorithm is a...