bioconductor.org - This package estimates tumor purity, copy number, and loss of heterozygosity (LOH), and classifies single nucleotide variants (SNVs) by somatic status and clonality. PureCN is designed for targeted short read sequencing data, integrates well with...
www.broadinstitute.org - The presentations below were filmed during the March 2015 GATK Workshop, part of the BroadE Workshop series. At the time of this workshop, the current version of Broad’s Genome Analysis Toolkit (GATK) was version...
github.com - Snippy finds SNPs between a haploid reference genome and your NGS sequence reads. It will find both substitutions (SNPs) and insertions/deletions (indels). It will use as many CPUs as you can give it on a single computer (tested to 64 cores). It is...
github.com - Merfin, a k-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected k-mer multiplicity in the reads, independently of the quality of the...
github.com - BFC is a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data. It is specifically designed for high-coverage whole-genome human data, though it also performs well for small genomes. The BFC algorithm is a...
code.google.com - d2Tools is a toolbox for counting K-tuple frequencies in sequencing datasets and then calculating the pairwise dissimilarity matrix between samples with d2-style measures (d2/d2*/d2S, representing d2/d2Star/d2shepp, respectively)...
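A minimal sketch of the two steps named in the d2Tools entry above: K-tuple counting, then a pairwise dissimilarity matrix. This is not d2Tools code. The function names are hypothetical, and the formula is assumed to be the plain d2 form, 0.5 * (1 - D2 / (|X| |Y|)), where D2 is the inner product of the two K-tuple count vectors. The d2* and d2S variants need a background sequence model and are left out.

```python
from collections import Counter
from math import sqrt


def count_ktuples(seqs, k):
    """Count every overlapping k-tuple across a list of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def d2_dissimilarity(x, y):
    """Assumed plain d2 form: 0.5 * (1 - D2 / (|X| * |Y|))."""
    # A Counter returns 0 for a k-tuple it has never seen.
    dot = sum(c * y[w] for w, c in x.items())
    norm_x = sqrt(sum(c * c for c in x.values()))
    norm_y = sqrt(sum(c * c for c in y.values()))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a sample has no k-tuples of the requested length")
    return 0.5 * (1.0 - dot / (norm_x * norm_y))


def pairwise_matrix(samples, k):
    """Return (names, matrix) for a dict mapping sample name -> list of reads."""
    names = sorted(samples)
    profiles = {name: count_ktuples(samples[name], k) for name in names}
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = d2_dissimilarity(profiles[names[i]], profiles[names[j]])
            matrix[i][j] = matrix[j][i] = d  # the measure is symmetric
    return names, matrix
```

With counts as non-negative integers, the value lies between 0 (proportional K-tuple profiles) and 0.5 (no shared K-tuples).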
www.ensembl.org - For each variant that is mapped to the reference genome, we identify all overlapping Ensembl transcripts. We then use a rule-based approach to predict the effects that each allele of the variant may have on each transcript. The set of consequence...
bioinfo.ut.ee - FastGT is a program package for whole-genome genotyping of genome variants directly from raw sequencing reads. It is written in C and runs on Linux. FastGT uses a list of variant-specific k-mer pairs that are unique in the human genome, counts the...
https://genome10k.soe.ucsc.edu
The Genome 10K project aims to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species, approximately one for every vertebrate genus. The trajectory of cost reduction...
genome.sph.umich.edu - vt is a variant tool set that discovers short variants from Next Generation Sequencing data.
https://genome.sph.umich.edu/wiki/Vt
https://github.com/atks/vt