kallisto
Transcript abundance quantification from RNA-seq data (uses pseudoalignment for rapid determination of read compatibility with targets)
Software (C++)
https://pachterlab.github.io/kallisto/
Sailfish
Estimation of isoform abundances from reference sequences and RNA-seq data (k-mer based)
Software (C++)
http://www.cs.cmu.edu/~ckingsf/software/sailfish/
Salmon
Quantification of the expression of transcripts using RNA-seq data (uses k-mers)
https://combine-lab.github.io/salmon/
RNA-Skim
RNA-seq quantification at transcript-level (partitions the transcriptome into disjoint transcript clusters; uses sig-mers, a special type of k-mers)
Software (C++)
Variant calling
ChimeRScope
Fusion transcript prediction using gene k-mer profiles of RNA-seq paired-end reads
Software (Java)
https://github.com/ChimeRScope/ChimeRScope/wiki
FastGT
Genotyping of known SNV/SNP variants directly from raw NGS sequence reads by counting unique k-mers
Software (C)
https://github.com/bioinfo-ut/GenomeTester4/
Phy-Mer
Reference-independent mitochondrial haplogroup classifier from NGS data (k-mer based)
Software (Python)
https://github.com/danielnavarrogomez/phy-mer
LAVA
Genotyping of known SNPs (dbSNP and Affymetrix's Genome-Wide Human SNP Array) from raw NGS reads (k-mer based)
Software (C)
MICADo
Detection of mutations in targeted third-generation NGS data (can distinguish patient-specific mutations; algorithm uses k-mers and is based on colored de Bruijn graphs)
Software (Python)
General mapper
Minimap
Lightweight and fast read mapper and read overlap detector (uses the concept of “minimizers”, a special type of k-mers)
Software (C)
https://github.com/lh3/minimap
Assembly
De novo genome assembly
MHAP
Produces highly continuous assemblies (fully resolved chromosome arms) from long, noisy third-generation reads (10 kbp) using the MinHash dimensionality-reduction technique
Software (Java)
Miniasm
Assembler of long noisy reads (SMRT, ONT) using the Overlap-Layout-Consensus (OLC) approach without the need for an error correction stage (uses minimap)
Software (C)
https://github.com/lh3/miniasm
LINKS
Scaffolding of genome assemblies with error-containing long sequences (e.g., ONT or PacBio reads, draft genomes)
Software (Perl)
https://github.com/warrenlr/LINKS/
Read clustering
afcluster
Clustering of reads from different genes and different species based on k-mer counts
Software (C++)
https://github.com/luscinius/afcluster
QCluster
Clustering of reads with alignment-free measures (k-mer based) and quality values
Software (C++)
http://www.dei.unipd.it/~ciompin/main/qcluster.html
Read error correction
Lighter
Correction of sequencing errors in raw, whole genome sequencing reads (k-mer based)
Software (C++)
https://github.com/mourisl/Lighter
QuorUM
Error corrector for Illumina reads using k-mers
Software (C++)
https://github.com/gmarcais/Quorum
Trowel
Software (C++)
https://sourceforge.net/projects/trowel-ec/
Metagenomics
Assembly-free phylogenomics
AAF
Phylogeny reconstruction directly from unassembled raw sequence data from whole genome sequencing projects; provides bootstrap support to assess uncertainty in the tree topology (k-mer based)
Software (Python)
https://github.com/fanhuan/AAF
kSNP v3
Reference-free SNP identification and estimation of phylogenetic trees using SNPs (based on k-mer analysis)
Software (C)
https://sourceforge.net/projects/ksnp/files/
NGS-MC
Phylogeny of species based on NGS reads using the alignment-free sequence dissimilarity measures d2* and d2S under different Markov chain models (using k-words)
R package
http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html
Species identification/taxonomic profiling
CLARK
Taxonomic classification of metagenomic reads to known bacterial genomes using k-mer search and LCA assignment
Software (C++)
FOCUS
Reports organisms present in metagenomic samples and profiles their abundances (uses composition-based approach and non-negative least squares for prediction)
Web service; Software (Python)
http://edwards.sdsu.edu/FOCUS/
GSM
Estimation of abundances of microbial genomes in metagenomic samples (k-mer based)
Software (Go)
https://github.com/pdtrang/GSM
Mash
Species identification using assembled or unassembled Illumina, PacBio, and ONT data (based on MinHash dimensionality-reduction technique)
Software (C++)
Kraken
Taxonomic assignment in metagenome analysis by exact k-mer search; LCA assignment of short reads based on a comprehensive sequence database
Software (C++)
https://ccb.jhu.edu/software/kraken/
LMAT
Assignment of taxonomic labels to reads by k-mer searches in a precomputed database
Software (C++/Python)
https://sourceforge.net/projects/lmat/
stringMLST
k-mer-based tool for MLST directly from genome sequencing reads
Software (Python)
http://jordan.biology.gatech.edu/page/software/stringMLST
Taxonomer
k-mer-based ultrafast metagenomics tool for assigning taxonomy to sequencing reads from clinical and environmental samples
Web service
Other
d2-tools
Word-based (k-tuple) comparison (pairwise dissimilarity matrix using d2S measure) of metatranscriptomic samples from NGS reads
Software (Python/R)
https://code.google.com/p/d2-tools/
VirHostMatcher
Prediction of hosts from metagenomic viral sequences based on oligonucleotide frequency (ONF) using various distance measures (e.g., d2)
Software (C++)
https://github.com/jessieren/VirHostMatcher
MetaFast
Calculation of statistics for metagenome sequences and of the distances between them, based on assembly using de Bruijn graphs and the Bray–Curtis dissimilarity measure
Software (Java)