sourmash is a k-mer analysis multitool, and we aim to provide stable, robust programmatic and command-line APIs for a variety of sequence comparisons. Some of our special sauce includes:
- FracMinHash sketching, which enables accurate comparisons (including ANI) between data sets of different sizes
- sourmash gather, a combinatorial k-mer approach for more accurate metagenomic profiling

Please see the sourmash publications for details.
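
To give a feel for why FracMinHash handles data sets of different sizes, here is a minimal, self-contained sketch of the idea in plain Python. It is an illustration only and does not use the sourmash API: the helper names (`frac_minhash`, `containment`, and so on) are hypothetical, BLAKE2b stands in for the hash function sourmash actually uses, and canonical (reverse-complement) k-mers are ignored for brevity. The key point is that every sequence is sampled with the same hash threshold, so the sketch of a fragment is always a subset of the sketch of the sequence it came from.

```python
import hashlib
import random

MAX_HASH = 2**64  # size of the 64-bit hash space used in this sketch


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (assumption: BLAKE2b as a stand-in hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=10):
    """Keep only hashes below MAX_HASH / scaled, i.e. roughly 1/scaled of the distinct k-mers."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def jaccard(a, b):
    """Estimated Jaccard similarity of two sketches."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Estimated fraction of sketch a that is contained in sketch b."""
    return len(a & b) / len(a) if a else 0.0


# Toy data: a random "genome" and a fragment taken from its start.
rng = random.Random(42)
genome = "".join(rng.choice("ACGT") for _ in range(20000))
fragment = genome[:5000]

big = frac_minhash(genome)
small = frac_minhash(fragment)

print("sketch sizes:", len(small), len(big))
print("jaccard:", jaccard(small, big))
print("containment of fragment in genome:", containment(small, big))
```

Because the sketch grows with the input instead of being fixed in size, containment of the fragment in the genome comes out as 1.0 even though the Jaccard similarity between the two is well below 1.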
The name is a riff on Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)
Maintainers: C. Titus Brown (@ctb), Luiz C. Irber, Jr (@luizirber), and N. Tessa Pierce-Ward (@bluegenes).
sourmash was initially developed by the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine, and now includes contributions from the global research and developer community.