BOL: CNIDARIA: fast, reference-free phylogenomic clustering

Shruti Paniwala

https://github.com/sauloal/cnidaria/wiki

Motivation: Identification of biological specimens is a major requirement for a range of applications. Reference-free methods analyse unprocessed sequencing data without relying on prior knowledge, but existing ones do not scale to arbitrarily large genomes and arbitrarily large phylogenetic distances.

Results: We present Cnidaria, a practical tool for clustering genomic and transcriptomic data with no limitation on genome size or phylogenetic distance. We simultaneously clustered 169 genomic and transcriptomic datasets from 4 kingdoms, achieving 100% accuracy at the supra-species level and 78% accuracy at the species level.

Availability and Implementation: Cnidaria is written in C++ and Python and is available at http://www.ab.wur.nl/cnidaria.

Contact: Saulo Aflitos - sauloal@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.
