COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly
COPE (Connecting Overlapped Pair-End reads) is an efficient tool that connects overlapping pair-end reads using k-mer frequencies. We evaluated our tool on 30× simulated pair-end reads from Arabidopsis thaliana with 1% base error. COPE connected over 99% of reads with 98.8% accuracy, whic...
Tags: COPE, accurate, k-mer, pair-end, reads, connection, tool, genome, assembly
2333 days ago
Musket: a multistage k-mer spectrum based corrector
Musket is a well-established, leading error correction algorithm for next-generation sequencing reads, targeting Illumina sequencing. This corrector employs the k-mer spectrum approach and introduces three correction techniques in a multistage workflow. Our performance evaluation results, in ...
Tags: Musket, multistage, k-mer, spectrum, corrector
2333 days ago
kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity
kWIP (the k-mer Weighted Inner Product) implements a de novo, alignment-free measure of genetic dissimilarity between samples that operates on raw sequencing reads. It can calculate the genetic dissimilarity between samples without any reference genome, and without assembling one. ...
Tags: kWIP, k-mer, weighted, inner, product, de novo, estimator, genetic, similarity
2159 days ago
SKESA: strategic k-mer extension for scrupulous assemblies
SKESA is a de Bruijn graph-based de novo assembler designed for assembling reads of microbial genomes sequenced using Illumina. Comparison with SPAdes and MegaHit shows that SKESA produces assemblies that have high sequence quality and contiguity, handles low-level contamination in reads, is fast,...
Tags: SKESA, strategic, k-mer, extension, scrupulous, assemblies, genome, assembly
1990 days ago
Musket - a multistage k-mer spectrum based corrector
Musket is a well-established, leading error correction algorithm for next-generation sequencing reads, targeting Illumina sequencing. This corrector employs the k-mer spectrum approach and introduces three correction techniques in a multistage workflow. Our performance evaluation results, in ...
Tags: Musket, multistage, k-mer, spectrum, corrector
1543 days ago
Tags: k-mer, counting, learn, teach, skill, genome, reads
982 days ago
k-mers tutorial - classification and taxonomy
DNA k-mers underlie much of our assembly work, and we (along with many others!) have spent a lot of time thinking about how to store k-mer graphs efficiently, discard redundant data, and count them efficiently. More recently, we've been enthused about using k-mer based simila...
Tags: kmer, k-mer, taxonomy, classification, tree, plot, database, similarity, comparision
974 days ago
Merfin: improved variant filtering, assembly evaluation and polishing via k-mer validation
Merfin is a k-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected k-mer multiplicity in the reads, independently of the quality of the read alignment and variant caller’s interna...
Tags: Merfin, improved, variant, filtering, assembly, evaluation, polishing, k-mer, validation
754 days ago
Tags: Tallymer, method, compute, K-mer, frequencies, application, annotate, large, repetitive, plant, genomes
2262 days ago
KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies
KAT is a suite of tools that analyse Jellyfish hashes or sequence files (FASTA or FASTQ) using k-mer counts. The following tools are currently available in KAT: hist: Create a histogram of k-mer occurrences from a sequence file. Adds metadata in output for easy plotting. gcp: K-mer GC Pr...
Tags: KAT, K-mer, analysis, toolkit, quality, control, NGS, datasets, genome, assemblies
1973 days ago