KAT is a suite of tools that analyse Jellyfish hashes or sequence files (FASTA or FASTQ) using k-mer counts.
In addition, KAT contains a Python script for analysing the mathematical distributions present in the k-mer spectra in order to determine how much content is present in each peak.
This README only contains some brief details of how to install and use KAT. For more extensive documentation please visit: https://kat.readthedocs.org/en/latest/
https://academic.oup.com/bioinformatics/article/33/4/574/2664339
Comments
Using KAT again (you will need the modules KAT/2.1.1 and gnuplot/4.6.5), we can plot the k-mer content of the assembly compared to the k-mer content of the read set. The first thing we need to do is to combine the reads into a single file. For gzipped files this can be done with zcat, and for unzipped files with cat.
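The combining step can be sketched as a small shell function. This is only an illustration: the function name combine_reads and the file names in the usage line are hypothetical, and it assumes that compressed inputs end in .gz.

```shell
# combine_reads OUT IN1 [IN2 ...]
# Concatenate read files into OUT, decompressing any input ending in .gz.
# (Hypothetical helper; names are illustrative, not part of KAT.)
combine_reads() {
    out="$1"
    shift
    : > "$out"                           # start from an empty output file
    for f in "$@"; do
        case "$f" in
            *.gz) zcat "$f" >> "$out" ;; # gzipped input: decompress to stdout
            *)    cat "$f" >> "$out" ;;  # uncompressed input: copy as-is
        esac
    done
}
```

For example, combine_reads combined.fastq reads_R1.fastq.gz reads_R2.fastq.gz would write both read files, decompressed, into combined.fastq. When all inputs are gzipped, a single zcat with several file arguments redirected to one output file does the same job.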
We will now use kat comp to create a k-mer content comparison. Use kat comp --help to get help for the program, then create a comparison between the combined reads and the assembly. Make sure that you use the flags for canonical hashes for both sequence 1 and sequence 2, as well as 8 threads.
Finally, clean up your working directory by removing the combined reads file and re-zipping any unzipped files. Then download the output files to your computer using scp and look at the PNG file that was produced.
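One possible shape for the comparison command is sketched below. The option letters are an assumption based on KAT 2.1.x (-t for threads, -C and -D for canonical hashes of inputs 1 and 2, -o for the output prefix) and should be checked against kat comp --help, since they differ between KAT versions. The helper name kat_comp_cmd, the default prefix reads_vs_assembly and the file names are hypothetical. The function only prints the command, so it can be inspected before anything is run.

```shell
# kat_comp_cmd READS ASSEMBLY [PREFIX]
# Print (do not run) a kat comp command line for review.
# ASSUMPTION: -t/-C/-D/-o are the KAT 2.1.x option letters; verify them
# with `kat comp --help` for the installed version before running.
kat_comp_cmd() {
    reads="$1"
    assembly="$2"
    prefix="${3:-reads_vs_assembly}"   # hypothetical default output prefix
    printf 'kat comp -t 8 -C -D -o %s %s %s\n' "$prefix" "$reads" "$assembly"
}
```

Calling kat_comp_cmd combined.fastq assembly.fasta prints a single command line. Once the options match the help output, that line can be pasted at the prompt.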