Question: How to estimate heterozygosity?

2207 days ago

I am working on a new genome and am looking for methods to estimate its heterozygosity. Any tool suggestions or other help are welcome.


Many tools can do this, but the quickest is kmercountexact.sh from BBTools.

To use approximate counts:

khist.sh in=reads.fq khist=khist.txt peaks=peaks.txt

To use exact counts (and thus potentially more memory):

kmercountexact.sh in=reads.fq khist=khist.txt peaks=peaks.txt

The header of the peaks file contains estimates of genome size and heterozygosity. For diploid organisms you can also add the flag "ploidy=2", so that the tool does not need to autodetect the ploidy (and potentially get it wrong).
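
As a rough sketch, the two steps above (running with the ploidy fixed, then reading the estimates) could be wrapped like this. The function names run_kmer_peaks and peaks_header are made up for illustration, and the assumption that header lines in peaks.txt start with '#' should be checked against your own output.

```shell
# Hypothetical helper: run the exact k-mer counter with ploidy fixed to 2.
# The arguments are the ones quoted in the answer above; reads file is $1.
run_kmer_peaks() {
    kmercountexact.sh in="$1" khist=khist.txt peaks=peaks.txt ploidy=2
}

# Hypothetical helper: print only the header of a peaks file.
# Assumption: header lines begin with '#'; verify this on your own file.
peaks_header() {
    grep '^#' "$1"
}
```

After something like run_kmer_peaks reads.fq, calling peaks_header peaks.txt should show only the summary lines, which is where the genome size and heterozygosity estimates are reported.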


I just came across this paper on arXiv: "Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects".


The accompanying tool, GCE, is available at ftp://ftp.genomics.org.cn/pub/gce/