BOL: Question: How to estimate the heterozygosity?

BioJoker
2309 days ago

I am working on a new genome and am looking for methods to estimate its heterozygosity. Any tool suggestions or other help are welcome.

Answers

You can try many tools, but the quickest option is BBTools (khist.sh or kmercountexact.sh).

To use approximate counts:

khist.sh in=reads.fq khist=khist.txt peaks=peaks.txt

To use exact counts (and thus potentially more memory):

kmercountexact.sh in=reads.fq khist=khist.txt peaks=peaks.txt

The peaks file header contains estimates of genome size and heterozygosity. You can also add the flag "ploidy=2" for diploid organisms, so that the tool does not need to autodetect the ploidy (and thus potentially make a mistake).
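To pull just the header estimates out of the peaks file, something like the following may help. It is a minimal sketch, not part of BBTools: `peaks_header` is a hypothetical helper name, and it assumes that each header line starts with `#` and holds exactly one key and one value separated by whitespace. Check your own peaks.txt before relying on it.

```shell
# Hypothetical helper (not part of BBTools): print the peaks-file header
# as key=value pairs.
# Assumption: header lines look like "#key<whitespace>value". Lines with
# more than two fields (e.g. a column-title row) and data rows are skipped.
peaks_header() {
    awk '/^#/ && NF == 2 { sub(/^#/, "", $1); print $1 "=" $2 }' "$1"
}
```

For example, after running `kmercountexact.sh in=reads.fq khist=khist.txt peaks=peaks.txt ploidy=2`, calling `peaks_header peaks.txt` should list the header estimates one per line.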

Rahul Nayak 2309 days ago

I just came across this paper on arXiv: "Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects"

https://arxiv.org/abs/1308.2012

The software described in the paper (GCE) is available at ftp://ftp.genomics.org.cn/pub/gce/

Neel 2308 days ago
