Instructions: set the read length/configuration and genome size, then select what you want to calculate.
Written by Stephen Turner, based on the Lander-Waterman formula, inspired by a similar calculator written by James Hadfield. Coverage is calculated as C = LN/G and the number of reads as N = CG/L, where C = coverage (X), L = read length (bp), G = haploid genome size (bp), and N = number of reads. Source code on GitHub.
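As a rough illustration of the second formula, here is a minimal Python sketch that solves N = CG/L for the number of reads. It is not the calculator's actual source code. The function name `reads_needed`, its parameter names, and the choice to round up to a whole read are assumptions made for this example, and the input values are illustrative.

```python
import math


def reads_needed(coverage: float, genome_size_bp: int, read_length_bp: int) -> int:
    """Return N = C * G / L, rounded up to a whole read.

    The name, signature, and rounding choice are illustrative assumptions.
    """
    if read_length_bp <= 0:
        raise ValueError("read_length_bp must be positive")
    return math.ceil(coverage * genome_size_bp / read_length_bp)


# Illustrative values: 20x coverage of a 5,000,000 bp genome with 100 bp reads.
print(reads_needed(20, 5_000_000, 100))  # 1000000
```

Rounding up is a design choice here: a fractional read cannot be sequenced, so the result is the smallest whole number of reads that reaches the requested coverage.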
Comments
Sequencing coverage is defined as the average number of reads that cover each base of the reference genome. Estimating the sequencing coverage is very important when you are simulating datasets. The coverage equation is C = LN / G, where L is the read length (bp), N is the number of reads, and G is the genome size (bp).
For example, if you have a genome of length 5 Mbp and you simulate 1,000,000 HiSeq 2000 reads (read length 100 bp), then you get a sequencing coverage of 20x, as follows.
C = LN / G = 100 * 1,000,000 / 5,000,000 = 20x
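The same arithmetic can be checked with a few lines of Python. This is only a sketch of C = LN / G; the function name `coverage` and its parameter names are assumptions made for this example.

```python
def coverage(read_length_bp: int, num_reads: int, genome_size_bp: int) -> float:
    """Return the average coverage C = L * N / G.

    The name and signature are illustrative assumptions.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return read_length_bp * num_reads / genome_size_bp


# The example above: 1,000,000 reads of 100 bp over a 5 Mbp genome.
c = coverage(read_length_bp=100, num_reads=1_000_000, genome_size_bp=5_000_000)
print(f"{c:g}x")  # 20x
```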