Alternative content
from a Broad Institute site:
"N50 is a statistical measure of average length of a set of sequences. It is used widely in genomics, especially in reference to contig or supercontig lengths within a draft assembly.
Given a set of sequences of varying lengths, the N50 length is defined as the length N for which 50% of all bases in the sequences are in a sequence of length L < N. This can be found mathematically as follows: Take a list L of positive integers. Create another list L' , which is identical to L, except that every element n in L has been replaced with n copies of itself. Then the median of L' is the N50 of L. For example: If L = {2, 2, 2, 3, 3, 4, 8, 8}, then L' consists of six 2's, six 3's, four 4's, and sixteen 8's; the N50 of L is the median of L' , which is 6. "
The original site has a self-issued security certificate, but you can accept it and look here:
https://www.broad.harvard.edu/crd/wiki/index.php/N50
Hi Gudiya,
N50 can be defined as the largest entity E such that at least half of the total size of the entities is contained in entities larger than E. For example if we have a collection of contigs with sizes 7, 4, 3, 2, 2, 1, and 1 kb (total size = 20kbp), the N50 length is 4 because we can cover 10 kb with contigs bigger than 4kb.
L50 is the number of scaffolds that accounts for more than 50% of the genome assembly.
Cheers