Genome assembly stats plotting

https://github.com/rjchallis/assembly-stats

A de novo genome assembly can be summarised by a number of metrics, including:

  • Overall assembly length
  • Number of scaffolds/contigs
  • Length of longest scaffold/contig
  • Scaffold/contig N50 and N90
  • Assembly base composition, in particular percentage GC and percentage Ns
  • CEGMA completeness
  • Scaffold/contig length/count distribution
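Several of the metrics listed above can be computed directly from the assembled sequences. The following is a minimal sketch, not code from assembly-stats: the function names `nx` and `assembly_metrics` are hypothetical, and it assumes GC percentage is taken over non-N bases, which is one common convention among several.

```python
from typing import Dict, List


def nx(lengths: List[int], percent: int) -> int:
    """Length of the shortest sequence in the smallest set of longest
    sequences that together cover `percent` of the total length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length
    return 0  # empty input


def assembly_metrics(sequences: List[str]) -> Dict[str, float]:
    """Summarise a list of scaffold/contig sequences."""
    lengths = [len(s) for s in sequences]
    total = sum(lengths)
    joined = "".join(sequences).upper()
    gc = joined.count("G") + joined.count("C")
    n = joined.count("N")
    non_n = total - n  # assumption: GC% is reported over non-N bases
    return {
        "total_length": total,
        "count": len(lengths),
        "longest": max(lengths, default=0),
        "n50": nx(lengths, 50),
        "n90": nx(lengths, 90),
        "gc_percent": 100 * gc / non_n if non_n else 0.0,
        "n_percent": 100 * n / total if total else 0.0,
    }
```

For scaffold lengths of 100, 80, 60, 40 and 20 (a span of 300), `nx(lengths, 50)` gives 80 because the two longest scaffolds are the first to cover half of the span, and `nx(lengths, 90)` gives 40.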

assembly-stats supports two widely used presentations of these values, tables and cumulative length plots, and introduces an additional circular plot that summarises the most commonly used assembly metrics in a single visualisation. Each of these presentations is generated using JavaScript from a common JSON data structure, which allows toggling between alternative views. Each can be applied to a single assembly or to multiple assemblies, allowing direct comparison of alternative assemblies.
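To illustrate how one data structure can feed more than one view, the sketch below derives both a cumulative length curve and a table row from the same JSON record. It is written in Python for brevity rather than in the JavaScript the tool uses, and the keys `name` and `scaffold_lengths` are hypothetical, not the schema that assembly-stats expects.

```python
import json
from itertools import accumulate
from typing import List, Tuple


def cumulative_curve(lengths: List[int]) -> List[Tuple[int, int]]:
    """Return (rank, cumulative length) points, longest sequence first."""
    ordered = sorted(lengths, reverse=True)
    return list(enumerate(accumulate(ordered), start=1))


# Hypothetical record: these keys are illustrative, not the assembly-stats schema.
record = json.dumps({"name": "example", "scaffold_lengths": [20, 100, 60, 40, 80]})

data = json.loads(record)

# View 1: points for a cumulative length plot.
curve = cumulative_curve(data["scaffold_lengths"])

# View 2: a row for a tabular summary, taken from the same data.
table_row = {"assembly": data["name"], "span": curve[-1][1], "count": len(curve)}
```

Because both views read from the same parsed record, switching between them needs no recomputation of the underlying lengths, only a different rendering of them.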

Tabular presentation allows direct comparison of exact values between assemblies. Its limitations are that distributions must be omitted and that ratios of values spanning several orders of magnitude are hard to interpret.