Genome scaffolding is a classic and challenging problem in bioinformatics. It refers to joining assembly contigs into chains (called scaffolds). The join between two contigs A and B is considered correct if:
Validating a scaffolding is also challenging. One of the main obstacles to an adequate scaffolding evaluation is genome repeats. The previous standard for evaluation (Hunt et al., Genome Biology, 2014) did not take repeats into account. This evaluation framework does.
The new evaluation framework considers the assignment of contigs in the output scaffolding to contigs in the reference scaffolding that maximizes the number of correct links.
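
To make the idea of an optimal assignment concrete, here is a minimal brute-force sketch. It is not the framework's actual implementation. The function name `best_correct_links`, the list-of-lists input format, and the correctness rule are all assumptions made for illustration: a link is counted as correct here when the two assigned reference copies are adjacent, in the same order, within one reference scaffold. Orientation and gap sizes are ignored. A repeat is modeled as a contig name that occurs more than once in the reference, and each reference copy can be used by at most one output occurrence.

```python
from itertools import product


def best_correct_links(reference, output):
    """Return the maximum number of correct links over all assignments.

    reference, output: lists of scaffolds, each scaffold a list of contig names.
    ASSUMPTION (illustrative rule only): a link between consecutive output
    contigs is correct if their assigned reference copies sit at positions
    i and i + 1 of the same reference scaffold.
    """
    # Every position (scaffold index, offset) at which each contig name occurs.
    positions = {}
    for s, scaffold in enumerate(reference):
        for i, name in enumerate(scaffold):
            positions.setdefault(name, []).append((s, i))

    # Flatten the output, remembering the output scaffold of each occurrence.
    occurrences = [(s, name) for s, scaffold in enumerate(output) for name in scaffold]
    candidates = [positions.get(name, []) for _, name in occurrences]
    if any(not c for c in candidates):
        raise ValueError("output contains a contig that is absent from the reference")

    best = 0
    # Exhaustive search: exponential in the number of repeated contigs.
    for assignment in product(*candidates):
        if len(set(assignment)) != len(assignment):
            continue  # two output occurrences claimed the same reference copy
        correct = 0
        for k in range(len(occurrences) - 1):
            if occurrences[k][0] != occurrences[k + 1][0]:
                continue  # consecutive in the flat list, but in different output scaffolds
            (s1, i1), (s2, i2) = assignment[k], assignment[k + 1]
            if s1 == s2 and i2 == i1 + 1:
                correct += 1
        best = max(best, correct)
    return best


# The repeat "R" occurs once in each reference scaffold.
reference = [["A", "R", "B"], ["C", "R", "D"]]
print(best_correct_links(reference, [["A", "R"], ["R", "D"]]))  # 2
print(best_correct_links(reference, [["A", "R", "D"]]))         # 1
```

In the first call, mapping each output copy of R to a different reference copy makes both links correct. In the second call, a single R can match only one reference copy, so at most one of its two links can be correct under the assumed rule. The exhaustive search is only practical for tiny inputs.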