pyScaf orders contigs from genome assemblies utilising several types of information:
In reference-based mode, pyScaf uses synteny to the genome of closely related species in order to order contigs and estimate distances between adjacent contigs.
Contigs are aligned globally (end-to-end) onto reference chromosomes, ignoring:
In preliminary tests, pyScaf performed superbly on simulated heterozygous genomes based on C. parapsilosis (13 Mb; CANPA) and A. thaliana (119 Mb; ARATH) chromosomes, reconstructing correctly all chromosomes always for CANPA and nearly always for ARATH (Figures in dropbox, CANPA table, ARATH table).
Runs took ~0.5 min for CANPA on
4 CPUs and ~2 min for ARATH on