BOL: Question: How to check if fragmented set of assembly is alright ?

Shruti Paniwala
2843 days ago

I assembled the genome after splitting the read data into two sets. Now that both sets are assembled, I am wondering what to do next. How can I validate the assemblies and check that everything went all right?

So far I have only used QUAST, and the results seem OK to me. Any other suggestions?

Note: I assembled the genome with MIRA

Answers

Here are the steps I suggest:

1) map the reads onto the contigs to determine the average coverage (and to see whether 25% of the reads is too low or actually OK given what the assembler, MIRA, requires);

2) plot a histogram of the per-base coverage to see whether it has two peaks (indicating that some alleles were not resolved) or a single peak; you can do this using SAMtools and BEDtools (specifically the genomeCoverageBed tool of BEDtools, also callable as bedtools genomecov);

3) try to scaffold the contigs (if it turns out that MIRA separated the haplotypes better than DDN but produced shorter contigs, you could try SSPACE-LongRead or Bambus2 to scaffold the MIRA assembly using the DDN assembly).
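
For steps 1 and 2, here is a rough Python sketch of how the coverage numbers could be summarised once the reads are mapped. It assumes the per-base depths were already written to a text file in the three-column layout that samtools depth -a prints (contig, position, depth). The function names, the smoothing window and the 5% threshold are my own illustrative choices, not part of any tool, and the peak detection is a crude heuristic (a shoulder on a rising slope can be reported as a peak), so treat the result as a hint and look at the plotted histogram as well.

```python
from collections import Counter


def coverage_histogram(depth_lines):
    """Count how many bases were seen at each depth.

    Expects lines with three whitespace-separated columns
    (contig, position, depth), as printed by `samtools depth -a`.
    """
    hist = Counter()
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        hist[int(fields[2])] += 1
    return hist


def mean_coverage(hist):
    """Average depth over all positions in the histogram."""
    total_bases = sum(hist.values())
    if total_bases == 0:
        return 0.0
    return sum(depth * n for depth, n in hist.items()) / total_bases


def find_peaks(hist, window=5, min_fraction=0.05):
    """Return depths that are local maxima of a smoothed histogram.

    window and min_fraction are arbitrary illustrative defaults:
    the histogram is smoothed with a moving average of `window` bins,
    and maxima lower than min_fraction * (tallest bin) are ignored.
    """
    if not hist:
        return []
    counts = [hist.get(d, 0) for d in range(max(hist) + 1)]
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo, hi = max(0, i - half), min(len(counts), i + half + 1)
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    threshold = min_fraction * max(smoothed)
    peaks = []
    for i, value in enumerate(smoothed):
        left = smoothed[i - 1] if i > 0 else -1.0
        right = smoothed[i + 1] if i + 1 < len(smoothed) else -1.0
        # Strict ">" on the left keeps only the first bin of a plateau.
        if value > left and value >= right and value >= threshold:
            peaks.append(i)
    return peaks
```

To use it, open the depth file and pass the file handle to coverage_histogram, then call mean_coverage and find_peaks on the result. A single peak near the mean suggests one coverage level across the assembly; two peaks, with one at roughly half or double the depth of the other, would be the two-peak pattern described in step 2.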

Jit 2842 days ago
