Run the miniasm assembler on nanopore reads

Miniasm is a very fast OLC-based (overlap-layout-consensus) de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically produced by minimap) as input and outputs an assembly graph in the GFA format. Unlike mainstream assemblers, miniasm has no consensus step: it simply concatenates pieces of read sequences to generate the final unitig sequences. The per-base error rate of the assembly is therefore similar to that of the raw input reads.

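Find repeats within individual reads. Convert the reads to FASTA, map all reads against each other, keep only the hits of a read onto itself, then list the reads that have a reverse-strand self-hit (inverted repeats):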

fq2fa ONT_A.fastq ONT_A.fasta 

minimap2 -xava-ont ONT_A.fasta ONT_A.fasta -t10 -X > AONT.paf 

awk '{if($1==$6){print}}' AONT.paf > AONTself.paf 

awk '$5=="-" {print $1}' AONTself.paf | sort -u > invertedrepeat.list
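
To see what the two awk filters keep, here is a minimal sketch on a toy PAF file. The read names, lengths and coordinates are made up for illustration; only the column layout (column 1 = query name, column 5 = strand, column 6 = target name) follows the PAF format.

```shell
# Toy PAF with the 12 mandatory tab-separated columns (all values are invented).
paf_tmp=$(mktemp -d)
{
  printf 'readA\t50000\t100\t20100\t-\treadA\t50000\t29900\t49900\t18000\t20000\t60\n'
  printf 'readA\t50000\t0\t9000\t+\treadB\t40000\t31000\t40000\t8000\t9000\t60\n'
  printf 'readC\t30000\t500\t5500\t+\treadC\t30000\t10500\t15500\t4500\t5000\t60\n'
} > "$paf_tmp/toy.paf"

# Keep hits where the query name (col 1) equals the target name (col 6).
awk '$1==$6' "$paf_tmp/toy.paf" > "$paf_tmp/toy_self.paf"

# Among the self-hits, a "-" strand (col 5) marks an inverted-repeat candidate.
awk '$5=="-" {print $1}' "$paf_tmp/toy_self.paf" | sort -u > "$paf_tmp/inverted.list"

cat "$paf_tmp/inverted.list"
```

On this toy input only readA is listed: readC has a self-hit on the forward strand (a direct repeat), and the readA/readB line is an ordinary overlap between two different reads.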

Generate a few palindrome and repeat plots (showing only repeats longer than 10, 20 and 30 kb):

minidot -f 5 -m 30000 AONTself.paf > AONTself30000.eps 
sed 's/_template_pass_FAH31515//' AONTself30000.eps > AONTself30000final.eps 

minidot -f 5 -m 20000 AONTself.paf > AONTself20000.eps 
sed 's/_template_pass_FAH31515//' AONTself20000.eps > AONTself20000final.eps 

minidot -f 5 -m 10000 AONTself.paf > AONTself10000.eps 
sed 's/_template_pass_FAH31515//' AONTself10000.eps > AONTself10000final.eps 
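
The three minidot/sed pairs differ only in the length threshold, so they could be wrapped in a loop. The sketch below assumes minidot is on the PATH; plot_self_hits is a hypothetical helper name, not part of miniasm, and the output file names follow the pattern used above.

```shell
# Sketch: one loop instead of three copy-pasted minidot/sed pairs.
# plot_self_hits is a hypothetical helper, not a miniasm command.
plot_self_hits() {
    paf=$1       # self-hit PAF file, e.g. AONTself.paf
    suffix=$2    # read-name suffix to strip from the plot labels
    prefix=${paf%.paf}
    for min_len in 30000 20000 10000; do
        # Plot only matches of at least min_len bases.
        minidot -f 5 -m "$min_len" "$paf" > "${prefix}${min_len}.eps"
        # Shorten the read labels in the PostScript output.
        sed "s/${suffix}//" "${prefix}${min_len}.eps" > "${prefix}${min_len}final.eps"
    done
}
```

Calling `plot_self_hits AONTself.paf _template_pass_FAH31515` should write the same six files as the commands above.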

Assemble with miniasm:

miniasm -f ONT_A.fasta AONT.paf > AONT.gfa 

grep '^S' AONT.gfa |awk '{print ">"$2"\n"$3}' > AONT_miniasm.fasta 
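
As an illustration of the GFA-to-FASTA step, here is a small sketch on a toy GFA file. The unitig names and sequences are invented; the only thing taken from the format is that segment lines start with S, followed by the segment name and its sequence.

```shell
# Toy GFA: unitig names and sequences are made up for illustration.
gfa_tmp=$(mktemp -d)
{
  printf 'S\tutg000001l\tACGTACGT\tLN:i:8\n'
  printf 'a\tutg000001l\t0\tread1:1-8\t+\t8\n'
  printf 'S\tutg000002c\tGGCCTTAA\tLN:i:8\n'
} > "$gfa_tmp/toy.gfa"

# Same conversion as the grep/awk pipeline, done in a single awk call:
# for every S line, print the name as a FASTA header and the sequence below it.
awk '$1=="S" {print ">"$2"\n"$3}' "$gfa_tmp/toy.gfa" > "$gfa_tmp/toy.fasta"

cat "$gfa_tmp/toy.fasta"
```

The resulting FASTA holds one record per S line; other line types, such as the lowercase a line in this toy file, are skipped.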

minimap2 -xasm10 AONT_miniasm.fasta AONT_miniasm.fasta -t1 -X > AONT_miniasm.paf 

awk '{if($1==$6){print}}' AONT_miniasm.paf > AONT_miniasm_self.paf 

minidot -f 5 -m 10000 AONT_miniasm_self.paf > AONT_miniasm_self10000.eps 

Enjoy the assembly!