Here’s the quick and dirty of what was done:
The first minimap2 run uses a pre-built set of defaults (the ava-pb preset in the code below) for all-vs-all overlapping of PacBio reads. Minimap only accepts two FASTQ files, and you need to map your FASTQ file against itself. So, if you have multiple FASTQ sequencing files, you have to concatenate them into a single file prior to running minimap.
minimap2 -x ava-pb -t 23 \
20170911_oly_pacbio_cat.fastq \
20170911_oly_pacbio_cat.fastq \
> 20170911_minimap2_pacbio_oly.paf
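If you haven't concatenated your FASTQ files yet, here's a rough sketch of that step. The file names (run1.fastq, run2.fastq, combined.fastq) and the temp directory are made up for illustration; swap in your own files.

```shell
# Hypothetical example files in a scratch directory; use your own FASTQ files instead.
workdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$workdir/run1.fastq"
printf '@read2\nTTGA\n+\nIIII\n' > "$workdir/run2.fastq"

# Concatenate every per-run FASTQ file into a single file.
# The output name must not match the glob, or it would be read as input.
cat "$workdir"/run*.fastq > "$workdir/combined.fastq"

# Sanity check: an unwrapped FASTQ record is 4 lines, so reads = lines / 4.
n_reads=$(( $(wc -l < "$workdir/combined.fastq") / 4 ))
echo "reads in combined file: $n_reads"
```

Comparing the read count of the combined file against the sum of the individual files is a cheap way to catch a truncated or double-counted input before a long minimap2 run.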
Miniasm uses your concatenated FASTQ file and the PAF file output from the minimap2 step. The code below is taken from the example provided in the miniasm documentation; there are other options available.
miniasm \
-f \
/home/data/20170911_oly_pacbio_cat.fastq /home/data/20170911_minimap2_pacbio_oly.paf > /home/data/20170918_oly_pacbio_miniasm_reads.gfa
The FASTA file is needed to re-run minimap in Step 4 below.
awk '$1 ~/S/ {print ">"$2"\n"$3}' 20170918_oly_pacbio_miniasm_reads.gfa > 20170918_oly_pacbio_miniasm_reads.fasta
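To see what that awk one-liner is doing, here's a small sketch on a toy GFA file. The toy file and its sequence names are invented for illustration. It also uses a slightly stricter test ($1 == "S") than the command above, which is my own substitution: it only keeps lines whose first field is exactly S (the segment lines that hold the contig sequences).

```shell
# Toy GFA (hypothetical contents): one header line, two segments, one link.
workdir=$(mktemp -d)
printf 'H\tVN:Z:1.0\n'                         >  "$workdir/toy.gfa"
printf 'S\tutg000001l\tACGTACGT\n'             >> "$workdir/toy.gfa"
printf 'S\tutg000002l\tTTGACC\n'               >> "$workdir/toy.gfa"
printf 'L\tutg000001l\t+\tutg000002l\t-\t0M\n' >> "$workdir/toy.gfa"

# For each segment line, print ">name" and then the sequence on the next line.
awk -F'\t' '$1 == "S" {print ">"$2"\n"$3}' "$workdir/toy.gfa" > "$workdir/toy.fasta"

# Count the FASTA records that were written.
n_seqs=$(grep -c '^>' "$workdir/toy.fasta")
echo "sequences written: $n_seqs"
```

The header (H) and link (L) lines are skipped, so the FASTA ends up with one record per segment.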
Using the default settings, this maps the FASTQ reads back to the contigs (the PAF file) created in the first step. These mappings are required for Racon assembly (Step 5).
minimap2 \
-t 23 \
20170918_oly_pacbio_miniasm_reads.fasta 20170905_minimap2_pacibio_oly.paf > 20170918_minimap2_mapping_fasta_oly_pacbio.paf
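The mappings come out in PAF, a tab-separated format with 12 mandatory columns (query name, length, start, end, strand, target name, length, start, end, matching bases, alignment block length, mapping quality). Here's a sketch for taking a quick look at a PAF file before handing it to Racon. The two records are made up for illustration, not real output.

```shell
# Toy PAF with two hypothetical mappings (12 mandatory tab-separated columns).
workdir=$(mktemp -d)
printf 'read1\t1000\t10\t990\t+\tutg000001l\t50000\t2000\t2980\t950\t980\t60\n'  >  "$workdir/toy.paf"
printf 'read2\t800\t0\t800\t-\tutg000001l\t50000\t10000\t10800\t700\t800\t30\n' >> "$workdir/toy.paf"

# Print query, target, approximate identity (matches / block length) and MAPQ.
awk -F'\t' '{printf "%s\t%s\t%.3f\t%d\n", $1, $6, $10/$11, $12}' "$workdir/toy.paf"

# Count mappings with the maximum mapping quality reported in this toy file (60).
n_high=$(awk -F'\t' '$12 >= 60 {n++} END {print n+0}' "$workdir/toy.paf")
echo "high-confidence mappings: $n_high"
```

An empty PAF file, or one with far fewer lines than you have reads, usually means the inputs to the mapping step were not what you intended.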
The output file is the FASTA file listed below.
racon -t 24 \
20170911_oly_pacbio_cat.fastq \
20170918_oly_pacbio_minimap_mappings.paf \
20170918_oly_pacbio_miniasm_assembly.gfa \
20170918_oly_pacbio_racon1_consensus.fasta
from Sam’s Notebook http://ift.tt/2fKBPUN
minimap2 -x ava-ont \
../../trimming_practical/nanofilt/nanofilt_trimmed.fastq \
../../trimming_practical/nanofilt/nanofilt_trimmed.fastq \
| gzip -1 > ./minimap.paf.gz
miniasm -f \
../../trimming_practical/nanofilt/nanofilt_trimmed.fastq \
./minimap.paf.gz > miniasm.gfa
awk '/^S/{print ">"$2"\n"$3}' miniasm.gfa > miniasm.fasta
assembly-stats ./miniasm.fasta
dnadiff -p dnadiff ~/course_data/precompiled/chr17.fasta miniasm.fasta