I am trying to align PacBio data (reads from a third-generation sequencer) using bwa mem. I am getting very few aligned reads in the SAM file, with coverage below 10X. I also tried minimap2 for the alignment, but the quality of the SAM file still isn't good: only a few reads map uniquely, and those align to the reference with soft clipping. Moreover, my SAM file contains many multi-mapped reads. The read (spot) length is >2 kb.
What other options are available to get better alignment results? I am working with this data for the very first time.
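One way to put numbers on "few unique reads" and "many cases of multimapping" is to tally the alignment records by their SAM FLAG field. The following is a minimal sketch, not a replacement for a dedicated tool; the function name and the example records are made up for illustration, and it relies only on the standard FLAG bits 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary).

```python
from collections import Counter

def classify_sam_flags(sam_lines):
    """Count alignment records by type using the SAM FLAG field (column 2)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@"
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            counts["unmapped"] += 1
        elif flag & 0x100:
            counts["secondary"] += 1
        elif flag & 0x800:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
    return counts

# Hypothetical records, shortened for illustration
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t1\t60\t6M4S\t*\t0\t0\tACGTACGTAC\t*",
    "r1\t2048\tchr1\t500\t60\t6H4M\t*\t0\t0\tGTAC\t*",
    "r2\t16\tchr1\t100\t0\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t256\tchr1\t300\t0\t10M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
]
print(dict(classify_sam_flags(example)))
```

For a real file, the same function can be given an open file handle, for example `with open("aln.sam") as fh: classify_sam_flags(fh)`, where `aln.sam` is a placeholder name. Long reads are often reported as one primary record plus supplementary records, so several records for the same read name do not by themselves mean the read is ambiguously placed.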
You can try this script https://github.com/jnarayan81/BioScripts/blob/master/runMapper.sh to map long reads.
Thanks. What if I first correct the reads using a hybrid error-correction approach, since PacBio reads contain random errors, and then align them using bwa-mem?
— Nadia Baig 2044 days ago
But you need to be extra careful, as the -x pacbio flag assumes an error rate of about 15% in the reads.
Note: minimap2 has replaced BWA-MEM for PacBio and Nanopore read alignment. It retains all major BWA-MEM features, but is ~50 times as fast, more versatile, more accurate and produces better base-level alignment.
— Abhimanyu Singh 2042 days ago
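As a rough illustration of the minimap2 suggestion above, the sketch below assembles a minimap2 command line for PacBio reads and can optionally run it. The file names `ref.fa`, `reads.fq` and `aln.sam` and both helper functions are placeholders introduced here; `-a` (SAM output) and the `-x map-pb` preset are standard minimap2 options, but whether that preset suits a given data set is an assumption to check against the minimap2 manual.

```python
import shlex
import subprocess

def minimap2_command(reference, reads, preset="map-pb"):
    """Return the argument list for a minimap2 run that writes SAM to stdout."""
    # -a requests SAM output; -x selects a preset (map-pb targets PacBio reads)
    return ["minimap2", "-a", "-x", preset, reference, reads]

def run_alignment(cmd, out_path):
    """Run the command and write its stdout (the SAM records) to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)

cmd = minimap2_command("ref.fa", "reads.fq")
print(shlex.join(cmd))  # minimap2 -a -x map-pb ref.fa reads.fq
# run_alignment(cmd, "aln.sam")  # uncomment once minimap2 is installed and the files exist
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.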
Thanks. Is it a good option to correct the reads first using a hybrid error-correction approach and then align them to the reference?
I am still getting soft clips, insertions, and deletions even in uniquely mapped reads.
— Nadia Baig 2042 days ago
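To judge how much soft clipping and how many indels each alignment carries, the CIGAR string (column 6 of a SAM record) can be summarized per operation. This is a minimal sketch with made-up function names and a made-up CIGAR string. Given the error rate mentioned earlier in the thread, some insertions and deletions are to be expected in raw PacBio alignments, so the fraction of clipped bases is probably the more telling number.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def summarize_cigar(cigar):
    """Return total bases per CIGAR operation, e.g. S (soft clip), I, D, M."""
    totals = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        totals[op] += int(length)
    return totals

def soft_clip_fraction(cigar):
    """Fraction of the stored read sequence that is soft-clipped."""
    totals = summarize_cigar(cigar)
    # Operations that consume read bases: M, I, S, = and X
    read_len = sum(totals[op] for op in "MIS=X")
    return totals["S"] / read_len if read_len else 0.0

cigar = "120S1500M3I200M2D180M50S"  # hypothetical alignment of a ~2 kb read
totals = summarize_cigar(cigar)
print(dict(totals))
print(round(soft_clip_fraction(cigar), 3))
```

An unavailable CIGAR (`*`) yields an empty summary and a clip fraction of 0.0, so unmapped records do not need special handling.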
I am using fastq-dump to convert SRA files into FASTQ.
During the FASTQ conversion, the tool prints this message on the terminal: "Ignoring --- number of reads as the spot length is less than 1". If my SRA file is 6 GB, I always get a FASTQ file smaller than that. Ideally it should be a larger FASTQ file (~10 GB). Why is this so?
— Nadia Baig 2042 days ago
Please start a new question thread for this "sra file conversion" error!
— Abhimanyu Singh 2041 days ago
okay
— Nadia Baig 2041 days ago