a) bowtie2 mapping against host sequence
Host example: human genome hg19 (download bowtie2 hg19 index)
# 1) create bowtie2 index database (host_DB) from host reference genome
bowtie2-build host_genome.fna host_DB
# 2) bowtie2 mapping against host sequence database, keep both mapped and unmapped reads (paired-end reads)
bowtie2 -x host_DB -1 SAMPLE_r1.fastq -2 SAMPLE_r2.fastq -S SAMPLE_mapped_and_unmapped.sam
# 3) convert file .sam to .bam
samtools view -bS SAMPLE_mapped_and_unmapped.sam > SAMPLE_mapped_and_unmapped.bam

b) filter required unmapped reads
# SAMtools SAM-flag filter: get unmapped pairs (both ends unmapped)
samtools view -b -f 12 -F 256 SAMPLE_mapped_and_unmapped.bam > SAMPLE_bothEndsUnmapped.bam
-f 12     Extract only (-f) alignments with both reads unmapped: <read unmapped><mate unmapped> (flag bits 4 + 8)
-F 256    Do not (-F) extract alignments which are: <not primary alignment>
see meaning of SAM-flags

c) split paired-end reads into separated fastq files .._r1 .._r2
# sort bam file by read name (-n) to have paired reads next to each other as required by bedtools
samtools sort -n SAMPLE_bothEndsUnmapped.bam SAMPLE_bothEndsUnmapped_sorted
# note: recent samtools releases no longer accept the output-prefix form; there, use
# samtools sort -n SAMPLE_bothEndsUnmapped.bam -o SAMPLE_bothEndsUnmapped_sorted.bam
# write the two ends of each pair to separate fastq files
bedtools bamtofastq -i SAMPLE_bothEndsUnmapped_sorted.bam -fq SAMPLE_host_removed_r1.fastq -fq2 SAMPLE_host_removed_r2.fastq
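
The -f 12 -F 256 filter in step b) is a bitwise test on the FLAG column of each alignment record. The sketch below reproduces that test in plain shell arithmetic to show which FLAG values would be kept. The helper name passes_filter and the sample FLAG values are illustrative assumptions, not part of samtools or of the original workflow.

```shell
#!/bin/sh
# Sketch only: mimics the keep/drop decision of "samtools view -f 12 -F 256"
# for a single SAM FLAG value. passes_filter is a hypothetical helper.

REQUIRED=12    # -f 12: bit 4 (read unmapped) and bit 8 (mate unmapped) must both be set
EXCLUDED=256   # -F 256: bit 256 (not primary alignment) must not be set

passes_filter() {
    # $1 = decimal FLAG value from column 2 of a SAM record
    if [ $(( $1 & REQUIRED )) -eq "$REQUIRED" ] && [ $(( $1 & EXCLUDED )) -eq 0 ]; then
        echo keep
    else
        echo drop
    fi
}

# 77  = 1+4+8+64      first in pair, both ends unmapped
# 141 = 1+4+8+128     second in pair, both ends unmapped
# 69  = 1+4+64        read unmapped, mate mapped
# 73  = 1+8+64        read mapped, mate unmapped
# 333 = 1+4+8+64+256  both ends unmapped, but flagged not primary
for flag in 77 141 69 73 333; do
    printf '%s\t%s\n' "$flag" "$(passes_filter "$flag")"
done
```

Run as written, this prints keep for 77 and 141 and drop for 69, 73 and 333. Pairs in which only one end is unmapped are therefore discarded by this filter.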