Our Sponsors



Download BioinformaticsOnline(BOL) Apps in your chrome browser.




Samtools commands for bioinformatician !

  • Public
By Shruti Paniwala 1838 days ago
## count mapped reads samtools view -c -F 260 mapping_file.bam ### converting sam file into fasta samtools fasta reads_mapped.sam > reads.fasta ### converting sam file into bam # -b : output is bam # -S : input is sam # -o : output file name samtools view -b -S -o sal_sej.bam sal_sej.sam ### viewing bam files (view command) samtool view sal_sej.bam | less ### sort reads by flag specified and show them # -f INT show flag matches samtools -f 4 sal_sej.bam | less # -F INT show reads excepting flag matches samtools -F 4 sal_sej.bam | less ### count reads (-c) by flag specified samtool view -c -f 4 sal_sej.bam ### count reads by quality value specified (-q) (>=) # -q : minimal quality value samtools view -q 42 -c sal_sej.bam ### sorting bam file by genome position samtools sort sal_sej.bam > sal_sej_sorted.bam ### indexing sorted file # ouput is file.bai # always index sorted files samtools index sal_sej_sorted.bam.bam ### identifying genome variants (mpileup command) # -g : output is bcf (binary call format) file # -f : use reference genome given samtools mpileup -g -f sal_ref_sej.fasta sal_sej_sorted.bam.bam > sal_vars.bcf ### calling snp and indels # -c : find snp # -v : output only potential variants bcftools view -c -v sal_vars.bcf > sal_vars.vcf ### note: new command is "call". type "bcftools" to see current version and commands ### calling snp and indels with no frequency threshold # omit -v parameter bcftools call -c sal_vars.bcf > sal_vars.vcf ### normalize (realign) indels # -f : reference fasta, needed to left align and normalize bcftools norm -f ./input_data/ref.fasta \ vars.vcf \ -o vars_indels_realigned.vcf ### show alignment # first arg is sorted bam file # second arg is reference samtools tview sal_sej_sorted.bam.bam sal_ref_sej.fasta ### results into txt file samtools depth /path/to/sorted_bam.bam > /path/to/coverage_results.txt ### note: returns count of depth at each position. input should be sorted ### count depth at each position and put it into a txt file # -a : at all positions samtools depth -a sorted_dupremoved.bam > depth.txt ### one liner to count mean depth samtools depth -a sorted_dupremoved.bam | awk '{c++;s+=$3}END{print s/c}' ### one liner to count coverage breadth samtools depth -a sorted_dupremoved.bam | awk '{c++; if($3>0) total+=1}END{print (total/c)*100}' ### flagstat # output file has two columns: QC-passed reads and QC-failed reads # and rows: total reads, duplicates, mapped. samtools flagstat sorted_dupremoved.bam > flagstat.txt #################################### # try these recipies: ### convert BAM to FASTA: samtools view filename.bam | awk '{OFS="\t"; print ">"$1"\n"$10}' - > filename.fasta ## converting BAM to SAM: samtools view -h -o out.sam in.bam ## "samtools view -h <in.bam> > <out.sam>"