<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Samtools commands for bioinformatician !]]></title>
	<link>https://bioinformaticsonline.com/snippets/view/40391/samtools-commands-for-bioinformatician?</link>
	<atom:link href="https://bioinformaticsonline.com/snippets/view/40391/samtools-commands-for-bioinformatician?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/40391/samtools-commands-for-bioinformatician</guid>
	<pubDate>Sun, 15 Dec 2019 10:31:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/40391/samtools-commands-for-bioinformatician</link>
	<title><![CDATA[Samtools commands for bioinformatician !]]></title>
	<description><![CDATA[<code>## count mapped reads
samtools view -c -F 260 mapping_file.bam


### converting sam file into fasta
samtools fasta reads_mapped.sam &gt; reads.fasta

### converting sam file into bam
# -b : output is bam
# -S : input is sam
# -o : output file name
samtools view -b -S -o sal_sej.bam sal_sej.sam

### viewing bam files (view command)
samtool view sal_sej.bam | less


### sort reads by flag specified and show them
# -f INT show flag matches
samtools -f 4 sal_sej.bam | less

# -F INT show reads excepting flag matches
samtools -F 4 sal_sej.bam | less


### count reads (-c) by flag specified
samtool view -c -f 4 sal_sej.bam


### count reads by quality value specified (-q) (&gt;=)
# -q : minimal quality value
samtools view -q 42 -c sal_sej.bam



### sorting bam file by genome position
samtools sort sal_sej.bam &gt; sal_sej_sorted.bam

### indexing sorted file
# ouput is file.bai
# always index sorted files
samtools index sal_sej_sorted.bam.bam

### identifying genome variants (mpileup command)
# -g : output is bcf (binary call format) file
# -f : use reference genome given
samtools mpileup -g -f sal_ref_sej.fasta sal_sej_sorted.bam.bam &gt; sal_vars.bcf


### calling snp and indels
# -c : find snp
# -v : output only potential variants
bcftools view -c -v sal_vars.bcf &gt; sal_vars.vcf
### note: new command is &quot;call&quot;. type &quot;bcftools&quot; to see current version and commands

### calling snp and indels with no frequency threshold
# omit -v parameter
bcftools call -c sal_vars.bcf &gt; sal_vars.vcf

### normalize (realign) indels
# -f : reference fasta, needed to left align and normalize
bcftools norm -f ./input_data/ref.fasta \
vars.vcf \
-o vars_indels_realigned.vcf

### show alignment
# first arg is sorted bam file
# second arg is reference 
samtools tview sal_sej_sorted.bam.bam sal_ref_sej.fasta

### results into txt file
samtools depth /path/to/sorted_bam.bam &gt; /path/to/coverage_results.txt
### note: returns count of depth at each position. input should be sorted

### count depth at each position and put it into a txt file
# -a : at all positions
samtools depth -a sorted_dupremoved.bam &gt; depth.txt

### one liner to count mean depth
samtools depth -a sorted_dupremoved.bam | awk &#039;{c++;s+=$3}END{print s/c}&#039;

### one liner to count coverage breadth
samtools depth -a sorted_dupremoved.bam | awk &#039;{c++; if($3&gt;0) total+=1}END{print (total/c)*100}&#039;

### flagstat
# output file has two columns: QC-passed reads and QC-failed reads
# and rows: total reads, duplicates, mapped.
samtools flagstat sorted_dupremoved.bam &gt; flagstat.txt

####################################
# try these recipies: 
### convert BAM to FASTA:
samtools view filename.bam | awk &#039;{OFS=&quot;\t&quot;; print &quot;&gt;&quot;$1&quot;\n&quot;$10}&#039; - &gt; filename.fasta

## converting BAM to SAM:
samtools view -h -o out.sam in.bam
## &quot;samtools view -h &lt;in.bam&gt; &gt; &lt;out.sam&gt;&quot;</code>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>

</channel>
</rss>