CollectGcBiasMetrics.jar will generate a GC bias plot for each contig
samtools index aln-pe.mapped.sorted.bam
for i in $(samtools view -H aln-pe.mapped.sorted.bam | awk -F"\t" '/@SQ/{gsub("^SN:","",$2);print $2}' ); do samtools view -b aln-pe.mapp...
2003 days ago
Split the multifasta into separate files !
cat Avaga_allPalindrome.fa | awk '{ if (substr($0, 1, 1)==">") {filename=(substr($0,2) ".fa")} print $0 > filename }'
1957 days ago
Samtools commands for bioinformaticians !
...one liner to count mean depth
samtools depth -a sorted_dupremoved.bam | awk '{c++;s+=$3}END{print s/c}'
...ner to count coverage breadth
samtools depth -a sorted_dupremoved.bam | awk '{c++; if($3>0) total+=1}END{...
1602 days ago
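Both numbers can come out of a single pass over the depth table. A minimal sketch, assuming the three-column layout of `samtools depth -a` (contig, position, depth). The mock table and the variable names are made up for the demo, and the breadth formula (positions with depth > 0 divided by all positions) is an assumption, since the original breadth command is cut off.

```shell
#!/bin/sh
# Mock `samtools depth -a` output: contig, 1-based position, depth.
# Hypothetical data; with a real BAM, pipe `samtools depth -a file.bam` instead.
depth_table=$(printf 'chr1\t1\t0\nchr1\t2\t4\nchr1\t3\t6\nchr1\t4\t0\n')

# One pass: c = positions seen, s = summed depth, covered = positions with depth > 0.
stats=$(printf '%s\n' "$depth_table" | awk '
    { c++; s += $3; if ($3 > 0) covered++ }
    END {
        if (c == 0) { print "NA\tNA"; exit }
        printf "%.2f\t%.2f\n", s / c, covered / c
    }')

mean_depth=$(printf '%s\n' "$stats" | cut -f1)
breadth=$(printf '%s\n' "$stats" | cut -f2)
echo "mean depth: $mean_depth  breadth: $breadth"
```

The `-a` flag matters here: without it, zero-depth positions are not reported and both numbers are computed over covered positions only.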
One-liner to split a multifasta into single-fasta files !
#Split the multifasta to singlefasta (needs gawk for the three-argument match())
awk '$0 ~ "^>" { match($1, /^>([^:]+)/, id); filename=id[1]} {print >> (filename ".fa")}' sequence.fasta
1400 days ago
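The capture-group form of `match()` is a gawk extension, so here is a sketch of the same split in plain POSIX awk. It assumes headers of the form `>ID:description`; the scratch directory, the sample records and the variable names are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-record multi-FASTA; headers follow an ID:description pattern.
cat > sequence.fasta <<'EOF'
>seq1:first record
ACGT
ACGT
>seq2:second record
GGCC
EOF

# On each header, take the text between ">" and the first ":" as the file stem.
# close() releases the previous file so large inputs do not exhaust file handles.
awk '
    /^>/ {
        if (out != "") close(out)
        split(substr($1, 2), parts, ":")
        out = parts[1] ".fa"
    }
    out != "" { print >> out }
' sequence.fasta

ls
```

Because the output is appended with `>>`, rerunning it in the same directory duplicates records; remove old `.fa` files first.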
Get GC content across the entire CDS !
#look at GC across the entire CDS.
gffread -x - -g | \
seqtk comp - | \
awk -v OFS="\t" '{ print $1, "0", $2, ($4 + $5) / $2 }'
1392 days ago
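For a quick cross-check without gffread or seqtk, the GC fraction per record can be computed in awk alone. This is only a sketch on made-up sequences, not the pipeline above: it counts G and C case-insensitively and prints the same four columns (name, 0, length, GC fraction). The names `cds_fasta`, `gc_table` and `emit` are hypothetical.

```shell
#!/bin/sh
# Hypothetical CDS sequences; in practice this would be the output of gffread -x.
cds_fasta=$(printf '>cds1\nATGC\nGGCC\n>cds2\natat\n')

# Emits name, 0, length, GC fraction: the same four columns as the seqtk pipeline.
gc_table=$(printf '%s\n' "$cds_fasta" | awk -v OFS='\t' '
    function emit() { if (name != "") print name, "0", len, (len ? gc / len : 0) }
    /^>/ { emit(); name = substr($1, 2); len = 0; gc = 0; next }
    {
        seq = toupper($0)
        len += length(seq)
        gc += gsub(/[GC]/, "", seq)   # gsub returns the number of replacements
    }
    END { emit() }')

printf '%s\n' "$gc_table"
```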
Reformat the multifasta to a fixed sequence line length !
#awk oneliner to reformat the multifasta sequences
awk '!/^>/ {printf "%s", $0; n = "\n"} /^>/ {print n $0; n = ""} END {printf "%s", n}' file.fasta | fold -w 100
1391 days ago
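One thing to watch with `fold`: it wraps every line, so a header longer than the width gets wrapped too. A sketch that does the wrapping inside awk and leaves headers alone. The input record, the variable names and the width of 8 are made up to keep the demo short; set `width=100` for the layout used above.

```shell
#!/bin/sh
# Hypothetical input: one record split over uneven lines.
input=$(printf '>seq1 demo\nACGTAC\nGT\nACGTACGTAC\n')

# Join each record into one string, then emit it in fixed-width chunks.
# width is 8 here only to keep the demo short.
wrapped=$(printf '%s\n' "$input" | awk -v width=8 '
    function emit(   i) {
        for (i = 1; i <= length(seq); i += width) print substr(seq, i, width)
        seq = ""
    }
    /^>/ { emit(); print; next }
    { seq = seq $0 }
    END { emit() }')

printf '%s\n' "$wrapped"
```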
Bash script to get exon fragments from genome files !
#Exons are already defined in the GTF file, so we simply need to print lines that are marked exonic.
gunzip -c genome_file.gtf.gz | awk 'BEGIN{OFS="\t";} $3=="exon" {print $1,$4-1,$5}' | bedtools sort | bedtools merge -i - | gzip > my_exon.bed.gz
1365 days ago
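The `$4-1` is the GTF to BED coordinate shift: GTF intervals are 1-based and inclusive, BED intervals are 0-based and half-open, so the start drops by one and the end stays put. A small sketch of just that step on a made-up GTF fragment; plain `sort` stands in for `bedtools sort`, and the merge step is left to bedtools.

```shell
#!/bin/sh
# Hypothetical three-feature GTF fragment (tab-separated, 1-based inclusive coordinates).
gtf=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    1 demo gene 11 50 . + . 'gene_id "g1";' \
    1 demo exon 11 20 . + . 'gene_id "g1";' \
    1 demo exon 41 50 . + . 'gene_id "g1";')

# Keep exon rows only and convert to BED: start - 1, end unchanged.
exon_bed=$(printf '%s\n' "$gtf" |
    awk -F'\t' 'BEGIN { OFS = "\t" } $3 == "exon" { print $1, $4 - 1, $5 }' |
    sort -k1,1 -k2,2n)

printf '%s\n' "$exon_bed"
```

The exon at 11..20 becomes `10 20` in BED, and end minus start still gives its length of 10 bases.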
Bash script to extract intronic fragments !
...and exonic coordinates;
#by subtracting the exonic regions from the genic region, we have the intronic region.
gunzip -c genome_file.gtf.gz | awk 'BEGIN{OFS="\t";} $3=="gene"...
1365 days ago
Bash script to get intergenic regions from genome files !
...wget http://xxx.chrom.sizes
cat xxx.chrom.sizes | sed 's/^chr//' | sed 's/Cp/Pt/' > tmp
mv tmp xxx.chrom.sizes
gunzip -c genome_file.gtf.gz | awk 'BEGIN{OFS="\t";} $3=="gene"...
1365 days ago
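The command above is cut off, so this does not reconstruct it. It only illustrates the idea that intergenic regions are whatever is left of each chromosome once the genes are taken out. It is a plain awk stand-in on made-up data, not bedtools, and it assumes the gene BED is sorted by chromosome and start. Chromosomes with no genes at all are not reported by this sketch. File and variable names are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical chromosome sizes (name, length) and sorted gene intervals (BED).
printf 'Pt\t100\n' > "$workdir/chrom.sizes"
printf 'Pt\t10\t20\nPt\t15\t30\nPt\t50\t60\n' > "$workdir/genes.bed"

intergenic=$(awk '
    BEGIN { OFS = "\t" }
    # First file: remember each chromosome length.
    NR == FNR { size[$1] = $2 + 0; next }
    # Second file: on a new chromosome, close the previous one and restart at 0.
    $1 != chr {
        if (chr != "" && pos < size[chr]) print chr, pos, size[chr]
        chr = $1; pos = 0
    }
    # A gap before this gene is intergenic; pos tracks the furthest gene end so far.
    {
        if ($2 + 0 > pos) print chr, pos, $2
        if ($3 + 0 > pos) pos = $3 + 0
    }
    END { if (chr != "" && pos < size[chr]) print chr, pos, size[chr] }
' "$workdir/chrom.sizes" "$workdir/genes.bed")

printf '%s\n' "$intergenic"
```

Overlapping genes (10..20 and 15..30 here) are handled by tracking the furthest end seen, so no separate merge step is needed for this demo.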