  • Install VarScan on Ubuntu / Linux !

    #VarScan is a Java program designed to call variants in sequencing data. It was developed at the Genome Ins...versity and is hosted on GitHub. To use VarScan we simply need to download...ace/bin. As with the other Java programs which have already been installed...

  • Bash script to split multifasta file !

    #Using awk, we can easily split a file (multi.fa) into chunks of size N (here, N=500), by.../^>/ {if(n%500==0){file=sprintf("chunk%d.fa",n);} print >> file;...; }' < multi.fa #OR awk -v chunksize=$(grep ">" is genome tools (gt), which you can find here: http://ge...
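
    Because the excerpt above is truncated, here is a minimal, self-contained sketch of the same idea: start a new output file every N header lines. The function name split_fasta is hypothetical, and the chunk%d.fa naming simply follows the sprintf call visible in the excerpt; it is not necessarily the original author's full script.

```shell
#!/usr/bin/env bash
# Sketch: split a multi-FASTA file into chunks of at most N records.
# split_fasta is a hypothetical helper name, not part of any tool.
# Output files are named chunk<first-record-index>.fa, e.g. chunk0.fa, chunk500.fa.
split_fasta() {
    local input=$1
    local chunksize=${2:-500}   # default N=500, as in the text above
    awk -v size="$chunksize" '
        /^>/ {
            # Every "size" headers, close the old chunk and start a new one.
            if (n % size == 0) {
                if (file != "") close(file)
                file = sprintf("chunk%d.fa", n)
            }
            n++
        }
        # Write header and sequence lines to the current chunk;
        # lines before the first header are skipped.
        file != "" { print > file }
    ' "$input"
}

# Usage (assumes multi.fa exists in the current directory):
#   split_fasta multi.fa 500
```

    Closing each finished chunk keeps the number of open file handles small, which matters when the input yields many chunks.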

  • BBmap the reads with all alignments !

    in=../reference/reference.numbered.fa ambig=all vslow perfectmode maxsites=100000 out=fetch_Ids_for_barcode.sam

  • Commands to get the detail of disk usage on Linux !

    #A simplistic approach would be
    du -shc /home/*
    du -shc /home/jnarayan
    #To sort it:
    du -smc /home/* | sort -n
    #There is also a well-known Perl script that has the option of mailing disk usage reports per user: durep

  • Command line to print disk usage on Linux terminal !

    #Print disk usage - perl
    du -h |perl -e'%h=map{/.\s/;99**(ord$&&7)-$`,$_}`du -h`;die@h{sort%h}'
    #Bash
    du -k * | sort -nr | cut -f2 | xargs -d '\n' du -sh
    #Base
    du -scBM | sort -n
    #More
    du -s * | sort -rn | cut -f2- | xargs -d "\n" du -sh

  • Install GATK 4 using conda !

    ...oped by the Broad Institute focused primarily on variant discovery and genotyping. It is op...age, environment, and dependency management software, in esse.... We do this with the command conda env create, we also use t...R ubuntu:ubuntu /home/ubuntu/.conda # create conda environ...

  • Count number of lines in each file in Linux !

    for FILE in *.rd; do wc -l "$FILE"; done > allReads.hits

  • Bash script to transfer files to server !

    # rsync options source destination
    rsync -azvh --progress PacBio_clean.fa
    # scp source_file_name username@destination_host:destination_folder
    scp -rpv /datafile xxx@

  • Multiline fasta to single line fasta !

    perl -pe '$. > 1 and /^>/ ? print "\n" : chomp' in.fasta > out.fasta
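
    For readers who prefer awk, the same transformation can be sketched as below. This is an alternative under stated assumptions, not the original author's method: linearize_fasta is a hypothetical function name, the input is assumed to use Unix line endings, and unlike the Perl one-liner this version always ends the output with a newline.

```shell
#!/usr/bin/env bash
# Sketch: join wrapped sequence lines so each FASTA record is
# one header line followed by one sequence line.
# linearize_fasta is a hypothetical helper name.
linearize_fasta() {
    awk '
        /^>/ {
            # Finish the previous sequence line, if one is in progress.
            if (pending) printf "\n"
            print
            pending = 0
            next
        }
        # Append sequence text without a newline.
        { printf "%s", $0; pending = 1 }
        # Terminate the last sequence line.
        END { if (pending) printf "\n" }
    ' "$1"
}

# Usage (assumes in.fasta exists):
#   linearize_fasta in.fasta > out.fasta
```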

  • Plot kmer stats in bash !

    #!/bin/bash
    #Counting k-mers for different k
    echo "k,unique,distinct,total"
    for k in {1..15}; do
        kat hist -o phiX_$k.hist -m $k phiX.fasta >/dev/null 2>&1
        egrep -v '^#' phiX_$k.hist|awk '{if ($1==1) u=$2; d+=$2; t+=$2*$1;}\
        END{print "'$k',"u","d","t}'
    done
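
    The awk step in the loop above can be isolated to see what each column means. The sketch below assumes, as the original command does, that a histogram file has "#" comment lines followed by "frequency count" pairs; kmer_summary is a hypothetical function name. Unique is the number of k-mers seen exactly once, distinct is the number of different k-mers, and total is the sum of all occurrences.

```shell
#!/usr/bin/env bash
# Sketch: summarise a k-mer histogram as "k,unique,distinct,total".
# kmer_summary is a hypothetical helper; the two-column
# "frequency count" layout is an assumption taken from the awk
# command in the script above.
kmer_summary() {
    local k=$1
    local hist=$2
    awk -v k="$k" '
        /^#/ { next }                    # skip header/comment lines
        $1 == 1 { u = $2 }               # k-mers occurring exactly once
        { d += $2; t += $1 * $2 }        # distinct k-mers, total occurrences
        END { printf "%s,%d,%d,%d\n", k, u, d, t }
    ' "$hist"
}

# Usage (assumes phiX_4.hist was produced by the loop above):
#   kmer_summary 4 phiX_4.hist
```

    Passing k with -v avoids splicing the shell variable into the awk program text, which is what the quoting trick in the original one-liner does.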

