Installing SEVA environment in Conda !
...############################################################# | 100%
r-snow-0.4_3 | 124 KB |...########################################################### | 100%
r-dosnow-1.0.19 | 39 KB | ##...
899 days ago
Extract all fasta sequences except ids !
# ids.txt is a placeholder name for the file listing the IDs to exclude, one per line
awk 'BEGIN{while((getline<"ids.txt")>0)l[">"$1]=1}/^>/{f=!l[$1]}f' genomic.fna > filtered_without_omi.fasta
#extract subseq
seqtk subseq omi_ids.fa omi_single_id.txt > omi_single_id.fa
#...
890 days ago
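The exclusion idea can be tried on toy data with the sketch below. The file names (`demo.fna`, `exclude_ids.txt`, `filtered.fna`) and the sequences are assumptions made up for the demo, not taken from the post. It uses a two-file awk pass rather than `getline`, which is one common way to do the same thing.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory; every file name below is an illustrative assumption.
workdir=$(mktemp -d)
cd "$workdir"

# Toy multi-FASTA with three records (made-up sequences).
printf '>seq1 first\nACGT\nACGT\n>seq2 second\nTTTT\n>seq3 third\nGGGG\n' > demo.fna

# IDs to exclude, one per line, without the leading ">".
printf 'seq2\n' > exclude_ids.txt

# Pass 1 (NR==FNR): remember every ID that should be dropped.
# Pass 2: on each header, strip ">" from the first field and set the keep flag;
# the bare "keep" pattern prints header and sequence lines while the flag is set.
awk 'NR==FNR { drop[$1]=1; next }
     /^>/    { id=substr($1,2); keep=!(id in drop) }
     keep' exclude_ids.txt demo.fna > filtered.fna

cat filtered.fna
```

Matching on the first whitespace-delimited field of the header means descriptions after the ID do not affect the lookup.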
887 days ago
bash script to extract sequence by ids !
...s:
# Create test input:
cat > in.fasta
>BGI_novel_T016313 Solyc03g025570.2.1
TTCAAGTGTTAGTTTCACATCAT
>BGI_novel_T018109 Solyc03g080075.1.1
...
cat out.fasta
Output:
>BGI_novel_T016697 Solyc03g033550.3.1
...
885 days ago
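Since the script itself is cut off in this excerpt, here is a minimal sketch of one way to keep only the records whose IDs are listed. The helper name `extract_by_ids`, the file `ids.txt`, and the toy records are assumptions for the demo; only `in.fasta` and `out.fasta` reuse names from the excerpt.

```shell
#!/usr/bin/env bash
set -euo pipefail

# extract_by_ids IDS_FILE FASTA_FILE   (hypothetical helper)
# Prints every record of FASTA_FILE whose ID (header text up to the first
# whitespace, without ">") appears in IDS_FILE.
extract_by_ids() {
    awk 'NR==FNR { want[$1]=1; next }
         /^>/    { keep=(substr($1,2) in want) }
         keep' "$1" "$2"
}

workdir=$(mktemp -d)
cd "$workdir"

# Toy records; IDs and sequences are made up for the demo.
printf '>geneA desc one\nAAAA\nCCCC\n>geneB desc two\nGGGG\n>geneC desc three\nTTTT\n' > in.fasta
printf 'geneA\ngeneC\n' > ids.txt

extract_by_ids ids.txt in.fasta > out.fasta
cat out.fasta
```

Multi-line sequences are handled because the keep flag only changes on header lines.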
885 days ago
Install Varscan on Ubuntu / Linux !
#VarScan is a Java program designed to call variants in sequencing data. It was developed at the Genome Institute at Washington University and is hosted on GitHub. To use VarScan we...
885 days ago
Bash script to split multifasta file !
...size/10)+1 }
/^>/ {if(n%chunksize==0){file=sprintf("chunk%d.fa",n);} print >> file; n++; next;}
{ print >> file; }' < multi.fasta
#Another great solution is GenomeTools (gt), which you can f...
885 days ago
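Because the start of the command is truncated, the following is a self-contained sketch of the same chunking idea rather than the original script. It assumes a fixed number of records per chunk (`per=2`) and output names of the form `partN.fa`; both are choices made for the demo, as is the toy input.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Toy input with five records (made-up data).
for i in 1 2 3 4 5; do
    printf '>rec%s\nACGT\n' "$i"
done > multi.fasta

# Start a new output file every "per" records: part1.fa, part2.fa, ...
# Closing the previous file keeps the number of open handles small on large inputs.
awk -v per=2 '
    /^>/ {
        if (n % per == 0) {
            if (out != "") close(out)
            out = sprintf("part%d.fa", n / per + 1)
        }
        n++
    }
    { print > out }
' multi.fasta

# Show how many records ended up in each chunk.
grep -c '^>' part*.fa
```

Concatenating the chunks in order reproduces the input, which is a quick sanity check after splitting.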
Commands to get the detail of disk usage on Linux !
#A simplistic approach would be
du -shc /home/*
du -shc /home/jnarayan
#To sort it:
du -smc /home/* | sort -n
#There is also a well-known Perl script that has the option of mailing disk usage reports per user: durep
http://www.ubuntugeek.com/create-disk-usage-reports-with-durep.html
877 days ago
873 days ago
Bash command to explore assembly summary genbank !
wget https://ftp.ncbi.nlm.nih.gov/genomes/genbank/assembly_summary_genbank.txt
pip3 install csvkit
csvcut -t -K 1 -c 'excluded_from_refseq' assembly_summary_genbank.txt \
  | tail -n +2 | tr ";" "\n" \
  | sed -e 's/^ //' -e 's/ $//' | grep -v '""' \
  | sort | uniq -c | sort -nr
830 days ago
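The same tally can be sketched without csvkit using plain awk, shown here on a toy table so it runs offline. The file `toy_summary.tsv`, the accessions, and the `reason_a`/`reason_b` labels are made up for the demo; only the column name `excluded_from_refseq` comes from the command above.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Toy tab-separated table; accessions and reason labels are made up.
printf 'accession\texcluded_from_refseq\n'  > toy_summary.tsv
printf 'ASM1\treason_a; reason_b\n'        >> toy_summary.tsv
printf 'ASM2\treason_a\n'                  >> toy_summary.tsv
printf 'ASM3\t\n'                          >> toy_summary.tsv
printf 'ASM4\treason_b; reason_a\n'        >> toy_summary.tsv

# Locate the column by name in the header row, split each cell on ";",
# trim surrounding spaces, skip empty values, then count each distinct value.
awk -F'\t' -v col='excluded_from_refseq' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    c && $c != "" {
        n = split($c, parts, ";")
        for (j = 1; j <= n; j++) {
            gsub(/^ +| +$/, "", parts[j])
            if (parts[j] != "") print parts[j]
        }
    }
' toy_summary.tsv | sort | uniq -c | sort -nr > counts.txt

cat counts.txt
```

On the real file the header handling would need adjusting, since the csvcut command above skips a leading line with `-K 1` before reading the column names.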