Bash script to split a multi-FASTA file

By Neelam Jha 817 days ago
# Using awk, we can split a multi-FASTA file (multi.fa) into chunks of N sequences each (here, N=500) with the following one-liner.
# Each output file is named after the index of its first record (chunk0.fa, chunk500.fa, ...).
# Because the script appends with >>, delete any old chunk*.fa files before re-running it.
awk 'BEGIN {n=0;} /^>/ {if(n%500==0){file=sprintf("chunk%d.fa",n);} print >> file; n++; next;} { print >> file; }' < multi.fa

# OR, to split into at most 10 files, derive the chunk size from the number of sequences in the file (here, multi.fasta):
awk -v chunksize=$(grep -c "^>" multi.fasta) 'BEGIN{n=0; chunksize=int(chunksize/10)+1 } /^>/ {if(n%chunksize==0){file=sprintf("chunk%d.fa",n);} print >> file; n++; next;} { print >> file; }' < multi.fasta

# Another good option is GenomeTools (gt), available at http://genometools.org/, which provides a simple command for this:
gt splitfasta -numfiles 10 multi.fasta
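
To see how the awk chunking behaves, it can help to run it on a tiny file first. The following is only a sketch under a few assumptions: the toy input (five records named seq1 to seq5), the chunk size of 2, and the temporary working directory are all made up for illustration. It also differs slightly from the one-liner above: it writes with > instead of >> and closes each finished chunk, so re-runs do not append to old output and few files are open at once.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so no existing chunk files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a toy multi-FASTA file with 5 records (hypothetical test data).
for i in 1 2 3 4 5; do
  printf '>seq%d\nACGT\nTTGA\n' "$i"
done > multi.fa

# Split into chunks of 2 records each (size is an illustrative value).
awk -v size=2 '
  /^>/ {
    # Start a new output file every "size" records.
    if (n % size == 0) {
      if (file != "") close(file)      # close the chunk we just finished
      file = sprintf("chunk%d.fa", n)  # named after its first record index
    }
    n++
  }
  # Write header and sequence lines; skip anything before the first header.
  file != "" { print > file }
' multi.fa

# Count the records that ended up in each chunk.
grep -c "^>" chunk*.fa
```

With this toy input, the final grep reports chunk0.fa:2, chunk2.fa:2 and chunk4.fa:1, which matches the naming scheme of the one-liner: each file is numbered by the index of its first record.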