<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Bash script to split a multifasta file!]]></title>
	<link>https://bioinformaticsonline.com/snippets/view/43765/bash-script-to-split-multifasta-file?</link>
	<atom:link href="https://bioinformaticsonline.com/snippets/view/43765/bash-script-to-split-multifasta-file?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43765/bash-script-to-split-multifasta-file</guid>
	<pubDate>Wed, 02 Feb 2022 03:53:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43765/bash-script-to-split-multifasta-file</link>
	<title><![CDATA[Bash script to split a multifasta file!]]></title>
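	<description><![CDATA[<code># Using awk, split a multi-FASTA file (multi.fasta) into chunks of N sequences each (here N=500). Each output file is named after the index of its first sequence (chunk0.fa, chunk500.fa, ...). Finished chunks are closed so awk does not run out of open files, and "&gt;" is used so a rerun overwrites old chunks instead of appending to them:

awk &#039;/^&gt;/ {if(n%500==0){if(file) close(file); file=sprintf(&quot;chunk%d.fa&quot;,n);} n++} file {print &gt; file}&#039; multi.fasta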

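# Or, to split into at most 10 files of roughly equal size, derive the chunk size from the number of sequences (grep -c &quot;^&gt;&quot; counts header lines only):

awk -v chunksize=&quot;$(grep -c &quot;^&gt;&quot; multi.fasta)&quot; &#039;BEGIN{chunksize=int(chunksize/10)+1} /^&gt;/ {if(n%chunksize==0){if(file) close(file); file=sprintf(&quot;chunk%d.fa&quot;,n);} n++} file {print &gt; file}&#039; multi.fasta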
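
# A hedged sketch, not part of the original snippet: the same awk logic wrapped in a
# shell function so the chunk size and output prefix become parameters. The name
# split_fasta and its argument order are assumptions made for illustration.
split_fasta() {
    # $1 = input FASTA, $2 = sequences per chunk (default 500), $3 = output prefix (default "chunk")
    awk -v size="${2:-500}" -v prefix="${3:-chunk}" '
        /^>/ {
            # Start a new output file every "size" headers, closing the previous one
            if (n % size == 0) {
                if (file != "") close(file)
                file = sprintf("%s%d.fa", prefix, n)
            }
            n++
        }
        # Write the current line to the active chunk (lines before the first header are skipped)
        file != "" { print > file }
    ' "$1"
}
# Example: split_fasta multi.fasta 500 chunk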

# Alternatively, GenomeTools (gt; http://genometools.org/) provides a splitfasta command that divides the input into a given number of files:

gt splitfasta -numfiles 10 multi.fasta</code>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>