<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Most Commonly used Awk by Bioinformatician]]></title>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician?</link>
	<atom:link href="https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician</guid>
	<pubDate>Mon, 19 Aug 2013 01:12:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician</link>
	<title><![CDATA[Most Commonly used Awk by Bioinformatician]]></title>
	<description><![CDATA[<p>Awk is a programming language designed for quickly manipulating whitespace-delimited text data. Although you can achieve all of its functionality with Perl, awk is simpler in many practical cases.</p>
<p>Why awk? You can often replace a pipeline of 'stuff | grep | sed | cut...' with a single call to awk. For a simple script, much of the run time goes to starting each of these programs, so doing everything in one awk process is usually faster. This is ideal for something like an Openbox pipe menu, where you want to generate output on the fly. You can use awk to write a neat one-liner for a quick job in the terminal, or build an awk section into a shell script. There are many tutorials online; here I only show a few examples that cover most of a bioinformatician's daily uses of awk.</p>
<p>Choose rows where column 3 is larger than column 5:</p>
<p>awk '$3&gt;$5' input.txt &gt; output.txt</p>
<p>Extract columns 2, 4 and 5 (the second form writes tab-separated output):</p>
<p>awk '{print $2,$4,$5}' input.txt &gt; output.txt</p>
<p>awk 'BEGIN{OFS="\t"}{print $2,$4,$5}' input.txt</p>
<p>Show rows 20 through 80:</p>
<p>awk 'NR&gt;=20&amp;&amp;NR&lt;=80' input.txt &gt; output.txt</p>
<p>Calculate the average of column 2:</p>
<p>awk '{x+=$2}END{print x/NR}' input.txt</p>
<p>Select rows matching a regular expression (like egrep):</p>
<p>awk '/^test[0-9]+/' input.txt</p>
<p>Calculate the sum of columns 2 and 3, and either append it to the end of each row or replace the first column with it:</p>
<p>awk '{print $0,$2+$3}' input.txt</p>
<p>awk '{$1=$2+$3;print}' input.txt</p>
<p>Join two files on column 1 (each row of file2.txt whose key also appears in file1.txt is printed, followed by the matching row of file1.txt):</p>
<p>awk 'BEGIN{while((getline&lt;"file1.txt")&gt;0)l[$1]=$0}$1 in l{print $0"\t"l[$1]}' file2.txt &gt; output.txt</p>
<p>Count the occurrences of each value in column 2 (like sort | uniq -c; the output order is not guaranteed):</p>
<p>awk '{l[$2]++}END{for (x in l) print x,l[x]}' input.txt</p>
<p>Print only the first row for each value in column 2 (like uniq, but the input need not be sorted):</p>
<p>awk '!($2 in l){print;l[$2]=1}' input.txt</p>
<p>Count how often each word occurs (a word-frequency table):</p>
<p>awk '{for(i=1;i&lt;=NF;++i)c[$i]++}END{for (x in c) print x,c[x]}' input.txt</p>
<p>Deal with simple CSV (no quoted fields containing commas):</p>
<p>awk -F, '{print $1,$2}'</p>
<p>Substitution (replaces the first match on each line; sed is simpler in this case):</p>
<p>awk '{sub(/test/, "no", $0);print}' input.txt</p>
<p>Here is where to find all of this properly explained.</p>
<p>Two thorough tutorials:</p>
<p>http://www.gnu.org/software/gawk/manual/gawk.html</p>
<p>http://www.grymoire.com/Unix/Awk.html</p>
<p>A famous list of useful one-liners. Though they're short, many are quite tricky:</p>
<p>http://www.pement.org/awk/awk1line.txt</p>
<p>And some nice explanations of those one-liners. After reading these you'll have a pretty good grasp!</p>
<p>http://www.catonmat.net/blog/awk-one-li &hellip; -part-one/</p>
<p>http://www.catonmat.net/blog/ten-awk-ti &hellip; -pitfalls/</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-2354</guid>
	<pubDate>Tue, 08 Mar 2016 11:09:38 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-2354</link>
	<title><![CDATA[Comment by Neel]]></title>
	<description><![CDATA[<p>Rename the sequences in a multi-FASTA file with awk:</p>
<p>awk '/^&gt;/{print "&gt;chromosome" ++i; next}{print}' &lt; file.fasta</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1677</guid>
	<pubDate>Sun, 21 Sep 2014 16:55:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1677</link>
	<title><![CDATA[Comment by John Parker]]></title>
	<description><![CDATA[<p>One of the best-known cheat sheets for awk users: www.catonmat.net/download/awk.cheat.sheet.txt</p>]]></description>
	<dc:creator>John Parker</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1450</guid>
	<pubDate>Sat, 31 May 2014 15:44:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1450</link>
	<title><![CDATA[Comment by Rahul Nayak]]></title>
	<description><![CDATA[<p>To deal with simple CSV: awk -F, '{print $1,$2}'</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1358</guid>
	<pubDate>Fri, 25 Apr 2014 20:23:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1358</link>
	<title><![CDATA[Comment by Shruti Paniwala]]></title>
	<description><![CDATA[<p>Some useful Unix one-liners: http://genomics-array.blogspot.in/2010/11/some-unixperl-oneliners-for.html</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1348</guid>
	<pubDate>Thu, 24 Apr 2014 21:33:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1348</link>
	<title><![CDATA[Comment by Alok Prajapati]]></title>
	<description><![CDATA[<p>The most commonly used Unix/Linux commands for bioinformatics: http://rous.mit.edu/index.php/Unix_commands_applied_to_bioinformatics</p>]]></description>
	<dc:creator>Alok Prajapati</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1306</guid>
	<pubDate>Mon, 07 Apr 2014 01:29:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1306</link>
	<title><![CDATA[Comment by Aaryan Lokwani]]></title>
	<description><![CDATA[<p>I love awk, but I recommend trying <a href="https://github.com/lh3/bioawk">bioawk</a>, a modified version of awk that can parse some common sequence formats: https://github.com/lh3/bioawk</p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1207</guid>
	<pubDate>Mon, 10 Mar 2014 03:37:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1207</link>
	<title><![CDATA[Comment by Rahul Nayak]]></title>
	<description><![CDATA[<p>Print each line of a tab-delimited file where the 8th field is 10090:<br><br>awk -F "\t" '$8 == 10090 { print $0 }' myFile<br><br>Print fields 1, 2 and 3 of a tab-delimited file where the 4th field contains '99':<br><br>awk -F "\t" '$4 ~ /99/ {print $1"\t"$2"\t"$3}' myFile</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1021</guid>
	<pubDate>Thu, 28 Nov 2013 18:44:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-1021</link>
	<title><![CDATA[Comment by Archana Malhotra]]></title>
	<description><![CDATA[<p>Some commonly used bioinformatics one-liners by Stephen Turner: https://github.com/stephenturner/oneliners</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-907</guid>
	<pubDate>Sun, 10 Nov 2013 12:49:25 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-907</link>
	<title><![CDATA[Comment by Rahul Nayak]]></title>
	<description><![CDATA[<p>Awk, Linux and R tutorial by EMBL:</p>
<p><a href="http://www.embl.de/~rausch/primer.pdf">http://www.embl.de/~rausch/primer.pdf</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-587</guid>
	<pubDate>Thu, 29 Aug 2013 08:38:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-587</link>
	<title><![CDATA[Comment by Poonam Mahapatra]]></title>
	<description><![CDATA[<p>To double-space a file:<br>$ awk '1; { print "" }'</p>
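<p>A minimal sketch of the double-spacing one-liner above, wrapped in a shell function so it can be reused. The function name double_space and the sample input are assumptions made for illustration only.</p>
<pre>
```shell
# double_space is a hypothetical helper name, not a standard command.
# The pattern '1' is always true, so awk prints each line unchanged;
# the second rule then prints an empty line after it.
double_space() {
  awk '1; { print "" }' "$@"
}

# Sample input (made up): two short sequence lines.
printf 'ACGT\nTTGA\n' | double_space
```
</pre>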
<p>To print the number of words in a file:<br>$ awk '{ total = total + NF }; END { print total+0 }'</p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-525</guid>
	<pubDate>Fri, 23 Aug 2013 10:21:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-525</link>
	<title><![CDATA[Comment by Jitendra Narayan]]></title>
	<description><![CDATA[<p>There is also BioAwk, designed specifically for bioinformaticians, by <a href="https://github.com/ialbert">ialbert</a>: <a href="https://github.com/ialbert/bioawk-tools">https://github.com/ialbert/bioawk-tools</a></p>
<p>Enjoy!</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-523</guid>
	<pubDate>Fri, 23 Aug 2013 10:15:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-523</link>
	<title><![CDATA[Comment by Archana Malhotra]]></title>
	<description><![CDATA[<p>The Aakhsyan web page explains commonly used sed and awk commands for bioinformatics:</p>
<p>http://raunakms.wordpress.com/2013/06/08/sed-and-awk-for-bioinformatics/</p>
<p>Handy one-liners: http://bioinformatics.whatheblog.com/2010/03/handy-one-liners-awk/</p>
<p>BioUnix toolbox for bioinformaticians:</p>
<p>http://lh3lh3.users.sourceforge.net/biounix.shtml</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink='true'>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-522</guid>
	<pubDate>Fri, 23 Aug 2013 10:10:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2573/most-commonly-used-awk-by-bioinformatician#item-annotation-522</link>
	<title><![CDATA[Comment by Jitendra Narayan]]></title>
	<description><![CDATA[<p>This MIT wiki page shows how to perform some basic bioinformatics tasks using simple UNIX commands:</p>
<p>http://rous.mit.edu/index.php/Unix_commands_applied_to_bioinformatics</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>

</channel>
</rss>