<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/30214?offset=1470</link>
	<atom:link href="https://bioinformaticsonline.com/related/30214?offset=1470" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/17885/international-conference-on-bioinformatics-models-methods-and-algorithms</guid>
	<pubDate>Sun, 05 Oct 2014 11:42:52 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/17885/international-conference-on-bioinformatics-models-methods-and-algorithms</link>
	<title><![CDATA[International Conference on Bioinformatics Models, Methods and Algorithms]]></title>
	<description><![CDATA[<p><span>The purpose of the International Conference on Bioinformatics Models, Methods and Algorithms is to bring together researchers and practitioners interested in the application of computational systems and information technologies to the field of molecular biology, including, for example, the use of statistics and algorithms to understand biological processes and systems, with a focus on new developments in genome bioinformatics and computational biology. Areas of interest for this community include sequence analysis, biostatistics, image analysis, scientific data management and data mining, machine learning, pattern recognition, computational evolutionary biology, computational genomics and other related fields.</span></p>
<p><span><span>Position Paper Submission Extension:</span><span>&nbsp;</span><span>October 9, 2014</span><span>&nbsp;</span><br><span>Regular Paper Authors Notification:</span><span>&nbsp;</span><span>November 3, 2014</span><span>&nbsp;</span><br><span>Position Paper Authors Notification:</span><span>&nbsp;</span><span>November 6, 2014</span><span>&nbsp;</span><br><span>Regular and Position Paper Camera Ready and Registration:</span><span>&nbsp;</span><span>November 17, 2014</span><span>&nbsp;</span></span></p><p>Address of the bookmark: <a href="http://www.bioinformatics.biostec.org/" rel="nofollow">http://www.bioinformatics.biostec.org/</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/926/list-of-popular-bioinformatics-softwaretools</guid>
	<pubDate>Tue, 16 Jul 2013 14:30:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/926/list-of-popular-bioinformatics-softwaretools</link>
	<title><![CDATA[List of popular bioinformatics software/tools]]></title>
	<description><![CDATA[<p>In the current genome era, our day-to-day work involves handling huge genome sequences, expression data, and several other datasets. This link provides a comprehensive list of commonly used software/tools.</p><p>Address of the bookmark: <a href="http://samtools.sourceforge.net/swlist.shtml" rel="nofollow">http://samtools.sourceforge.net/swlist.shtml</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42974/list-of-bioinformatics-packages-for-ngs-analysis</guid>
	<pubDate>Sat, 20 Mar 2021 00:28:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42974/list-of-bioinformatics-packages-for-ngs-analysis</link>
	<title><![CDATA[List of bioinformatics packages for NGS analysis !]]></title>
	<description><![CDATA[<p>Package suites gather software packages and installation tools for specific languages or platforms. We have some for bioinformatics software.</p><ul>
<li><a href="https://github.com/Bioconductor">Bioconductor</a>&nbsp;&ndash; A plethora of tools for analysis and comprehension of high-throughput genomic data, including 1500+ software packages. [&nbsp;<a href="https://link.springer.com/article/10.1186/gb-2004-5-10-r80">paper-2004</a>&nbsp;|&nbsp;<a href="https://www.bioconductor.org/">web</a>&nbsp;]</li>
<li><a href="https://github.com/biopython/biopython">Biopython</a>&nbsp;&ndash; Freely available tools for biological computing in Python, with included cookbook, packaging and thorough documentation. Part of the&nbsp;<a href="http://open-bio.org/">Open Bioinformatics Foundation</a>. Contains the very useful&nbsp;<a href="https://biopython.org/DIST/docs/api/Bio.Entrez-module.html">Entrez</a>&nbsp;package for API access to the NCBI databases. [&nbsp;<a href="https://pubmed.ncbi.nlm.nih.gov/19304878">paper-2009</a>&nbsp;|&nbsp;<a href="https://biopython.org/">web</a>&nbsp;]</li>
<li><a href="https://github.com/bioconda">Bioconda</a>&nbsp;&ndash; A channel for the&nbsp;<a href="http://conda.pydata.org/docs/intro.html">conda package manager</a>&nbsp;specializing in bioinformatics software. Includes a repository with 3000+ ready-to-install (with&nbsp;<code>conda install</code>) bioinformatics packages. [&nbsp;<a href="https://pubmed.ncbi.nlm.nih.gov/29967506">paper-2018</a>&nbsp;|&nbsp;<a href="https://bioconda.github.io/">web</a>&nbsp;]</li>
<li><a href="https://github.com/BioJulia">BioJulia</a>&nbsp;&ndash; Bioinformatics and computational biology infrastructure for the Julia programming language. [&nbsp;<a href="https://biojulia.net/">web</a>&nbsp;]</li>
<li><a href="https://github.com/rust-bio/rust-bio">Rust-Bio</a>&nbsp;&ndash; Rust implementations of algorithms and data structures useful for bioinformatics. [&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/early/2015/10/06/bioinformatics.btv573.short?rss=1">paper-2016</a>&nbsp;]</li>
<li><a href="https://github.com/seqan/seqan3">SeqAn</a>&nbsp;&ndash; The modern C++ library for sequence analysis.</li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/1212/computational-proteomics-lets-remember-the-basics</guid>
	<pubDate>Thu, 01 Aug 2013 17:24:20 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/1212/computational-proteomics-lets-remember-the-basics</link>
	<title><![CDATA[Computational Proteomics : Lets remember the basics]]></title>
	<description><![CDATA[<p>I spend some of my valuable time in the computational drug design sector. I remember my early proteomics days, playing with interactive protein visualization software and dreaming big. Fortunately or unfortunately, I switched to genomics and now handle genomic floods measured in petabytes, which are expected to reach brontobytes in the coming years. Did I mention brontobytes??? Let me call my server personnel &hellip; it's gonna be a tsunami!!!!!</p>
<p>Today, refreshing my old memories, I decided to blog about basic biochemistry knowledge and computational proteomics skills, but after I found several articles on the internet saying exactly what I had wanted to say, I thought I might as well just redirect BOL's blog readers there instead.</p>
<p>Here is a list of websites and video links that provide good resources for your basic chemistry needs:</p>
<p><a href="http://tecreativ.blogspot.co.uk/2012/09/funny-shortcut-remember-periodic-table.html">http://tecreativ.blogspot.co.uk/2012/09/funny-shortcut-remember-periodic-table.html</a></p>
<p>This blog uses some specific Hindi words to help you remember the entire periodic table. I really like</p>
<p>Group 14 (C Si Ge Sn Pb) -&gt; Sentence &ldquo;<strong>C</strong>hemistry&nbsp;<strong>Si</strong>r&nbsp;<strong>G</strong>iv<strong>e</strong>s&nbsp;<strong>S</strong>a<strong>n</strong>ki&nbsp;<strong>P</strong>ro<strong>b</strong>lems&rdquo;</p>
<p>Sanki is a Hindi word that means crazy :P</p>
<p>I found this link useful as well:&nbsp;<a href="http://www.wikihow.com/Memorise-the-Periodic-Table">http://www.wikihow.com/Memorise-the-Periodic-Table</a></p>
<p>The Eagle Genomics group provides the elements of bioinformatics in a periodic table. Yes, you got it: this is not a chemical periodic table but rather bioinformatics tools laid out in periodic-table style.</p>
<p><a href="http://elements.eaglegenomics.com/">http://elements.eaglegenomics.com/</a></p>
<p>You can also try these video links, which give you an overview of the periodic table along with some tricks:</p>
<p><a href="http://www.youtube.com/watch?v=fLSfgNxoVGk">http://www.youtube.com/watch?v=fLSfgNxoVGk</a></p>
<p><a href="http://www.youtube.com/user/periodicvideos">http://www.youtube.com/user/periodicvideos</a></p>
<p>For drug design educational material, software, tools, databases, viewers, file formats and much more in one place, see&nbsp;<a href="http://www.allfordrugs.com/drug-design/">http://www.allfordrugs.com/drug-design/</a>.&nbsp;I highly recommend that all computational drug designers bookmark this page for future study as well.</p>
<p>I just remembered one of my mini projects, in which I used my Flash knowledge (Flash .. oh ya, Flash) to explain amino acids in an interactive and user-friendly manner. I can&rsquo;t provide it right now, but I promise to provide a link in the near future. I hope that you will enjoy my flashy creative skills :).</p>
<p>Moreover, I found some very interesting tricks for remembering the chemical formulae of all the amino acids on YouTube at</p>
<p><a href="http://www.youtube.com/watch?v=gqrWb0fmzQ&amp;list=PL6132651E70BB5575">http://www.youtube.com/watch?v=gqrWb0fmzQ&amp;list=PL6132651E70BB5575</a></p>
<p><a href="http://www.youtube.com/watch?v=C2GfoGXfySQ&amp;list=PL6132651E70BB5575">http://www.youtube.com/watch?v=C2GfoGXfySQ&amp;list=PL6132651E70BB5575</a></p>
<p><br />Key points for computer-aided drug designers:<br />1. A shortage of biochemistry skills means that you will get absolutely nowhere in understanding the key concepts and doing research.<br />2. Get comfortable with the complex mathematical formulae before merely running tools or software.<br />3. Dig better and deeper, guys .. design it.</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/23121/senior-sas-programmer-urgent-role-permanant-welwyn-garden-city-uk</guid>
  <pubDate>Fri, 03 Jul 2015 08:14:23 -0500</pubDate>
  <link></link>
  <title><![CDATA[Senior SAS Programmer - URGENT ROLE - Permanent - Welwyn Garden City - UK]]></title>
  <description><![CDATA[
<p>SAS Programmer URGENTLY required!! My client is looking for an experienced Senior SAS Programmer to join their bubbly, dynamic team in Welwyn Garden City. You must have experience with SAS and/or the R programming language. I am looking for someone with a background in Life Sciences, Statistics, Computer Science, Bioinformatics, etc., who has leadership qualities and excellent analyst skills. Please call Dareen Evans on 01772 278050 or email your CV to dareen.evans@itworkshealth.co.uk</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/2780/life-of-bi</guid>
	<pubDate>Thu, 22 Aug 2013 16:13:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/2780/life-of-bi</link>
	<title><![CDATA[Life of BI !!!]]></title>
	<description><![CDATA[<p>Hmm .. Don't worry you read it right .. this is not pi but bi ... "life of Bioinformatician(BI)".&nbsp;</p><p><span>Disclaimer:</span>&nbsp;This cartoon is solely designed to create humour and fun, not to offend any PI, supervisor or student.</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/2780" length="63826" type="image/jpeg" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34552/edit-distance-application-in-bioinformatics</guid>
	<pubDate>Thu, 07 Dec 2017 08:46:51 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34552/edit-distance-application-in-bioinformatics</link>
	<title><![CDATA[Edit distance application in bioinformatics !]]></title>
	<description><![CDATA[<p>There are other popular measures of&nbsp;<a href="https://en.wikipedia.org/wiki/Edit_distance" title="Edit distance">edit distance</a>, which are calculated using a different set of allowable edit operations. For instance,</p><ul>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance" title="Damerau&ndash;Levenshtein distance">Damerau&ndash;Levenshtein distance</a>&nbsp;allows insertion, deletion, substitution, and the&nbsp;<a href="https://en.wikipedia.org/wiki/Transposition_(mathematics)" title="Transposition (mathematics)">transposition</a>&nbsp;of two adjacent characters;</li>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Longest_common_subsequence_problem" title="Longest common subsequence problem">longest common subsequence</a>&nbsp;(LCS) distance allows only insertion and deletion, not substitution;</li>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Hamming_distance" title="Hamming distance">Hamming distance</a>&nbsp;allows only substitution, hence, it only applies to strings of the same length.</li>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Jaro_distance" title="Jaro distance">Jaro distance</a>&nbsp;allows only&nbsp;<a href="https://en.wikipedia.org/wiki/Transposition_(mathematics)" title="Transposition (mathematics)">transposition</a>.</li>
</ul><p>&nbsp;</p><pre><span>use</span> Text<span>::</span>Levenshtein <span>qw</span><span>(</span>distance<span>);</span>

 <span>print</span> <span>distance</span><span>(</span><span>"foo"</span><span>,</span><span>"four"</span><span>);</span>
 <span># prints "2"</span>

 <span>my</span> <span>@words</span>     <span>=</span> <span>qw</span><span>/ four foo bar /</span><span>;</span>
 <span>my</span> <span>@distances</span> <span>=</span> <span>distance</span><span>(</span><span>"foo"</span><span>,</span><span>@words</span><span>);</span>

 <span>print</span> <span>"@distances"</span><span>;</span>
 <span># prints "2 0 3"</span></pre><pre><span>use</span> Algorithm<span>::</span>LCSS <span>qw</span><span>(</span> LCSS CSS CSS_Sorted <span>);</span>
    <span>my</span> <span>$lcss_ary_ref</span> <span>=</span> <span>LCSS</span><span>(</span> <span>\</span><span>@SEQ1</span><span>,</span> <span>\</span><span>@SEQ2</span> <span>);</span>  <span># ref to array</span>
    <span>my</span> <span>$lcss_string</span>  <span>=</span> <span>LCSS</span><span>(</span> <span>$STR1</span><span>,</span> <span>$STR2</span> <span>);</span>    <span># string</span>
    <span>my</span> <span>$css_ary_ref</span> <span>=</span> <span>CSS</span><span>(</span> <span>\</span><span>@SEQ1</span><span>,</span> <span>\</span><span>@SEQ2</span> <span>);</span>    <span># ref to array of arrays</span>
    <span>my</span> <span>$css_str_ref</span> <span>=</span> <span>CSS</span><span>(</span> <span>$STR1</span><span>,</span> <span>$STR2</span> <span>);</span>      <span># ref to array of strings</span>
    <span>my</span> <span>$css_ary_ref</span> <span>=</span> <span>CSS_Sorted</span><span>(</span> <span>\</span><span>@SEQ1</span><span>,</span> <span>\</span><span>@SEQ2</span> <span>);</span>  <span># ref to array of arrays</span>
    <span>my</span> <span>$css_str_ref</span> <span>=</span> <span>CSS_Sorted</span><span>(</span> <span>$STR1</span><span>,</span> <span>$STR2</span> <span>);</span>    <span># ref to array of strings</span></pre><p>There are many different modules on CPAN for calculating the edit distance between two strings. Here's just a selection.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshteinXS">Text::LevenshteinXS</a>&nbsp;and&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshtein%3A%3AXS">Text::Levenshtein::XS</a>&nbsp;are both versions of the Levenshtein algorithm that require a C compiler, but will be a lot faster than the pure-Perl Text::Levenshtein module used above.</p><p>The Damerau-Levenshtein edit distance is like the Levenshtein distance, but in addition to insertion, deletion and substitution, it also considers the transposition of two adjacent characters to be a single edit. The module&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshtein%3A%3ADamerau">Text::Levenshtein::Damerau</a>&nbsp;defaults to using a pure-Perl implementation, but if you've installed&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshtein%3A%3ADamerau%3A%3AXS">Text::Levenshtein::Damerau::XS</a>&nbsp;then it will be a lot quicker.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3AWagnerFischer">Text::WagnerFischer</a>&nbsp;is an implementation of the Wagner-Fischer edit distance, which is similar to the Levenshtein distance, but applies different weights to each edit type.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3ABrew">Text::Brew</a>&nbsp;is an implementation of the Brew edit distance, which is another algorithm based on edit weights.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3AFuzzy">Text::Fuzzy</a>&nbsp;provides a number of operations for partial or fuzzy matching of text based on edit distance.&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3AFuzzy%3A%3APP">Text::Fuzzy::PP</a>&nbsp;is a pure-Perl implementation of the same interface.</p><p><a href="http://search.cpan.org/perldoc?String%3A%3ASimilarity">String::Similarity</a>&nbsp;takes two strings and returns a value between 0 (meaning entirely different) and 1 (meaning identical). It is apparently based on edit distance.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3ADice">Text::Dice</a>&nbsp;calculates&nbsp;<a href="https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient">Dice's coefficient</a>&nbsp;for two strings. This formula was originally developed to measure the similarity of two different populations in ecological research.</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/2041/uk-genome-science-meeting-sept-2nd-4th-2013</guid>
	<pubDate>Mon, 12 Aug 2013 12:03:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/2041/uk-genome-science-meeting-sept-2nd-4th-2013</link>
	<title><![CDATA[UK Genome Science Meeting, sept 2nd-4th, 2013]]></title>
	<description><![CDATA[<p>Following the success of the last three years' UK Next Gen Sequencing meetings at Nottingham, the newly named UK Genome Science meeting aims to bring together experts from around the world to meet and discuss the current and future state of all aspects and applications of Next Generation Sequencing.</p><p>More at &gt;&gt;&nbsp;<a href="http://www.nottingham.ac.uk/deepseq/events.aspx">http://www.nottingham.ac.uk/deepseq/events.aspx</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</guid>
	<pubDate>Tue, 06 Feb 2018 14:54:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</link>
	<title><![CDATA[Awk for Bioinformatician and computational biologist]]></title>
	<description><![CDATA[<p>Awk is a programming language that allows easy manipulation of structured data and is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions. The basic syntax is:</p><blockquote><p><br />awk '/pattern1/ {Actions}<br /> /pattern2/ {Actions}' file</p></blockquote><p>Awk works as follows:</p><ul>
<li>Awk reads the input files one line at a time.</li>
<li>Each line is tested against the given patterns in order; whenever a pattern matches, the corresponding action is performed.</li>
<li>If no pattern matches, no action is performed.</li>
<li>In the syntax above, either the search pattern or the action is optional, but not both.</li>
<li>If the search pattern is omitted, Awk performs the given actions for every line of the input.</li>
<li>If the action is omitted, Awk prints every line that matches the given pattern, which is the default action.</li>
<li>Empty braces without any action do nothing; they do not trigger the default printing.</li>
<li>Statements within Actions should be separated by semicolons.</li>
</ul><p>Say you have data/test.tsv with the following contents:</p><p><br />$ cat data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT</p><p>A print action with no pattern prints every line of the file:</p><p><br />$ awk '{print;}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT</p><p>We print the line that matches the pattern contig3:</p><p><br />$ awk '/contig3/' data/test.tsv<br />contig3 ACTTATATATATATA</p><p>Awk has a number of built-in variables. For each record (i.e. line), it splits the record on whitespace by default and stores the fields in the $n variables. 
If the line has 5 words, they are stored in $1, $2, $3, $4 and $5. $0 represents the whole line. NF is a built-in variable that holds the total number of fields in a record, so $NF refers to the last field.</p><p><br />$ awk '{print $1","$2;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p>$ awk '{print $1","$NF;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p><br />Awk has two important special patterns, specified by the keywords BEGIN and END. The syntax is as follows:</p><blockquote><p>BEGIN { Actions before reading the file }<br />{ Actions for every line in the file }<br />END { Actions after reading the file }</p></blockquote><p><br />For example,<br />$ awk 'BEGIN{print "Header,Sequence"}{print $1","$2;}END{print "-------"}' data/test.tsv<br />Header,Sequence<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT<br />-------</p><p>We can also use the conditional operator in a print statement, in the form print CONDITION ? PRINT_IF_TRUE_TEXT : PRINT_IF_FALSE_TEXT. For example, in the code below, we identify sequences with lengths &gt; 14:</p><p>$ awk '{print (length($2)&gt;14) ? 
$0"&gt;14" : $0"&lt;=14";}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG&gt;14<br />contig2 ACTTTATATATT&lt;=14<br />contig3 ACTTATATATATATA&gt;14<br />contig4 ACTTATATATATATA&gt;14<br />contig5 ACTTTATATATT&lt;=14<br />We can also use 1 after the last block {} to print everything (1 is a shorthand notation for {print $0} which becomes {print} as without any argument print will print $0 by default), and within this block, we can change $0, for example to assign the first field to $0 for third line (NR==3), we can use:</p><p>$ awk 'NR==3{$0=$1}1' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT<br />You can have as many blocks as you want and they will be executed on each line in the order they appear, for example, if we want to print $1 three times (here we are using printf instead of print as the former doesn't put end-of-line character),</p><p>$ awk '{printf $1"\t"}{printf $1"\t"}{print $1}' data/test.tsv<br />contig1 contig1 contig1<br />contig2 contig2 contig2<br />contig3 contig3 contig3<br />contig4 contig4 contig4<br />contig5 contig5 contig5 <br />Although, we can also skip executing later blocks for a given line by using next keyword:</p><p>$ awk '{printf $1"\t"}NR==3{print "";next}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2<br />contig3 <br />contig4 contig4<br />contig5 contig5</p><p>$ awk 'NR==3{print "";next}{printf $1"\t"}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2</p><p>contig4 contig4<br />contig5 contig5<br />You can also use getline to load the contents of another file in addition to the one you are reading, for example, in the statement given below, the while loop will load each line from test.tsv into k until no more lines are to be read:</p><p>$ awk 'BEGIN{while((getline k &lt;"data/test.tsv")&gt;0) print "BEGIN:"k}{print}' data/test.tsv<br />BEGIN:contig1 
ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />BEGIN:contig2 ACTTTATATATT<br />BEGIN:contig3 ACTTATATATATATA<br />BEGIN:contig4 ACTTATATATATATA<br />BEGIN:contig5 ACTTTATATATT<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT</p><p>You can also store data in memory as an associative array with the syntax VARIABLE_NAME[KEY]=VALUE, which you can later iterate over with the for (INDEX in VARIABLE_NAME) command (the iteration order is not guaranteed):</p><p>$ awk '{i[$1]=1}END{for (j in i) print j"&lt;="i[j]}' data/test.tsv<br />contig1&lt;=1<br />contig2&lt;=1<br />contig3&lt;=1<br />contig4&lt;=1<br />contig5&lt;=1</p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/4003/personalised-medicine-animation</guid>
	<pubDate>Tue, 27 Aug 2013 10:07:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/4003/personalised-medicine-animation</link>
	<title><![CDATA[Personalised Medicine - Animation]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/fEY3Khsmuak" frameborder="0" allowfullscreen></iframe>Two animated case scenarios set now and in the future. These highlight potential differences in the way patients are treated now, and how they might be treated as healthcare becomes more tailored.]]></description>
	
</item>

</channel>
</rss>