<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/2573?</link>
	<atom:link href="https://bioinformaticsonline.com/related/2573?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	
<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/43292/bioinformatics-scientist-production-bioinformatics-south-san-francisco-ca</guid>
  <pubDate>Thu, 19 Aug 2021 08:45:24 -0500</pubDate>
  <link></link>
  <title><![CDATA[Bioinformatics Scientist, Production Bioinformatics @ South San Francisco, CA]]></title>
  <description><![CDATA[
<p>Twist is looking for a Bioinformatics Scientist to join our Production Bioinformatics Team. You will work alongside research scientists, software engineers and data scientists to further deliver on our mission to expand access to best-in-class synthetic biology and next-generation sequencing applications. You will develop and engineer tools to better evaluate and build hardened, production-quality pipelines, optimize data quality, and automate lab and bioinformatics processes. Our ideal candidate is an organized problem solver with a background in developing and building novel production-quality bioinformatics tools and packages. Excellent communication skills and a proven ability to work independently are equally required.</p>

<p>More at https://boards.greenhouse.io/twistbioscience/jobs/3135495?gh_src=9ecc0b941us</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/39606/amity-university-bioinformatics-summer-program-kolkata</guid>
	<pubDate>Tue, 11 Jun 2019 21:27:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/39606/amity-university-bioinformatics-summer-program-kolkata</link>
	<title><![CDATA[Amity University Bioinformatics Summer Program - Kolkata]]></title>
	<description><![CDATA[<p>Registrations are now open for the 2019 Summer Bioinformatics Training program at Amity University, Kolkata. The program will focus on introductory topics for life science students. We will review important history, topics and challenges bioinformatics can help address in the context of basic research, discovery and industry.</p><p>Read more: https://edu.t-bio.info/amity-university-summer-bioinformatics-program-registrations-are-open/</p>]]></description>
	<dc:creator>eliabrodsky</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</guid>
	<pubDate>Tue, 06 Feb 2018 14:54:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</link>
	<title><![CDATA[Awk for Bioinformaticians and Computational Biologists]]></title>
	<description><![CDATA[<p>Awk is a programming language that allows easy manipulation of structured data and is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions. The basic syntax is:</p><blockquote><p><br />awk '/pattern1/ {Actions}<br /> /pattern2/ {Actions}' file</p></blockquote><p><br />Awk works as follows:<br />Awk reads the input files one line at a time.<br />For each line, it tests the patterns in the given order and, for each pattern that matches, performs the corresponding action.<br />If no pattern matches, no action is performed.<br />In the above syntax, either the search pattern or the action is optional, but not both.<br />If the search pattern is not given, Awk performs the given actions for each line of the input.<br />If the action is not given, Awk prints every line that matches the given patterns, which is the default action.<br />Empty braces without any action do nothing; they do not perform the default printing operation.<br />Statements in Actions should be separated by semicolons.<br />Say you have data/test.tsv with the following contents:</p><p><br />$ cat data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />With a bare print action and no pattern, Awk prints every line of the file.</p><p><br />$ awk '{print;}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />To print only the lines that match the pattern contig3:</p><p><br />$ awk '/contig3/' data/test.tsv<br />contig3 ACTTATATATATATA<br />Awk has a number of built-in variables. For each record (that is, each line), it splits the record on whitespace by default and stores the fields in the $n variables. 
If the line has 5 words, they are stored in $1, $2, $3, $4 and $5. $0 represents the whole line. NF is a built-in variable that holds the total number of fields in a record, so $NF refers to the last field.</p><p><br />$ awk '{print $1","$2;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p>$ awk '{print $1","$NF;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p><br />Awk has two important special patterns, specified by the keywords BEGIN and END. The syntax is as follows:</p><blockquote><p>BEGIN { Actions before reading the file}<br />{Actions for every line in the file} <br />END { Actions after reading the file }</p></blockquote><p><br />For example,<br />$ awk 'BEGIN{print "Header,Sequence"}{print $1","$2;}END{print "-------"}' data/test.tsv<br />Header,Sequence<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT<br />------- <br />We can also use the conditional operator in a print statement, in the form print CONDITION ? PRINT_IF_TRUE_TEXT : PRINT_IF_FALSE_TEXT. For example, the code below flags sequences with lengths &gt; 14:</p><p>$ awk '{print (length($2)&gt;14) ? 
$0"&gt;14" : $0"&lt;=14";}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG&gt;14<br />contig2 ACTTTATATATT&lt;=14<br />contig3 ACTTATATATATATA&gt;14<br />contig4 ACTTATATATATATA&gt;14<br />contig5 ACTTTATATATT&lt;=14<br />We can also use 1 after the last block {} to print everything (1 is a pattern that is always true and, with no action, triggers the default action {print $0}, which can be shortened to {print} because print without arguments prints $0). Within an earlier block we can change $0; for example, to assign the first field to $0 for the third line (NR==3), we can use:</p><p>$ awk 'NR==3{$0=$1}1' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT<br />You can have as many blocks as you want, and they are executed on each line in the order they appear. For example, to print $1 three times (here we use printf instead of print because printf does not append an end-of-line character):</p><p>$ awk '{printf "%s\t", $1}{printf "%s\t", $1}{print $1}' data/test.tsv<br />contig1 contig1 contig1<br />contig2 contig2 contig2<br />contig3 contig3 contig3<br />contig4 contig4 contig4<br />contig5 contig5 contig5 <br />We can also skip the later blocks for a given line by using the next keyword:</p><p>$ awk '{printf "%s\t", $1}NR==3{print "";next}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2<br />contig3 <br />contig4 contig4<br />contig5 contig5</p><p>$ awk 'NR==3{print "";next}{printf "%s\t", $1}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2</p><p>contig4 contig4<br />contig5 contig5<br />You can also use getline to load the contents of another file in addition to the one you are reading. For example, in the statement below, the while loop loads each line from data/test.tsv into k until there are no more lines to read:</p><p>$ awk 'BEGIN{while((getline k &lt;"data/test.tsv")&gt;0) print "BEGIN:"k}{print}' data/test.tsv<br />BEGIN:contig1 
ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />BEGIN:contig2 ACTTTATATATT<br />BEGIN:contig3 ACTTATATATATATA<br />BEGIN:contig4 ACTTATATATATATA<br />BEGIN:contig5 ACTTTATATATT<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />You can also store data in memory in an associative array with the syntax VARIABLE_NAME[KEY]=VALUE, which you can later traverse with the for (INDEX in VARIABLE_NAME) statement (the iteration order is not guaranteed):</p><p>$ awk '{i[$1]=1}END{for (j in i) print j"&lt;="i[j]}' data/test.tsv<br />contig1&lt;=1<br />contig2&lt;=1<br />contig3&lt;=1<br />contig4&lt;=1<br />contig5&lt;=1</p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/41905/research-associate-bioinformatics-in-iisc-recruitment-2020</guid>
  <pubDate>Tue, 23 Jun 2020 21:53:34 -0500</pubDate>
  <link></link>
  <title><![CDATA[Research Associate Bioinformatics in IISc Recruitment 2020]]></title>
  <description><![CDATA[
<p>Research Associate Bioinformatics in IISc Recruitment 2020</p>

<p>Essential Qualifications: Ph.D. (Bioinformatics/ Biophysics/ Biotechnology or any other stream of biological/ physical sciences) with a minimum of two publications in reputed peer reviewed journals in the area of structural bioinformatics or biophysics or biomolecular modeling/ simulation.</p>

<p>Job description: Development of bioinformatics tools and algorithms/software for structure-based analysis of biomolecular systems. Programmatic access to major biomolecular databases using APIs. Knowledge-based prediction and analysis of biomolecular structure, function and interactions. Docking/simulations for inhibitor design.</p>

<p>Desirable Qualifications (Research Associate/s): i) Strong computer programming skills (in Python/Perl/PHP or C++, or object-oriented database management systems like MySQL, etc., or scripting languages in a Linux/UNIX environment).</p>

<p>ii) Extensive experience in computational analysis of biomolecular structure/interactions and usage of advanced biomolecular simulation software. iii) Adequate knowledge of major databases, web servers and software in the area of biomolecular structure/function and drug design. iv) Familiarity with parallel programming environments and experience in the use of high-end HPC clusters.</p>

<p>Candidates must highlight their experience in the above-mentioned fields/topics in their CV. The initial appointment will be for a period of 1 year, subject to extension after a performance review.</p>

<p>Emoluments: As per DST, GOI norms and commensurate with experience.</p>

<p>More at https://www.iisc.ac.in/positions-open/</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/43272/bioinformatics-head-bioinformatics-manager-iii-cancer-genomics-research-laboratory-at-frederick-national-laboratory</guid>
  <pubDate>Wed, 18 Aug 2021 00:19:48 -0500</pubDate>
  <link></link>
  <title><![CDATA[Bioinformatics Head (Bioinformatics Manager III), Cancer Genomics Research Laboratory at Frederick National Laboratory]]></title>
  <description><![CDATA[
<p>Frederick National Laboratory is seeking an enthusiastic, creative, and seasoned bioinformatics professional to join our leadership team and direct the exceptional Bioinformatics Group at the Cancer Genomics Research Laboratory (CGR). CGR has a diverse team of bioinformatics and computational scientists who support all areas of bioinformatics and data analysis (infrastructure, data QC, pipeline development and maintenance, data curation and sharing, methodology development, statistical analyses, machine learning approaches, and scientific interpretation).</p>

<p>More at https://leidosbiomed.csod.com/ats/careersite/jobdetails.aspx?site=4&amp;c=leidosbiomed&amp;id=2040</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27463/bpipe-a-tool-for-running-and-managing-bioinformatics-pipelines</guid>
	<pubDate>Sat, 21 May 2016 22:42:16 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27463/bpipe-a-tool-for-running-and-managing-bioinformatics-pipelines</link>
	<title><![CDATA[Bpipe - a tool for running and managing bioinformatics pipelines]]></title>
	<description><![CDATA[<p>Bpipe provides a platform for running big bioinformatics jobs that consist of a series of processing stages - known as 'pipelines'.</p>
<ul>
<li>January 20th, 2016 - New! Bpipe 0.9.9 released!</li>
<li>Download <a href="http://download.bpipe.org/versions/bpipe-0.9.9.tar.gz">latest</a>, <a href="http://download.bpipe.org">all</a></li>
<li><a href="http://docs.bpipe.org">Documentation</a></li>
<li><a href="https://groups.google.com/forum/#%21forum/bpipe-discuss">Mailing List</a> (Google Group)</li>
</ul>
<p>Bpipe has been published in <a href="http://bioinformatics.oxfordjournals.org/content/early/2012/04/11/bioinformatics.bts167.abstract">Bioinformatics</a>! If you use Bpipe, please cite:</p>
<p><em>Sadedin S, Pope B &amp; Oshlack A, Bpipe: A Tool for Running and Managing Bioinformatics Pipelines, Bioinformatics</em></p><p>Address of the bookmark: <a href="http://docs.bpipe.org/" rel="nofollow">http://docs.bpipe.org/</a></p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35144/converting-fastq-to-fasta</guid>
	<pubDate>Fri, 12 Jan 2018 03:49:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35144/converting-fastq-to-fasta</link>
	<title><![CDATA[Converting FASTQ to FASTA]]></title>
	<description><![CDATA[<div id="block-system-main"><div><div><div><div><div><div><p>There are several ways to convert FASTQ sequences to FASTA. Some methods are listed below.</p><h3>Using SED</h3><p><span><code><span>sed</span></code></span>&nbsp;can be used to selectively print the desired lines from a file. If you print the first and second lines of every four (replacing the leading @ of the header with &gt;), you get the header and sequence needed for FASTA format. The first~step address syntax used here is a GNU sed extension.</p><pre>sed -n '1~4s/^@/&gt;/p;2~4p' INFILE.fastq &gt; OUTFILE.fasta
</pre><h3>Using PASTE</h3><p>You can linearize every 4 lines into a single tab-separated row with&nbsp;<span><code>paste</code></span>, keep the first and second fields, and then split them back onto separate lines.</p><pre>cat INFILE.fastq | paste - - - - | cut -f 1,2 | sed 's/^@/&gt;/' | tr "\t" "\n" &gt; OUTFILE.fasta
</pre><h3>EMBOSS:seqret</h3><p>A general-purpose EMBOSS program that reads and writes sequences in many formats. One such use is FASTQ-to-FASTA conversion.</p><pre>seqret -sequence reads.fastq -outseq reads.fasta
</pre><h3>Using AWK</h3><p><span><code><span>awk</span></code></span>&nbsp;can be used for conversion as follows:</p><pre>cat infile.fq | awk '{if(NR%4==1) {printf("&gt;%s\n",substr($0,2));} else if(NR%4==2) print;}' &gt; file.fa
</pre><h3>FASTX-toolkit</h3><p><span><code>fastq_to_fasta</code></span>&nbsp;is available in the FASTX-toolkit and scales well to very large datasets.</p><pre>fastq_to_fasta -h
usage: fastq_to_fasta [-h] [-r] [-n] [-v] [-z] [-i INFILE] [-o OUTFILE]
# Remember to use -Q33 for illumina reads!
version 0.0.6
       [-h]         = This helpful help screen.
       [-r]         = Rename sequence identifiers to numbers.
       [-n]         = keep sequences with unknown (N) nucleotides.
                   Default is to discard such sequences.
       [-v]         = Verbose - report number of sequences.
                   If [-o] is specified,  report will be printed to STDOUT.
                   If [-o] is not specified (and output goes to STDOUT),
                   report will be printed to STDERR.
       [-z]         = Compress output with GZIP.
       [-i INFILE]  = FASTA/Q input file. default is STDIN.
       [-o OUTFILE] = FASTA output file. default is STDOUT.
</pre><h3>Bioawk</h3><p>Another option is to convert FASTQ to FASTA with&nbsp;<span><code>bioawk</code></span>:</p><pre>bioawk -c fastx '{print "&gt;"$name"\n"$seq}' input.fastq &gt; output.fasta
</pre><h3>Seqtk</h3><p>From the same developer, there is another option using a tool called&nbsp;<span><code>seqtk</code></span></p><pre>seqtk seq -a input.fastq &gt; output.fasta
</pre><p>Note that this tool accepts either compressed or uncompressed files.</p></div></div></div></div></div></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31353/concoct-clustering-contigs-with-coverage-and-composition</guid>
	<pubDate>Mon, 06 Mar 2017 04:08:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31353/concoct-clustering-contigs-with-coverage-and-composition</link>
	<title><![CDATA[CONCOCT: Clustering cONtigs with COverage and ComposiTion]]></title>
	<description><![CDATA[<p>A program for unsupervised binning of metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data from paired end reads.</p>
<p>Warning! This software is to be considered under development. Functionality and the user interface may still change significantly from one version to another. If you want to use this software, please stay up to date with the list of known issues: <a href="https://github.com/BinPro/CONCOCT/issues">https://github.com/BinPro/CONCOCT/issues</a></p><p>Address of the bookmark: <a href="https://github.com/BinPro/CONCOCT" rel="nofollow">https://github.com/BinPro/CONCOCT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/87/linux-cheat-sheet</guid>
	<pubDate>Tue, 09 Jul 2013 17:30:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/87/linux-cheat-sheet</link>
	<title><![CDATA[Linux Cheat Sheet]]></title>
	<description><![CDATA[<p><span>While looking for a good Linux reference for bioinformaticians and BOL readers, I could not find a decent one on the Internet. So, we decided to make a cheat sheet for biological programmers.</span></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/87" length="81260" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31382/seqmule-automated-human-exomegenome-variants-detection</guid>
	<pubDate>Tue, 07 Mar 2017 10:12:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31382/seqmule-automated-human-exomegenome-variants-detection</link>
	<title><![CDATA[SeqMule: Automated human exome/genome variants detection]]></title>
	<description><![CDATA[<p><span>SeqMule takes single-end or paired-end FASTQ or BAM files, generates a script that combines more than 10 popular alignment and analysis tools, and runs the script line by line. Users can change the pipeline or fine-tune the parameters by modifying its configuration file. SeqMule also has some built-in functions, such as pooling consensus calls from various callers, plotting a Venn diagram showing the intersection among different callers, and downloading databases. SeqMule can be used for both Mendelian disease studies and cancer genome studies.</span></p><p>Address of the bookmark: <a href="http://seqmule.openbioinformatics.org/en/latest/" rel="nofollow">http://seqmule.openbioinformatics.org/en/latest/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>