<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/14221?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/related/14221?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</guid>
	<pubDate>Tue, 06 Feb 2018 14:54:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</link>
	<title><![CDATA[Awk for Bioinformatician and computational biologist]]></title>
	<description><![CDATA[<p>Awk is a programming language for manipulating structured text, used mostly for pattern scanning and processing. It reads one or more files, checks whether each line matches the specified patterns, and performs the associated actions. The basic syntax is:</p><blockquote><p><br />awk '/pattern1/ {Actions}<br /> /pattern2/ {Actions}' file</p></blockquote><p><br />Awk works as follows:</p><ul><li>Awk reads the input files one line at a time.</li><li>Each line is tested against the patterns in the order given; when a pattern matches, the corresponding action is performed.</li><li>If no pattern matches, no action is performed.</li><li>Either the pattern or the action may be omitted, but not both.</li><li>If the pattern is omitted, Awk performs the action for every input line.</li><li>If the action is omitted, Awk prints every line that matches the pattern, which is the default action.</li><li>Empty braces with no action do nothing; the default print is not performed.</li><li>Statements within an action are separated by semicolons.</li></ul><p>Say you have data/test.tsv with the following contents:</p><p><br />$ cat data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />With a bare print action and no pattern, Awk prints every line of the file.</p><p><br />$ awk '{print;}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />To print only the lines that match the pattern contig3:</p><p><br />$ awk '/contig3/' data/test.tsv<br />contig3 ACTTATATATATATA<br />Awk has a number of built-in variables. For each record (that is, each line), it splits the record on whitespace by default and stores the fields in the $n variables. 
If the line has 5 words, they are stored in $1, $2, $3, $4 and $5. $0 represents the whole line. NF is a built-in variable that holds the number of fields in the record, so $NF refers to the last field.</p><p><br />$ awk '{print $1","$2;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p>$ awk '{print $1","$NF;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p><br />Awk has two special patterns, specified by the keywords BEGIN and END. The syntax is as follows:</p><blockquote><p>BEGIN { Actions before reading the file }<br />{ Actions for every line in the file }<br />END { Actions after reading the file }</p></blockquote><p><br />For example,<br />$ awk 'BEGIN{print "Header,Sequence"}{print $1","$2;}END{print "-------"}' data/test.tsv<br />Header,Sequence<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT<br />------- <br />We can also use the conditional operator in a print statement, in the form print CONDITION ? PRINT_IF_TRUE_TEXT : PRINT_IF_FALSE_TEXT. For example, the code below flags sequences with lengths &gt; 14:</p><p>$ awk '{print (length($2)&gt;14) ? $0"&gt;14" : $0"&lt;=14";}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG&gt;14<br />contig2 ACTTTATATATT&lt;=14<br />contig3 ACTTATATATATATA&gt;14<br />contig4 ACTTATATATATATA&gt;14<br />contig5 ACTTTATATATT&lt;=14<br />We can also put 1 after the last block {} to print every line: 1 is an always-true pattern with no action, so the default action {print $0} runs (print with no argument prints $0). An earlier block can change $0 before it is printed; for example, to assign the first field to $0 on the third line (NR==3), we can use:</p><p>$ awk 'NR==3{$0=$1}1' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT<br />You can have as many blocks as you want, and they are executed on each line in the order they appear. For example, to print $1 three times (here we use printf instead of print because printf does not append an end-of-line character, and we pass $1 as an argument to a "%s" format so that any % in the data is not interpreted):</p><p>$ awk '{printf "%s\t", $1}{printf "%s\t", $1}{print $1}' data/test.tsv<br />contig1 contig1 contig1<br />contig2 contig2 contig2<br />contig3 contig3 contig3<br />contig4 contig4 contig4<br />contig5 contig5 contig5 <br />We can also skip the remaining blocks for a given line with the next keyword:</p><p>$ awk '{printf "%s\t", $1}NR==3{print "";next}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2<br />contig3 <br />contig4 contig4<br />contig5 contig5</p><p>$ awk 'NR==3{print "";next}{printf "%s\t", $1}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2</p><p>contig4 contig4<br />contig5 contig5<br />You can also use getline to read another file in addition to the one being processed. In the statement below, the while loop reads each line of test.tsv into k until no lines remain:</p><p>$ awk 'BEGIN{while((getline k &lt;"data/test.tsv")&gt;0) print "BEGIN:"k}{print}' data/test.tsv<br />BEGIN:contig1 
ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />BEGIN:contig2 ACTTTATATATT<br />BEGIN:contig3 ACTTATATATATATA<br />BEGIN:contig4 ACTTATATATATATA<br />BEGIN:contig5 ACTTTATATATT<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />You can also store data in memory in an associative array with the syntax VARIABLE_NAME[KEY]=VALUE, which you can later traverse with a for (INDEX in VARIABLE_NAME) loop. Note that the traversal order is not guaranteed:</p><p>$ awk '{i[$1]=1}END{for (j in i) print j"&lt;="i[j]}' data/test.tsv<br />contig1&lt;=1<br />contig2&lt;=1<br />contig3&lt;=1<br />contig4&lt;=1<br />contig5&lt;=1</p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36603/learning-python-programming-a-bioinformatician-perspective</guid>
	<pubDate>Mon, 14 May 2018 16:33:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36603/learning-python-programming-a-bioinformatician-perspective</link>
	<title><![CDATA[Learning Python Programming - a bioinformatician perspective !]]></title>
	<description><![CDATA[<p>Python&nbsp;is a general-purpose programming language that is open source, flexible, powerful and easy to use. One of its most important features is its rich set of utilities and libraries for data processing and analytics. In the current era of big biological data, Python and Biopython are gaining popularity because their easy-to-use features support big data processing.</p><p>In this tutorial series, I will explore the features and packages of Python that are widely used in big data, NGS, and bioinformatics. I will also walk through a real biological example that shows NGS data processing with the help of Python packages and programming.</p><p>Python has a couple of points to recommend it to biologists and scientists specifically:</p><ul>
<li>It's widely used in the scientific community</li>
<li>It has a couple of very well designed libraries for doing complex scientific computing (although we won't encounter them in this book)</li>
<li>It lends itself well to being integrated with other, existing tools</li>
<li>It has features which make it easy to manipulate strings of characters (for example, strings of DNA bases and protein amino acid residues, which we as biologists are particularly fond of)</li>
</ul><p>In general, the following are some of the important features of Python that make it a good fit for rapid application development.</p><ul>
<li>Python is an interpreted language, so programs do not need a separate compilation step. The interpreter parses the program code and runs it.</li>
<li>Python is dynamically typed, so variable types are determined automatically at run time.</li>
<li>Python is strongly typed, so values are not implicitly converted between unrelated types; developers must convert them explicitly.</li>
<li>It does more with less code, which makes it widely accepted.</li>
<li>Python is portable, extendable and scalable.</li>
</ul><p>There are two major Python versions, Python 2 and Python 3, and they are quite different. This tutorial uses Python 3, because it is more semantically correct and supports newer features.</p><p>I will post tutorials daily on this page. Check the sub-pages on the right side.</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/fun/view/2383/golden-rules-of-bioinformatics</guid>
	<pubDate>Wed, 14 Aug 2013 21:11:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/fun/view/2383/golden-rules-of-bioinformatics</link>
	<title><![CDATA[Golden Rules of Bioinformatics]]></title>
	<description><![CDATA[<ol>
<li>All constants are variables.</li>
<li>Copy and paste is a genetic error.</li>
<li>First solve the problem, then write the code.</li>
<li>No matter what goes wrong, it will probably look right.</li>
<li>Any simple problem can be insoluble if enough meetings are held to discuss it. :P</li>
<li>Statistics is a systematic method of coming to the wrong conclusion with confidence.</li>
<li>A bug is an undocumented feature in programming languages.</li>
<li>A good biological programmer goes on summer holiday with a raincoat. [because see 1]</li>
<li>Thank God Google knows Python is not a python, and that multiplication and division are the same thing.</li>
<li>Don't be clever, complex biology will trick you.</li>
</ol>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/fun/view/8509/the-best-bioinformatics-computational-biology-quotes</guid>
	<pubDate>Wed, 26 Feb 2014 17:50:59 -0600</pubDate>
	<link>https://bioinformaticsonline.com/fun/view/8509/the-best-bioinformatics-computational-biology-quotes</link>
	<title><![CDATA[The Best Bioinformatics / Computational Biology Quotes]]></title>
	<description><![CDATA[<p><img src="http://bioinformaticsonline.com/mod//photo/hahaha.png" style="border: 0; border: 0px;" alt="image"></p><p>Bioinformaticians are not anti-social; we are just genome friendly.</p><p>Bioinformaticians would love to change the biological world, but they won't give us the genetic code :P</p><p>If at first you don't succeed, call it version 1.0</p><p>The glass is neither half-full nor half-empty: it actually has several genomes.</p><p>I'm BioGeek.</p><p>Fed up with LIPS, try God script.</p><p>Idiot, Go ahead, make my data!</p><p>Thank God, my genome just compiled.</p><p>Error message: "Out of space on genome drive:"</p><p>Shut up mobile elements, or I'll flush you out.</p><p>Never underestimate the internet bandwidth, u gotta incomplete.</p><p>Applied fuzzy logic to understand God's logic?</p><p>Warning! Overflow, delete chromosome !</p><p>Be nice to the BioGeek, for all you know they might be the next curator!</p><p>Beware of computational biologists; they screw genes and proteins.</p><p>Warning! Your genome is full of garbage, delete it !</p><p>Bad or missing mouse genome. Spank the cat? (Y/N)</p><p>Genomes make very fast, very accurate mistakes.</p><p>Let's BLAST it.</p><p>Some genomes never have transposons. They just develop random features.</p><p>Go watch CINEMA and have a BLAST.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/fun/view/23959/bioinformatics-jokes</guid>
	<pubDate>Fri, 21 Aug 2015 01:26:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/fun/view/23959/bioinformatics-jokes</link>
	<title><![CDATA[Bioinformatics Jokes !!]]></title>
	<description><![CDATA[<p>Why was the bioinformatician fired from his job?</p><p>A: He was getting too Sassy.</p><p>&nbsp;</p><p>What did the bioinformatician say when he found out his team had stopped using version control?</p><p>A: Y&rsquo;all better Git!</p><p>&nbsp;</p><p>Why did the computational biologist stay home from work?</p><p>A: He had a code!</p><p>&nbsp;</p><p>Why was the bioinformatician's paper rejected?</p><p>A: The journal thought it seemed scripted.</p><p>&nbsp;</p><p>How can you tell that a bioinformatician is working?</p><p>A: You can hear him Grunting!</p><p>&nbsp;</p><p>Why are bioinformaticians always silent?</p><p>A: Because bioinformaticians calmly whisper, &ldquo;SSH&rdquo;</p><p>&nbsp;</p><p>Why was the bioinformatician always so sleepy?</p><p>A: He/She wasn&rsquo;t given any Java.</p><p>&nbsp;</p><p>Why did the program/software hang?</p><p>A: Because genomes float.</p><p>&nbsp;</p><p>Why was the class upset that its parent died?</p><p>A: Because it wouldn&rsquo;t be getting the inheritance!</p><p>&nbsp;</p><p>Why do bioinformaticians always work on the command line?</p><p>A: Because they don't want to scare you with a huge amount of data!</p><p>&nbsp;</p><p>Why did the bioinformatician attend the gay pride parade?</p><p>A: They supported polymorphism.</p><p>&nbsp;</p><p>Why do bioinformaticians prefer awk and Perl one-liners?</p><p>A: Because even the computer can't handle loading the data.</p><p>&nbsp;</p><p>Why don&rsquo;t bioinformaticians get along with others?</p><p>A: They&rsquo;re too MEAN.</p><p>&nbsp;</p><p>Why are computational biologists cool?</p><p>A: Because they are scripted!!</p><p>&nbsp;</p><p>Why do they talk $ unzip; strip; touch; finger; grep; mount; fsck; more; yes; fsck; fsck; umount; clean; sleep;</p><p>A: Ah, Ohhh, dude, these are *NIX commands</p><p>&nbsp;</p><p>Did they really hack the genome?</p><p>A: Yes, I guess so.</p>]]></description>
	<dc:creator>Jitendra Prajapati</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/23892/bioinformatics-made-easy-search-bioinformatics-tools-and-run-genomic-analysis-in-the-cloud</guid>
	<pubDate>Thu, 20 Aug 2015 02:21:20 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/23892/bioinformatics-made-easy-search-bioinformatics-tools-and-run-genomic-analysis-in-the-cloud</link>
	<title><![CDATA[Bioinformatics Made Easy Search: Bioinformatics tools and run genomic analysis in the cloud]]></title>
	<description><![CDATA[<p>InsideDNA makes hundreds of bioinformatics tools immediately available to run via an easy-to-use web interface and allows an accurate search across all functions, tools and pipelines.</p>
<p>With InsideDNA, you can upload and store your own genomic/genetic datasets in a limitless cloud space, and instantly analyze it with a powerful compute instance, without any tool installation or set up hassle.</p>
<p>More at https://insidedna.me/</p><p>Address of the bookmark: <a href="https://insidedna.me/" rel="nofollow">https://insidedna.me/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/42803/bioinformatician-purdue-cancer-center</guid>
  <pubDate>Wed, 03 Feb 2021 22:54:14 -0600</pubDate>
  <link>https://bioinformaticsonline.com/opportunity/view/42803/bioinformatician-purdue-cancer-center</link>
  <title><![CDATA[Bioinformatician - Purdue Cancer Center]]></title>
  <description><![CDATA[
<p>The Center for Cancer Research is an NCI-designated cancer center. The center is a catalyst for collaborative cancer research around Purdue University. In this role, the selected individual will have the opportunity to cooperate with Purdue faculty and students in performing cutting-edge research and analyses, with opportunities for professional development, and the possibility of co-authorship in faculty research publications. <br />Projects will be challenging, including various model organisms, and we are looking for an individual who is excited about interacting with multi-disciplinary cancer research groups and the development of new tools, techniques, and workflows. Independently perform both routine and project-specific analyses, advise faculty on the design of experiments, writing manuscripts for publication, and writing grant proposals. Interact and collaborate with bioinformatics services (i.e. Statistical Consulting Center to provide relevant services to the campus research community), where applicable. Support all of the bioinformatics activities of the Center for Cancer Research at Purdue University<br />Required:</p>

<p>Master's degree in bioinformatics, computer science, molecular biology, or related field<br />One year of experience in analyzing RNA-Seq data <br />In lieu of a degree, consideration will be given to an equivalent combination of related education and required work experience.<br />Understanding of molecular biology, biochemistry, and genetics<br />Proficiency in writing scripts using Perl, Python, Java, or equivalent languages<br />Proficiency in R and UNIX/LINUX <br />Knowledge of genomics, alignment, annotation, bioinformatics, concepts of sequence assembly<br />Highly motivated and detail-oriented<br />Ability, interest, and curiosity to learn new skills<br />Must possess strong communication skills to work effectively with users across disciplines<br />Ability to work independently and as part of a multi-disciplinary team<br />Strong visual, verbal, and written communication skills<br />Excellent time organizational skills<br />Preferred:</p>

<p>Experience writing software or building software pipelines<br />Experience with oncology-specific public databases including TCGA<br />Experience with deploying and/or running software on high-performance computational systems<br />Statistical and experimental design knowledge<br />Additional Information: </p>

<p>This position is contingent on the availability of funding<br />Purdue will not sponsor employment authorization for this position  <br />A background check will be required for employment in this position<br />FLSA: Exempt (Not Eligible For Overtime)<br />Retirement Eligibility: Defined Contribution Waiting Period <br />Purdue University is an EOE/AA employer. All individuals, including minorities, women, individuals with disabilities, and veterans are encouraged to apply</p>

<p>More at https://careers.purdue.edu/job/West-Lafayette-Bioinformatician-Purdue-Cancer-Center-IN-47906/686617600/</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/23119/senior-statistician-manchester-or-belfast-uk</guid>
  <pubDate>Fri, 03 Jul 2015 08:06:04 -0500</pubDate>
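  <link>https://bioinformaticsonline.com/opportunity/view/23119/senior-statistician-manchester-or-belfast-uk</link>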
  <title><![CDATA[Senior Statistician - Manchester or Belfast UK]]></title>
  <description><![CDATA[
<p>The Role</p>

<p>My client provides innovative biomarker discovery and development services to the pharmaceutical industry. They partner with the pharmaceutical industry to develop and implement biomarker strategies, providing a full range of biomarker services from pre-clinical biomarker discovery and assay development right through to the delivery of clinical tests in their CLIA lab.</p>

<p>As a Senior Statistician you would support this effort and be responsible for managing the technical experimental study design and data handling processes required for the discovery, development and commercial delivery of multiplex clinical diagnostic assays. You will:</p>

<p>Develop analytical experimental designs for multiplex clinical diagnostic assays in accordance with regulatory requirements (e.g. CLIA, FDA)<br />Lead and coordinate the evaluation of analytical studies including characterization, verification, and validation studies<br />Lead specification setting and specification alterations<br />Ensure DOE methodology is routinely used in analytical studies.<br />Work with the Operations Department to ensure robust, reproducible and precise assay development<br />Provide expertise of general aspects for Statistical Process Control<br />Provide statistical expertise for R&amp;D, Quality, and Manufacturing<br />You will work in a fast-paced, project orientated environment and the ability to plan and execute objectives under tight timelines is a must. This is a unique opportunity suited for a qualified statistician with an interest in working to deliver first class data analysis support and solutions in a clinical setting.</p>

<p>Requirements</p>

<p>MSc or PhD in statistics or a related discipline<br />In-depth knowledge of DOE methods to analytically validate, monitor and troubleshoot multiplex clinical diagnostic assays, ideally in a commercial/industrial setting<br />Experience in statistical technology evaluation, independent and dependent data analysis, medical diagnostic accuracy, statistical graphics and reproducible reporting<br />Excellent interpersonal and communication skills (including written and spoken English)<br />Ability to independently manage multiple projects and to deliver results on time per project deadlines<br />Proficient programming and analysis skills in one or more statistical packages (e.g. R, Stata, SAS)<br />The following skills, while not mandatory, are highly desirable:</p>

<p>Development and validation of predictive models<br />Experience of clinical epidemiology, survival analysis, biomarker research, Bayesian methods, quantifying predictive accuracy.<br />Knowledge of regulatory standards for CLIA and/or FDA IVD tests<br />Reward</p>

<p>An attractive remuneration package will reflect the importance of this role and will include 6.8 weeks annual leave (pro rata, including fixed closure days), company pension scheme, enhanced sick pay and maternity entitlements, healthcare plan and opportunities for learning and development, as well as access to a company restaurant and parking facilities</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/fun/view/9207/biogeek-fun</guid>
	<pubDate>Sun, 16 Mar 2014 06:33:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/fun/view/9207/biogeek-fun</link>
	<title><![CDATA[BioGeek Fun]]></title>
	<description><![CDATA[<p>1. A futuristic computational biology student was told to write "It is in my gene!!!" on the board 100 times as a punishment. Here's his response:<br /><br />use warnings;<br />for ($count=1; $count &lt;=100; $count++) { print "It is in my gene!!!";}<br /><br />I guess he is going to be a real biogeek. Nice try though. Smart kid.</p><p>&nbsp;</p><p>2. In a Perl script I found this: <br />&nbsp;. . . . . .<br />&nbsp;. . . . . .<br /># It works for me; only God understands how it works<br />while (/(&lt;\/[^&gt;]+&gt;)|(&lt;[^&gt;]+&gt;)|(&lt;[^&gt;]+&gt;)$|([^&gt;&lt;]+)/go) {<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $startGene=$1;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $beginChromosome=$2;<br />&nbsp;&nbsp; &nbsp;<br />. . . . . .<br />&nbsp;.. . . . . .<br />}</p><p>&nbsp;</p><p>3. One more interesting message found in Perl&hellip; It must tickle your funny bone :) <br />open(my $fh, "&lt;", "gene.txt")&nbsp;&nbsp; &nbsp;or kill " Me if you think this is a mistake :$!";<br /><br /></p><p>&nbsp;</p><p>4. From the Perl <br /><br />&nbsp; while () {&nbsp; # "The Mothership Connection is here!"<br />&nbsp;&nbsp; &nbsp;print "$_\n"; # Printing the offspring :)</p><p>&nbsp;</p><p>5. Perl message<br />if ($1) { print "Just found the error in chromosome !!!, yahoo&hellip;"; } else { print "That is not error, but mutation you moron!"; }</p><p>&nbsp;</p><p>6. A genome database curator walked into a wine bar and asked the bartender:<br />CREATE TABLE gene IF NOT EXISTS SexOnTheBeach;</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/fun/view/44845/a-bioinformatician%E2%80%99s-lament</guid>
	<pubDate>Thu, 29 May 2025 01:33:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/fun/view/44845/a-bioinformatician%E2%80%99s-lament</link>
	<title><![CDATA[A Bioinformatician’s Lament]]></title>
	<description><![CDATA[<div><div dir="auto"><p><em>"I have a presentation tomorrow,"</em>&nbsp;they say,</p><p>With hopeful eyes, like it&rsquo;s all child's play.<br />As if results bloom overnight, full-grown&mdash;<br />Not wrangled from chaos, and error-prone.</p><p><strong>Oh brave soul, sit, let&rsquo;s walk through the tale,</strong><br />Of pipelines broken and servers that fail.<br />The journey starts: &ldquo;The data? It&rsquo;s there&mdash;<br />Just fetch it from S3, easy, I swear.&rdquo;</p><p>Now I summon&nbsp;<code>awscli</code>&nbsp;with dread,<br />Reset my keys, credentials fed.<br />Configure regions, IAM roles too&mdash;<br />All this, and still no peek at the view.</p><p>Next up, the tool: &ldquo;It&rsquo;s open source!&rdquo;<br />On GitHub, rotting, no sign of remorse.<br />Python 2.7, some GCC trick&mdash;<br />The install alone might make you sick.</p><p>Finally, progress! The pipeline runs&hellip;<br />Till RAM collapses and error stuns.<br />Oh, and the metadata? A crime,<br />Merged cells, font soup, out of time.</p><p>Sample IDs&mdash;what a cryptic game:<br /><code>Sample_1</code>,&nbsp;<code>S1</code>,&nbsp;<code>sample-1</code>... the same?<br />Controls mislabeled, cases flipped,<br />No wonder my sanity's starting to slip.</p><p>Then QC plots, PCA joy&mdash;<br />Wait, that&rsquo;s a tumor labeled as a boy?<br />Clusters cross, and axes lie,<br />And I still don&rsquo;t know&nbsp;<em>which</em>&nbsp;sample&rsquo;s "guy."</p><p>But the clock ticks on, and it&rsquo;s half-past doom,<br />They want the final UMAP soon.<br />With pastel colors, labeled clear&mdash;<br />"Can we move that legend to&nbsp;<em>right here</em>?"</p><p>Tweak by tweak, I adjust each frame,<br />Resize Panel B, annotate a name.<br />Export the plot&mdash;it starts to gleam&hellip;<br />Then my laptop crashes. 
I scream.</p><p>This is the grind, the long-haul game,<br />Where science hides behind code and flame.<br />No &ldquo;Export to Nature&rdquo; button to press,<br />Just toil and logic and hope for success.</p><p>So next time you whisper that fated line&mdash;<br />&ldquo;I have a talk, can you make it shine?&rdquo;<br />Know: bioinformatics is craft, not a click,<br />It&rsquo;s science with scars, not just a quick fix.</p><p><strong>To all who debug at 3AM light,</strong><br />Who ghostwrite figures through sleepless night&mdash;<br />You are the backbone, silent and true,<br />First-author-worthy, if only they knew.<br /><br /></p><hr><p><em><br />"I have a presentation tomorrow,"</em>&nbsp;they say,</p></div></div><div><div dir="auto"><p>With hopeful eyes, as if it were all easy.<br />As if results appear overnight,<br />Rather than being pulled out of a maze of data.</p><p><strong>Come, sit, let me tell you a tale,</strong><br />Where pipelines break and even servers tire out.<br />The story begins: &ldquo;The data is there,<br />Just in an S3 bucket, somewhere close by.&rdquo;</p><p>Now I call on&nbsp;<code>awscli</code>&nbsp;with dread,<br />Set the keys, add credentials, fill in the region.<br />All that effort, and still no data,<br />The whole day went into setup alone.</p><p>Then comes the tool: &ldquo;It&rsquo;s open source!&rdquo;<br />It is on GitHub, lying dormant since 2019.<br />It needs Python 2.7, an old compiler,<br />And the power of a little prayer.</p><p>At last the tool ran, and there was a flicker of joy,<br />But as soon as it ran, the memory gave up.<br />And the metadata? An Excel disaster,<br />Merged cells; what more explanation is needed?</p><p>Sample IDs? God only knows:<br /><code>Sample_1</code>,&nbsp;<code>sample-1</code>,&nbsp;<code>S1</code>, and&nbsp;<code>control1</code>,<br />Are these all the same sample?<br />You find out only after asking two or three times.</p><p>The count matrix is ready; now it is R or Python&rsquo;s turn,<br />Run QC, plot the PCA, but something is badly off.<br />A swapping game between tumor and normal,<br />Again and again, the same old mess.</p><p>Finally comes the time for modeling,<br />The labor of stats, plots, and differential expression.<br />But the clock has already struck 5, sir,<br />And the UMAP is due by 8, a neat and tidy answer.</p><p>So I sit up all night writing code,<br />Color palette, gene labels, legend placed outside.<br />Fonts, panels, axes all fixed,<br />I hit export... and the laptop says, "Not now, mate!"</p><p>That is why bioinformatics takes time,<br />It is not &ldquo;just run Seurat&rdquo; or &ldquo;make a volcano plot.&rdquo;<br />It is sysadmin work, data cleaning,<br />QC, debugging, and the true battle of science.</p><p><strong>So take a few lessons from this lament today:</strong><br />Do not ask for miracles 24 hours in advance.<br />Good figures come from clean data.<br />Bioinformatics is science, not magic.<br />Talk to us in good time; respect the process.</p><p><strong>And a salute to all the bioinformaticians</strong><br />Who stay awake at night for other people&rsquo;s presentations:<br />You are the ghostwriters of figures,<br />You are the unnamed co-authors.<br />You deserve to be first author,<br />And a long sleep too.</p><p>Note: Written with the help of AI/LLM Tools !</p></div></div>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>

</channel>
</rss>