<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/14215?offset=410</link>
	<atom:link href="https://bioinformaticsonline.com/related/14215?offset=410" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/37590/parallel-processing-with-perl</guid>
	<pubDate>Sat, 25 Aug 2018 11:32:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/37590/parallel-processing-with-perl</link>
	<title><![CDATA[Parallel Processing with Perl !]]></title>
	<description><![CDATA[<p>Here is a small tutorial on how to make the best use of multiple processors for bioinformatics analysis. One good way is to use Perl threads and forks. It is important to understand how threads and forks work before implementing them, so getting familiar with both will be really useful before reading this tutorial.</p><p>In bioinformatics we often need to deal with huge datasets of more than 100GB. The traditional way to analyze a file is a while loop:</p><p>while (my $line = &lt;FILE&gt;) {</p><p>do something with $line;</p><p>}</p><p>This is very slow (since we are using only one processor), and if we have 500 million lines in the dataset it can take more than a day to iterate through the whole dataset. So how do we make the best use of all our processors and get the work done quickly?</p><p>Here is a very simple and efficient technique with Perl which I have been using. I am more inclined towards using Perl fork than Perl threads.</p><p>One of the oldest ways to fork is</p><blockquote><p>my @childs;<br />my $fork = fork();<br />die "Couldn't fork: $!" unless defined $fork; # fork returns undef on failure<br />if ($fork) {<br />push(@childs, $fork); # parent: remember the child's pid<br />}<br />else {<br /><strong>your code here;</strong> # child: fork returned 0<br />exit(0);<br />}</p><p>## wait for the child processes to finish<br />foreach (@childs) {<br />my $tmp = waitpid($_, 0);<br />}</p></blockquote><p>What fork does is create a child process that gets its own copy of the variables and code and runs separately from the parent process, so a separate process is created (which usually runs on a separate processor). That's it!! One big disadvantage of forking is that it is very difficult to share variables among the different processes. I will show you how to do it easily, but it still has its own drawbacks.</p><blockquote><p>Okay, now if you really do not want to call fork yourself in your code, that's okay too. There are many useful modules which do it for you very efficiently. 
One really useful module is Parallel::ForkManager. You can use Parallel::ForkManager to manage the number of forks you want to generate (the number of processors you want to use).</p><p><strong>Simple usage:</strong><br />use Parallel::ForkManager;<br />my $max_processors = 8;<br />my $fork = Parallel::ForkManager-&gt;new($max_processors);<br />foreach (@dna) {<br />$fork-&gt;start and next; # do the fork; the parent moves on to the next element<br /><strong>your code here;</strong><br />$fork-&gt;finish; # do the exit in the child process<br />}<br />$fork-&gt;wait_all_children;</p></blockquote><p>So you will have up to 8 forks running at a time, each doing the same thing for one element of the array. When one child finishes, Parallel::ForkManager starts a new one, and thus you will be using all your processors to analyze the data. Now, suppose you have generated 8 child processes and want to write the data to one file. You need to lock the file to do this, because otherwise you will have problems with the buffering. You can lock the file using the flock command.</p><blockquote><p>use Fcntl qw(:flock); # provides LOCK_EX and LOCK_UN<br />open(my $QUAL, '&gt;&gt;', "myfile.txt") or die "can't open file: $!"; # open for appending<br />flock($QUAL, LOCK_EX) or die "can't lock file: $!";<br />print $QUAL $output;<br />flock($QUAL, LOCK_UN) or die "can't unlock file: $!";<br />close $QUAL;</p></blockquote><p>I would not suggest using flock when dealing with multiple processes, because it will decrease the processing efficiency (each child process must wait for the lock to be released by the other child processes). Instead, I would suggest that each fork writes to a separate file and that you simply concatenate the files after the processing.</p><p><strong>Putting it all together: if you have 100GB of data you can do this</strong></p><blockquote><p><strong>step 1</strong>&nbsp;: split the dataset equally according to the number of processors you have. 
This may take a few hours (about 2-3 hrs for a 100GB file).<br />You can use the unix &ldquo;split&rdquo; command for this,<br />for example:<br />my $number_split = int($number_of_entries_in_your_dataset / $max_processors); # note: split -l counts lines, not records<br />system("split", "-l", $number_split, "your_file.fasta", "file_name") == 0 or die "split failed: $?";</p><p><strong>step 2</strong>: open the directory containing your split files and start Parallel::ForkManager.<br /><strong>For example:</strong><br />opendir(DIRECTORY, $split_files_directory) or die $!; ### open the directory<br />my $super_fork = Parallel::ForkManager-&gt;new($max_processors);<br />while (my $file = readdir(DIRECTORY)) { ### read the directory<br />if ($file =~ /^\./) { next; } ### skip hidden entries such as . and ..<br />print $file, "\n";<br />########## Start fork ##########<br />my $pid = $super_fork-&gt;start and next;<br /><strong>Whatever you want to do with the split file ;</strong><br /><strong>analyze my piece of $file;</strong><br />######### end fork ###############<br />$super_fork-&gt;finish;<br />}<br />$super_fork-&gt;wait_all_children;</p></blockquote><p>So basically each processor will be busy with its own piece of data (split file), and thus you have created 8 processes at one time which run without interfering with each other. Again, I would not suggest writing output from each child process to one file (for the reasons above). Write the output from each fork to a separate file and finally concatenate them. That's it, you have just increased your program speed by up to 8 times!! Isn't it easy?</p><p><strong>Note:</strong><br />You may worry about concatenating the output each child generates, since it does take some time (remember, 100GB). I think you can use the MySQL LOAD DATA LOCAL INFILE command to load all the files into a single table (should take about 3 hrs for a 100GB dataset) and then export the whole table into one file. 
This should be faster than just concatenating them using the &ldquo;cat&rdquo; command (correct me if I am wrong).</p><p>Or a much simpler way is to use pipes:</p><p>cat output_dir/* | my_pipe or my_pipe &lt;(file1) final_file;</p><p>That's it guys!! Enjoy programming and please do comment. I am not a computer scientist, so forgive me for any mistakes, and if you find any please report them. Thank you.</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/2349/bioinformatics-understanding-of-living-systems-through-information-science</guid>
	<pubDate>Wed, 14 Aug 2013 11:50:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/2349/bioinformatics-understanding-of-living-systems-through-information-science</link>
	<title><![CDATA[Bioinformatics -- Understanding of living systems through  information science]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/6Ovd_GOM9-g" frameborder="0" allowfullscreen></iframe>Recently, the progress of the Human Genome Project, aiming to decode all human DNA sequences, has highlighted a research field called bioinformatics. In this new field, computers and techniques from information science are not just used as tools to advance life science research; they're expected to have a major impact on how we think about the life sciences.

Q. The main feature of bioinformatics is that it utilizes computers to analyze life. One example is the genome. In all organisms, DNA contains genetic information, and this is called the genome. But the amount of information involved is huge, so recently, it's been read using next-generation sequencers, and analyzed by computers. In bioinformatics research, what we do is utilize that genome information to investigate the principles of life.

As an organism evolves, its genome sequence changes through sudden mutations. Additionally, at the genome level, mutations called rearrangements, such as inversions, transpositions, and duplications, occur. 

The genome comparison system developed by the Sakakibara Lab calculates homologous sequences called anchors, which are conserved between species. If the genome is considered as a long text, then anchors can be thought of as words.

Q. We're coming to understand the genomes of various organisms - not just humans, but monkeys, chimpanzees, bacteria, and so on. The first method used to analyze a genome is comparing it with the genomes of other organisms, to see where it's the same and where it's different. In that way, the content of the genome is decoded bit by bit, using computers. By contrast, in our method, we've developed software called Murasaki, which we also use to analyze large genomes, by comparing them with those of other organisms.

The Sakakibara Lab uses a next-generation sequencer at Keio University, along with a cluster machine with hundreds of CPUs. In this way, the Lab is analyzing genome mutations that cause cancer, and the genome of the natto production strain Bacillus subtilis.

Until now, genome analysis could only be done in national-scale projects. But now, next-generation sequencer development has made genome analysis possible in an ordinary lab. In a world-first achievement, the Sakakibara Lab has decoded the natto bacillus genome, through analysis using Keio's next-generation sequencer.

Q. In the future, biology and the life sciences may become almost entirely information science and computer science. And in healthcare, that may enable us, for example, to predict whether individuals are susceptible to cancer, or to certain lifestyle-related diseases, by understanding their personal genome data. So, I think it's amply possible that we can make use of such information effectively, to help people live longer and be free from disease, by thinking about their lifestyle habits.
 
Bioinformatics is only two decades old. In this field, many areas are still unknown. Professor Sakakibara, having been involved since the beginning, will continue tackling new, challenging research projects.]]></description>
	
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/38302/senior-bioinformatics-scientist-at-elucidata</guid>
  <pubDate>Tue, 27 Nov 2018 04:05:57 -0600</pubDate>
  <link></link>
  <title><![CDATA[Senior Bioinformatics Scientist at Elucidata]]></title>
  <description><![CDATA[
<p>Key Responsibilities <br />- Process and analyse metabolomic, transcriptional, genomics, proteomics <br />and any other kind of biological data. <br />- Interpret the data in the context of relevant biological literature to generate <br />actionable insights. <br />- Communicate the findings from data and literature to biologists and use the <br />biological insights to derive next steps/analyses. <br />- Communicate work through blogs, meet-ups, research papers, posters, etc. <br />- Identify, troubleshoot, and implement improvements to existing pipelines <br />and algorithms. <br />- Identify and implement new tools and pipelines to use for different types of <br />biological data. <br />- Work in a multi-disciplinary team with biologists, data scientists and data <br />analysts. <br />- Help with any other requirements (from database design to generating <br />prototypes for the product team).</p>

<p>Requirements <br />- 3-5 years of relevant bioinformatics experience such as public data mining, <br />processing, analysing and visualising omics data, etc. <br />- Ph.D., Masters or Bachelors in Bioinformatics, Biotechnology, <br />Computational Biology, or related field. <br />- Understanding of molecular biology and biochemistry. <br />- Comfort and experience with biological research and data. <br />- Proficient in a programming language used for bioinformatics such as R or <br />Python. <br />- Excellent communication skills. <br />- Ability to summarise and simplify complex analyses for a non-technical <br />audience. <br />- Strong analytical skills, curiosity and a knack for solving difficult problems. <br />- Work well in multi-disciplinary teams with people of vastly different <br />backgrounds. <br />- Demonstrated success in collaboration and independent work.</p>

<p>More at https://angel.co/elucidata/jobs/460104-senior-bioinformatics-scientist</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/4547/bioinformatics-infrastructure-facility</guid>
  <pubDate>Sun, 15 Sep 2013 09:22:25 -0500</pubDate>
  <link></link>
  <title><![CDATA[Bioinformatics Infrastructure Facility]]></title>
  <description><![CDATA[
<p>The Bioinformatics Infrastructure Facility started operating in 2007 at Presidency College, Kolkata. The college is one of the premier institutes of India and boasts a rich heritage and great alumni. The Infrastructure Facility has a dedicated team headed by Sayak Ganguli and ably supported by Priayanka Dhar. The coordinator of the facility is Abhijit Datta of the Post Graduate Department of Botany. The lab mainly focusses on the analysis of the RNA Induced Silencing Complex. Recent highlights include the presentation of a paper at the RNAi World Congress.</p>

<p>More @ http://bioinfo-presiuniv.edu.in/index.php</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/39025/binc-exam-merged-with-dbt-bet-jrf-exam</guid>
	<pubDate>Thu, 21 Feb 2019 09:37:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/39025/binc-exam-merged-with-dbt-bet-jrf-exam</link>
	<title><![CDATA[BINC Exam merged with DBT- BET JRF Exam]]></title>
	<description><![CDATA[<p>Another piece of breaking news has been received from the Department of Biotechnology &ndash; DBT. As per a notification released by DBT, the Bioinformatics National Certification (BINC) Exam, conducted once per year by DBT, has now been merged with the DBT-BET JRF Exam.</p><p>Also, the Bioinformatics Industrial Training Program (BIITP) is merged with the HRD Biotechnology Industrial Training Programme (BITP).</p><p>While this comes as a surprise for a lot of participants, we believe this is a good attempt to unify and create a national benchmark for talent, and we appreciate this endeavor from the Department of Biotechnology.</p><p>However, such last-minute announcements can create confusion. Candidates are therefore advised to go through the complete DBT-BET JRF 2019 notification via the link below. If you have any doubts, contact DBT JRF or Biotecnika for help &amp; assistance.</p><p><br />Attention:-Bioinformatics Programs (BINC and BIITP)</p><p>1. Bioinformatics National Certification (BINC) has been merged with DBT-Junior<br />Research Fellow (BET Exam)</p><p>2. Bioinformatics Industrial Training Program (BIITP) is merged with HRD Biotechnology Industrial Training Programme (BITP).</p><p>Students of Bioinformatics, who are interested to apply for Fellowship or Industrial<br />Training may keep track of the advertisement of DBT-JRF (BET Exam) and BITP<br />of DBT.</p><p>More at http://www.bcil.nic.in/files/Attention_Bioinformatics_Programs_(BINC_and_BIITP).pdf</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/2492/plos-computational-biology-translational-bioinformatics-educational-resources</guid>
	<pubDate>Fri, 16 Aug 2013 12:24:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/2492/plos-computational-biology-translational-bioinformatics-educational-resources</link>
	<title><![CDATA[PLOS Computational Biology: Translational Bioinformatics educational resources]]></title>
	<description><![CDATA[<p>PLOS presents a collection of Education articles: &ldquo;Translational Bioinformatics&rdquo;. This collection is presented as an online &ldquo;book&rdquo; which could serve as a reference tool for a graduate-level introductory course, marking a step in an exciting new direction for the Education section of the journal.</p>
<p>Blog : http://blogs.plos.org/biologue/2012/12/28/translational-bioinformatics-plos-computational-biology-presents-an-educational-resource-for-an-emerging-field/</p>
<p>Educational Material : http://www.ploscollections.org/article/browseIssue.action?issue=info:doi/10.1371/issue.pcol.v03.i11</p><p>Address of the bookmark: <a href="http://www.ploscollections.org/article/browseIssue.action?issue=info:doi/10.1371/issue.pcol.v03.i11" rel="nofollow">http://www.ploscollections.org/article/browseIssue.action?issue=info:doi/10.1371/issue.pcol.v03.i11</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40235/bioinformatics-web-development-course</guid>
	<pubDate>Wed, 06 Nov 2019 20:42:48 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40235/bioinformatics-web-development-course</link>
	<title><![CDATA[Bioinformatics web development course]]></title>
	<description><![CDATA[<p>This web development course, targeted at Biology and Bioinformatics students, aims at teaching from scratch all the skills needed to set up a fully working Linux web server and to develop and deploy web applications for Bioinformatics.</p>
<p>No previous programming knowledge is assumed. By following this tutorial you will learn the fundamental concepts of programming by using scripting languages: variables, types, arrays, loops, conditional statements, functions, objects, regular expressions, file reading and manipulation, et cetera.</p><p>Address of the bookmark: <a href="http://www.cellbiol.com/bioinformatics_web_development/introduction/" rel="nofollow">http://www.cellbiol.com/bioinformatics_web_development/introduction/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/2699/translational-bioinformatics-transforming-300-billion-points-of-data</guid>
	<pubDate>Tue, 20 Aug 2013 19:03:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/2699/translational-bioinformatics-transforming-300-billion-points-of-data</link>
	<title><![CDATA[Translational Bioinformatics: Transforming 300 Billion Points of Data]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/o4KNG7nd938" frameborder="0" allowfullscreen></iframe>Translational Bioinformatics: Transforming 300 Billion Points of Data into Diagnostics, Therapeutics, and New Insights into Disease      
      
Air date:  Wednesday, June 20, 2012, 3:00:00 PM
Time displayed is Eastern Time, Washington DC Local  
 
Description:  There is an urgent need to translate genome-era discoveries into clinical utility, but the difficulties in making bench-to-bedside translations haven't been well described. The nascent field of translational bioinformatics may help. Dr. Butte's lab at Stanford University builds and applies tools that convert more than 300 billion points of molecular, clinical, and epidemiological data (measured by researchers and clinicians over the past decade) into diagnostics, therapeutics, and new insights into disease. Dr. Butte, a bioinformatician and pediatric endocrinologist, will highlight his lab's work on using publicly available molecular measurements to find new uses for drugs, discovering new treatable mechanisms of disease in type 2 diabetes, and evaluating patients presenting with whole genomes sequenced. 

The NIH Wednesday Afternoon Lecture Series includes weekly scientific talks by some of the top researchers in the biomedical sciences worldwide. 

For more information, visit: 
The NIH Director's Wednesday Afternoon Lecture Series  
Author:  Atul Butte, M.D., Ph.D., Stanford University  
Runtime:  01:07:42  
Permanent link:  http://videocast.nih.gov/launch.asp?17321]]></description>
	
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/40945/the-clark-lab</guid>
  <pubDate>Fri, 07 Feb 2020 13:57:24 -0600</pubDate>
  <link></link>
  <title><![CDATA[The Clark Lab]]></title>
  <description><![CDATA[
<p>We study the process of adaptive evolution, during which species adopt novel traits to overcome challenges. We retrace the evolutionary histories of genomic elements to determine the changes underlying adaptation and to discover previously unknown genetic networks. These discoveries have already led to advances in human health, species conservation, and molecular biology.</p>

<p>More at http://clark.genetics.utah.edu/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/2741/bioinformatician-dreams</guid>
	<pubDate>Wed, 21 Aug 2013 10:50:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/2741/bioinformatician-dreams</link>
	<title><![CDATA[Bioinformatician Dreams]]></title>
	<description><![CDATA[<p>A bioinformatician's life is interconnected: they always dream of a powerful server, a little more space on the server since they generate lots of data per run, publishing results in high-impact journals, meeting reminders :) and research analysis, of course!!!</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/2741" length="557537" type="image/png" />
</item>

</channel>
</rss>