<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/23498?offset=1060</link>
	<atom:link href="https://bioinformaticsonline.com/related/23498?offset=1060" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/34916/bioinformatics-tools-developed-for-oxford-nanopore-data-analysis</guid>
	<pubDate>Wed, 27 Dec 2017 20:47:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/34916/bioinformatics-tools-developed-for-oxford-nanopore-data-analysis</link>
	<title><![CDATA[Bioinformatics tools developed for Oxford Nanopore data analysis !]]></title>
	<description><![CDATA[<p><span>MinION is the only portable real-time device for DNA and RNA&nbsp;</span><span>sequencing</span><span>. Each consumable flow cell can now generate 10&ndash;20 Gb of DNA&nbsp;</span><span>sequence</span><span>&nbsp;data. Ultra-</span><span>long read lengths are possible (hundreds of kb) as you can choose your fragment length.&nbsp;</span>One of the technical advantages of ONT data is the read length, which offers great prospects for genome assembly. Generally, assemblers are based on several different types of algorithms, such as greedy, overlap-layout-consensus (OLC), de Bruijn graph (DBG), and string graph.</p><p><span>List of analysis tools developed for Oxford Nanopore data</span></p><p>BWA <br />Fast nanopore data tuned alignment tool <br />https://github.com/lh3/bwa</p><p>GraphMap<br />Mapper for long and error-prone reads<br />https://github.com/isovic/graphmap</p><p>LAST<br />Nanopore tuned alignment tool<br />http://last.cbrc.jp/</p><p>LINKS<br />Software tool for long read scaffolding <br />https://github.com/warrenlr/LINKS/</p><p>marginAlign<br />Tools to align nanopore reads to a reference<br />https://github.com/benedictpaten/marginAlign</p><p>minoTour<br />Real time analysis tools<br />http://minotour.nottingham.ac.uk/</p><p>nanoCORR<br />Error-correction tool for nanopore sequence data<br />https://github.com/jgurtowski/nanocorr</p><p>NanoOK<br />Software for nanopore data, quality and error profiles<br />https://documentation.tgac.ac.uk/display/NANOOK/NanoOK</p><p>Nanopolish<br />Nanopore analysis and genome assembly software<br />https://github.com/jts/nanopolish</p><p>nanopore<br />Variant-detection tool for nanopore sequence data<br />https://github.com/mitenjain/nanopore</p><p>Nanocorrect<br />Error-correction tool for nanopore sequence data<br />https://github.com/jts/nanocorrect/</p><p>npReader<br />Real-time conversion and analysis of nanopore reads<br />https://github.com/mdcao/npReader</p><p>poRe<br />Tool for 
analyzing and visualizing nanopore data<br />https://sourceforge.net/p/rpore/wiki/Home/</p><p>PoreSeq<br />Error-correction and variant-calling software<br />https://github.com/tszalay/poreseq</p><p>Poretools<br />Nanopore sequence analysis and visualization software <br />https://github.com/arq5x/poretools</p><p>SSPACE-LongRead<br />Genome scaffolding tool <br />http://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE-longread</p><p>SMIS<br />Genome scaffolding tool <br />https://sourceforge.net/projects/phusion2/files/smis/</p><p>&nbsp;</p><p>List of assemblers for Oxford Nanopore MinION long reads</p><p>LQS<br />DALIGNER, Celera OLC<br />Nanocorrect, Nanopolish corrector<br />https://github.com/jts/nanopolish</p><p>PBcR<br />HGAP or BLASR, Celera OLC <br />PBcR corrector<br />http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR</p><p>Canu<br />MHAP, Celera OLC <br />Canu corrector<br />https://github.com/marbl/canu</p><p>Falcon<br />String graph, Celera OLC <br />Falcon corrector<br />https://github.com/PacificBiosciences/falcon</p><p>Miniasm <br />OLC<br />https://github.com/lh3/miniasm</p><p>ra-integrate<br />OLC<br />https://github.com/mariokostelac/ra-integrate/</p><p>ALLPATHS-LG<br />de Bruijn graph <br />ALLPATHS-LG corrector<br />https://www.broadinstitute.org/software/allpaths-lg/blog/?page_id=12</p><p>SPAdes <br />de Bruijn graph <br />SPAdes corrector<br />http://bioinf.spbau.ru/spades</p>]]></description>
	<dc:creator>biogeek</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/1332/bioinformatics-companies-in-india</guid>
	<pubDate>Mon, 05 Aug 2013 20:20:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/1332/bioinformatics-companies-in-india</link>
	<title><![CDATA[Bioinformatics Companies in India]]></title>
	<description><![CDATA[<p>The following is a list of 30 top bioinformatics companies in India. The companies are listed in no particular order.</p><p>1. Accelrys Software Solution Pvt Ltd.<br />12th Floor, Discover, ITPL, White Field, Bangalore-65.<br /><a href="http://www.accelrys.com/">www.accelrys.com</a></p><p>2. Apticraft Systems (P) Ltd.<br />142, Electronics Complex, Pardeshipura, Indore &ndash; 452010 (M.P.), India<br /><a href="http://www.apticraft.com/">www.apticraft.com</a></p><p>3. Aptuit Informatics<br />Plot No. 100-103, Export Promotion Industrial Park, White Field, Bangalore-560066<br /><a href="http://www.aptiuit.com/">www.aptiuit.com</a></p><p>4. Bigtec<br />J. K. Towers, 8th Block, Sangam Circle, 46th Cross, Bangalore-560082.<br /><a href="http://www.bigtec.org/">www.bigtec.org</a></p><p>5. Bijam Biosciences Private Limited<br />Nagarjuna Hills, Hyderabad 500 082, India<br /><a href="http://www.nagarjunagroup.com/">www.nagarjunagroup.com</a></p><p>6. Bio Base Databases India Pvt Ltd.<br />Crescent Towers, 4th Floor, No : 32/1, Crescent Road, Bangalore &ndash; 560 001<br /><a href="http://www.biobase-international.com/">www.biobase-international.com</a></p><p>7. BioImagene India Pvt. Ltd.<br />4th floor, C-Wing, Godrej Eternia, Shivajinagar, Pune-411005<br /><a href="http://www.bioimagene.com/">www.bioimagene.com</a></p><p>8. BioInformatics Institute Of India &ndash; Noida<br />C-56 A/28, Sector -62, Noida &ndash; 201 301<br /><a href="http://www.bii.in/">www.bii.in</a></p><p>9. CLC bio India Pvt Ltd<br />#Plot No. 51, H.No. 8-3-214/51, Srinivasa Nagar (West) Ameerpet Hyderabad &ndash; 500 038<br /><a href="http://www.clcbio.com/india">www.clcbio.com/india</a></p><p>10. CytoGenomics India (P) Ltd.<br />#3004, 12A Main HAL 2nd Stage, Bangalore 560008<br /><a href="http://www.silicocyte.com/">www.silicocyte.com</a></p><p>11. 
Genotypic Technology<br />211, 6th Cross, 80ft Road, RMV II Stage, Bangalore 560094<br /><a href="http://www.genotypic.co.in/">www.genotypic.co.in</a></p><p>12. Genvea Biosciences<br />Dr. D. T. Singh, CSO, 53, Craig Rd. #04-01, Singapore-089691<br /><a href="http://www.genvea.com/">www.genvea.com</a></p><p>13. Helix Info Systems<br />132 A, II Floor, Sterling Towers, IV Cross Street, Sterling Road, Nungambakkam, Chennai.<br /><a href="http://www.helixinfosystems.com/">www.helixinfosystems.com</a></p><p>14. Jalaja Technologies Pvt. Ltd.,<br />21/1, Victoria Layout, Victoria Road, Bangalore-47<br /><a href="http://www.jalaja.com/">www.jalaja.com</a></p><p>15. Jubilant Biosys Ltd<br />#96, Industrial Suburb, 2nd Stage, Yeshwanthpur, Bangalore- 560022<br />Jubilant Organosys Ltd.<br />1A, Sector 16A, Noida &ndash; 201 301 (India)<br /><a href="http://www.jubl.com/">www.jubl.com</a></p><p>16. Kshema Technologies<br />#1, Global Village, Mylasandra, Mysore Road, Bangalore-560 059.<br /><a href="http://www.mphasis.com/">www.mphasis.com</a></p><p>17. LabNetworx<br />B-704, Gitanjali Apartments, Vikas Marg Extension, New Delhi &ndash; 110 092<br /><a href="http://www.labnetworx.com/">www.labnetworx.com</a></p><p>18. LabVantage Solutions Pvt. Ltd.<br />Bengal Intelligent Park, Building C, 2nd Floor, Sector V, Salt Lake Electronics Complex, Kolkata &ndash; 700 091<br /><a href="http://www.labvantage.com/">www.labvantage.com</a></p><p>19. LeadInvent,&nbsp;<br />2nd Floor, Biotech Centre, University of Delhi, South Campus, Benito Juarez Road, New Delhi 110021, India<br />Contact no: +91 11 24119241<br />Email: contact@leadinvent.com<br /><a href="http://www.leadinvent.com">www.leadinvent.com</a></p><p>20. Mascon Life Sciences<br />B &ndash; 8/ 10, Vasant Vihar, New Delhi 110057, India<br /><a href="http://www.masconlifesciences.com/">www.masconlifesciences.com</a></p><p>21. 
Molecular Connections P Ltd<br />Kandala Mansion, 2/2 Kariappa Road, Near Krishna Rao Park, Basavanagudi, Bangalore &ndash; 4<br /><a href="http://www.molecularconnections.com/">www.molecularconnections.com</a></p><p>22. Novo Informatics Pvt. Ltd.<br />TBIU, 2nd Floor, Synergy Building, Indian Institute of Technology,&nbsp;Hauz Khas, New Delhi-16.<br />Contact: 91-11-26581524, 91-11-26581766 (Extension: 28)<br />Email: info@novoinformatics.com<br /><a href="http://www.novoinformatics.com">www.novoinformatics.com</a></p><p>23. Ocimum Biosolutions (India) Ltd<br />6th Floor, Reliance Classic, Road No.1 Banjara Hills, Hyderabad 500 034, India.<br /><a href="http://www.ocimumbio.com/">www.ocimumbio.com</a></p><p>24. Scube Scientific Software Solutions<br />613, Hemkunt Chambers, 89, Nehru Place, New Delhi -110 019<br /><a href="http://www.scribeindia.com/">www.scribeindia.com</a></p><p>25. Siri Technologies Pvt Ltd.<br />38/C -23, South End Road, Basavanagudi, Bangalore-56004.<br /><a href="http://www.siritech.com/">www.siritech.com</a></p><p>26. Strand Life Sciences Pvt. Ltd.<br />#237, Sir C. V. Raman Avenue, Raj Mahal Vilas, Bangalore 560 080 INDIA<br /><a href="http://www.strandls.com/">www.strandls.com</a></p><p>27. SooryaKiran Bioinformatics (P) Ltd<br />TBIC-13, Tejaswini Building, Technopark, Thiruvananthapuram- 695 584, Keralam, India<br />Ph: +91 471 4060979, +91 9895404104<br />Email:&nbsp;<a href="mailto:reachus@sooryakiran.com">reachus@sooryakiran.com</a><br /><a href="http://www.sooryakiran.com/">http://www.sooryakiran.com</a></p><p>28. Systat Software Asia Pacific<br />4th Floor, Block 1, Shankar Narayan Building, No.25, MG Road, Bangalore &ndash; 560001<br /><a href="http://www.systat.com/">www.systat.com</a></p><p>29. ABC Genomics (India) Pvt. 
Ltd.<br />Biotech Park, Sector G, Jankipuram, Kursi Road, Lucknow-226021, U.P., INDIA<br />Tel +91-522-4068579, Email: director@abcgenomics.com<br /><a href="http://www.abcgenomics.com/">www.abcgenomics.com</a></p><p>30. en-GENE-ier's Core Technology Services,<br />1/340, Virat Khand, Gomtinagar,&nbsp;<br />(Near Maharaja Agrasen Public School)<br />Lucknow-226010, U.P., India.<br /><a href="http://www.bio.egicore.com/">http://www.bio.egicore.com/</a></p><p>&nbsp;</p><p>Best of luck with your job hunt :).</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/35552/the-brent-lab</guid>
  <pubDate>Fri, 09 Feb 2018 10:55:27 -0600</pubDate>
  <link></link>
  <title><![CDATA[The Brent Lab]]></title>
  <description><![CDATA[
<p>The Brent Lab is developing and applying computational methods for mapping gene regulation networks, modeling them quantitatively, and engineering new behaviors into them.</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/1490/bioinformatics-jrf-at-iiser-mohali</guid>
  <pubDate>Thu, 08 Aug 2013 15:56:02 -0500</pubDate>
  <link></link>
  <title><![CDATA[Bioinformatics JRF at IISER MOHALI]]></title>
  <description><![CDATA[
<p>Applications are invited for a Junior Research Fellow (JRF) position in an Innovative Young Biotechnologist Award (IYBA) research project funded by the Department of Biotechnology (DBT).</p>

<p>The project involves identification and characterization of transcription factors (TFs) from the Arabidopsis shoot apical meristem stem cell niche using genomic approaches and construction of a gene regulatory network for the identified TFs.</p>

<p>Positions: 1</p>

<p>Duration: 1 year but extendable up to three years based on performance and availability of funds.</p>

<p>Emoluments: As per DST rules.</p>

<p>Essential Qualifications: M.Sc. in any branch of life sciences with an excellent academic record and CSIR-UGC NET or DBT-JRF. Candidates with previous work experience in bioinformatics, molecular biology and genetics are preferred, but this is not required.</p>

<p>How to Apply: Applicants are requested to send a cover letter outlining their previous research experience and reasons for applying for this position. Please send your complete bio-data, including the cover letter, as a PDF attachment by email to Dr. Ram Yadav at ryadav@iisermohali.ac.in</p>

<p>Last date of submission is 17.00 IST, August 10, 2013.</p>

<p>Advertisement: www.iisermohali.ac.in/project_openings.html#29</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36197/bioinformatics-oneliner</guid>
	<pubDate>Tue, 10 Apr 2018 04:13:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36197/bioinformatics-oneliner</link>
	<title><![CDATA[Bioinformatics OneLiner]]></title>
	<description><![CDATA[<p>To remove all line ends (\n) from a Unix text file:</p><pre>sed ':a;N;$!ba;s/\n//g' filename.txt &gt; newfilename_oneline.txt</pre><p>To get average for a column of numbers (here the second column $2):</p><pre>awk '{ sum += $2; n++ } END { if (n &gt; 0) print sum / n; }'</pre><p>To get sequence length for all sequences in a fasta file:</p><pre>awk '/^&gt;/ {if (seqlen){print seqlen}; print ;seqlen=0;next; } { seqlen = seqlen +length($0)}END{print seqlen}' \<br />filename.fasta</pre><p>To copy (move, rename, etc) files based on their list in a text file:</p><pre>cat file_list.txt | while read line; do cp "$line" complete_dataset/"$line"; done</pre><p>To split bam files into sets with mapped and unmapped reads:</p><pre>samtools view -F4 sample.bam &gt; sample.mapped.sam<br />samtools view -f4 sample.bam &gt; sample.unmapped.sam</pre><p>To gzip all your fastq files using gnu parallel and gzip:</p><pre>parallel gzip ::: *.fastq</pre><p>To gzip all your fastq files using pigz:</p><pre>pigz *.fastq</pre><p>To count all sequences in a fasta file:</p><pre>grep "^&gt;" yourfile.fasta -c</pre><p>To count all sequences in all fasta files in your current directory:</p><pre>for a in *.fasta; do ls $a; grep "^&gt;" -c $a; done</pre><p>To keep only one copy of duplicated lines:</p><pre>awk '!seen[$0]++'</pre><p>To sum assembly size from SPAdes contigs.fasta or scaffolds.fasta file:</p><pre>grep "^&gt;" scaffolds.fasta | cut -f 4 -d '_' | paste -sd+ | bc</pre><p>To remove everything after the first space at each line, e.g. 
to simplify fasta headers:</p><pre>cut -d' ' -f1 &lt; your_file</pre><p>To count reads in all .fastq.gz files in your current folder (fast, using gnu parallel):</p><pre>parallel "echo {} &amp;&amp; gunzip -c {} | wc -l | awk '{d=\$1; print d/4;}'" ::: *.gz</pre><p>To count reads in all .fastq.gz files in your current folder:</p><pre>echo $(( $(zcat *.gz | wc -l) / 4 ))</pre><p>To count reads in all .fastq files in your current folder:</p><pre>echo $(( $(cat *.fastq | wc -l) / 4 ))</pre><p>To count base pairs in all .fastq.gz files in your current folder:</p><pre>zcat *.fastq.gz | paste - - - - | cut -f 2 | tr -d '\n' | wc -c </pre><p>To split a multifasta file into many fasta files:</p><pre>awk '/^&gt;/ {OUT=substr($0,2) ".fa"}; {print &gt;&gt; OUT; close(OUT)}' Input_File</pre><p>To convert Illumina FASTQ 1.3 to 1.8 (GNU sed):</p><pre>sed -e '4~4y/@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghi/!"#$%&amp;'\''()*+,-.\/0123456789:;&lt;=&gt;?@ABCDEFGHIJ/' f.fastq</pre><p>To convert FASTQ to FASTA (GNU sed):</p><pre>sed -n '1~4s/^@/&gt;/p;2~4p' reads.fastq</pre><p>To get fastq read length distribution:</p><pre>awk 'NR%4==2 {print length($0)}' reads.fastq | sort -n | uniq -c</pre><p>To deinterleave an interleaved fastq file (bash):</p><pre>cat myf.fq | paste - - - - - - - - | tee &gt;(cut -f 1-4 | tr "\t" "\n" &gt; myf_1.fq) | cut -f 5-8 | \<br />tr "\t" "\n" &gt; myf_2.fq </pre><p>To filter and sort contig identifiers from SPAdes assembly (e.g. 
here length &gt;= 4000 and coverage &gt;= 100):</p><pre>grep "^&gt;" scaffolds.fasta | sed 's/_/ /g' | awk '{ if ($4 &gt;= 4000 &amp;&amp; $6 &gt;= 100) print $0 }' | sort -k 4 -n | \<br />sed 's/ /_/g'</pre><p>To append something to all headers of your fasta files:</p><pre>sed 's/^&gt;.*/&amp;YOURSTRING/' filename.fasta &gt; new_filename.fasta</pre><p>To replace/squeeze multiple adjacent spaces by only one space:</p><pre>tr -s " " &lt; file</pre><p>To filter fastq based on length (here larger than or equal to 21, but smaller than or equal to 25). The tab field separator keeps headers that contain spaces from shifting the sequence out of column 2:</p><pre>paste - - - - &lt; your.fastq | awk -F'\t' 'length($2) &gt;= 21 &amp;&amp; length($2) &lt;= 25' | tr "\t" "\n" &gt; filtered.fastq</pre><p>To print difference between the last and first row in 5th column:</p><pre>awk 'NR==1 {first=$5} {last=$5} END {print last-first}' myfile.txt</pre><p>To keep only the first 200 bases of each sequence in a multifasta file (e.g. from assembly scaffolds.fasta file here):</p><pre>awk '/^&gt;/{ seqlen=0; print; next; } seqlen &lt; 200 { if (seqlen + length($0) &gt; 200) $0 = substr($0, 1, 200-seqlen);\<br /> seqlen += length($0); print }' scaffolds.fasta &gt; 200bp_scaffolds.fasta</pre><p>To pipe a compressed fasta file directly into makeblastdb (-dbtype is required, and -title and -out are needed when reading from standard input; mydb is a placeholder name, and use -dbtype prot for protein sequences):</p><pre>gunzip -c fasta.gz | makeblastdb -in - -dbtype nucl -title mydb -out mydb</pre><p>To remove sequences with duplicate fasta headers from a fasta file:</p><pre>awk '/^&gt;/{f=!d[$1];d[$1]=1}f' in.fasta &gt; out.fasta</pre>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/1515/list-of-pharmacogenomics-companies-in-india</guid>
	<pubDate>Fri, 09 Aug 2013 13:26:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/1515/list-of-pharmacogenomics-companies-in-india</link>
	<title><![CDATA[List of pharmacogenomics companies in India]]></title>
	<description><![CDATA[<p>Pharmacogenomics companies in India are making a growing impact. Here is a list of a few of them. Please add any that are not mentioned here.</p><p>Genomics in India <br /><a href="http://www.ganitlabs.in/">www.ganitlabs.in</a> <br /><a href="http://www.sandor.co.in/">www.sandor.co.in</a> <br /><a href="http://www.igib.res.in/">www.igib.res.in</a> <br /><a href="http://www.genotypic.co.in/">www.genotypic.co.in</a> <br /><a href="http://www.ocimumbio.com/">www.ocimumbio.com</a> <br /><a href="http://www.abcgenomics.com/">www.abcgenomics.com</a> <br /><a href="http://www.xcelrisgenomics.com/">www.xcelrisgenomics.com</a> <br /><a href="http://www.ayugen.com/">www.ayugen.com</a> <br /><a href="http://www.geneombiotech.com/">www.geneombiotech.com</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36384/binding-site-prediction-in-protein</guid>
	<pubDate>Wed, 25 Apr 2018 04:35:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36384/binding-site-prediction-in-protein</link>
	<title><![CDATA[Binding Site Prediction in Protein !]]></title>
	<description><![CDATA[<p><span>The interaction between proteins and other molecules is fundamental to all biological functions. In this section we include tools that can assist in the prediction of interaction sites on the protein surface and tools for predicting the structure of the intermolecular complex formed between two or more molecules (docking).</span></p><h4>Pockets Identification</h4><p><a href="http://sts.bioengr.uic.edu/castp/" target="_blank">CASTp</a></p><div style="text-align: justify;">Automatic identification of pockets and cavities in protein structures, and quantitation of their volumes using Delaunay triangulation. Also available as a PyMOL plugin.</div><p><a href="http://www.bioinformatics.leeds.ac.uk/pocketfinder/" target="_blank">Pocket-Finder</a></p><div style="text-align: justify;">Automatic identification of pockets and cavities in protein structures, and quantitation of their volumes.</div><p><a href="http://gecco.org.chemie.uni-frankfurt.de/pocketpicker/index.html" target="_blank">PocketPicker</a></p><div style="text-align: justify;">Grid-based technique for the analysis of protein pockets. PocketPicker is available as a plugin for&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/pymol.htm">PyMOL</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><h4>Binding Site Prediction</h4>
<p><a href="http://consurf.tau.ac.il/" target="_blank">ConSurf</a></p>
</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Identification of functional regions in proteins by surface-mapping of phylogenetic information</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://www-cryst.bioc.cam.ac.uk/~crescendo/crescendo.php" target="_blank">CRESCENDO</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Identification of protein interaction sites. It uses sequence conservation patterns in homologous proteins to distinguish residues that are conserved due to structural restraints from those conserved due to functional restraints.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><strong>Ligand Binding Sites</strong></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://www.sbg.bio.ic.ac.uk/~3dligandsite/" target="_blank">3DLigandSite</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">The server utilizes protein-structure prediction to provide structural models of the binding site. Ligands bound to structures are superimposed onto the model and used to predict the binding site.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://cssb.biology.gatech.edu/skolnick/files/FINDSITE/" target="_blank">FINDSITE</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">A threading-based method for ligand-binding site prediction and functional annotation based on binding-site similarity across superimposed groups of threading templates.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">
<p><a href="http://scoppi.biotec.tu-dresden.de/pocket/" target="_blank">LIGSITE<sup>csc</sup></a></p>
<div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Prediction of binding sites by pocket identification using the Connolly surface and degree of conservation</div>
</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://metapocket.eml.org/" target="_blank">metaPocket</a><br />A meta server for ligand-binding site prediction. metaPocket uses&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#ligsite">LIGSITE<sup>csc</sup></a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#pass">PASS</a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#qsite">Q-SiteFinder</a>&nbsp;and&nbsp;<a href="http://www.biochem.ucl.ac.uk/~roman/surfnet/surfnet.html" target="_blank">SURFNET</a>.</div>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/2001/the-ontario-institute-for-cancer-research-oicr-genomics-lab-toronto-canada</guid>
  <pubDate>Mon, 12 Aug 2013 01:43:13 -0500</pubDate>
  <link></link>
  <title><![CDATA[The Ontario Institute for Cancer Research (OICR) Genomics Lab, Toronto, Canada.]]></title>
  <description><![CDATA[
<p>The Human Genome Project led to the development of a wide array of technologies to screen the genome and its products (genes, proteins, metabolites) and molecules that interact with these products (chemicals, RNAi). The existence of these tools resulted in the creation of facilities that use robotics and informatics to generate high-throughput screens of DNA, RNA, protein, tissue, chemicals and other substances.</p>

<p>The genomics platform uses cancer genome sequencing and other high-throughput techniques to identify genes critical to the development of cancer and anomalies in the genomic profile of the tumours.</p>

<p>For more info visit : http://oicr.on.ca/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/37590/parallel-processing-with-perl</guid>
	<pubDate>Sat, 25 Aug 2018 11:32:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/37590/parallel-processing-with-perl</link>
	<title><![CDATA[Parallel Processing with Perl !]]></title>
	<description><![CDATA[<p>Here is a short tutorial on making the best use of multiple processors for bioinformatics analysis. One good way is to use Perl threads and forks. It is important to understand how threads and forks work before implementing them, so some familiarity with both will help when reading this tutorial.</p><p>In bioinformatics we often need to deal with huge datasets of more than 100GB. The traditional way to analyze a file is with a while loop:</p><p>while (&lt;FILE&gt;) {</p><p>Do something;</p><p>}</p><p>This is very slow (since we are using only one processor), and if we have 500 million lines in the dataset it takes more than a day to iterate through the whole dataset. So how do we make the best use of all our processors and get the work done quickly?</p><p>Here is a very simple and efficient technique with Perl that I have been using. I am more inclined towards using Perl fork than Perl threads.</p><p>One of the oldest ways to fork is:</p><blockquote><p>my @childs;<br />my $fork = fork();<br />if (!defined $fork) { # fork returns undef on failure<br />die "Couldn't fork: $!";<br />}<br />elsif ($fork == 0) { # child process<br /><strong>your code here;</strong><br />exit(0);<br />}<br />else { # parent process: remember the child's PID<br />push(@childs, $fork);<br />}</p><p>## wait for the child processes to finish<br />foreach (@childs) {<br />my $tmp = waitpid($_, 0);<br />}</p></blockquote><p>A fork creates a child process that gets its own copy of the variables and code and runs separately from the parent process, usually on a separate processor. That&rsquo;s it! One big disadvantage of forking is that it is very difficult to share variables among the different processes. I will show you how to do it easily, but it still has its own drawbacks.</p><blockquote><p>Okay, if you really do not want to call fork directly in your code, that&rsquo;s fine too. There are many useful modules that do it for you very efficiently. 
One really useful module is Parallel::ForkManager. You can use Parallel::ForkManager to control the number of forks you want to run at once (the number of processors you want to use).</p><p><strong>Simple usage:</strong><br />use Parallel::ForkManager;<br />my $max_processors = 8;<br />my $fork = Parallel::ForkManager-&gt;new($max_processors);<br />foreach (@dna) {<br />$fork-&gt;start and next; # do the fork<br /><strong>your code here;</strong><br />$fork-&gt;finish; # do the exit in the child process<br />}<br />$fork-&gt;wait_all_children;</p></blockquote><p>This runs up to 8 child processes at a time, each doing the same work on one element of the array. When one child finishes, Parallel::ForkManager starts a new one, so all your processors stay busy analyzing the data. Now suppose you have generated 8 child processes and want them all to write to one file. You need to lock the file to do this, because otherwise the buffered output of the different processes can get interleaved. You can lock the file using the flock command.</p><blockquote><p>use Fcntl qw(:flock); # provides LOCK_EX and LOCK_UN<br />open(my $QUAL, '&gt;&gt;', 'myfile.txt') or die "can't open file: $!"; # append mode<br />flock($QUAL, LOCK_EX) or die "can't lock file: $!";<br />print $QUAL $output;<br />flock($QUAL, LOCK_UN) or die "can't unlock file: $!";<br />close $QUAL;</p></blockquote><p>I would not suggest using flock when dealing with multiple processes because it decreases processing efficiency (each child process must wait for the lock to be released by the other child processes). Instead, I would suggest having each fork write to a separate file and simply concatenating the files after processing.</p><p><strong>Putting it all together: if you have 100GB of data, you can do this</strong></p><blockquote><p><strong>step 1</strong>: split the dataset equally according to the number of processors you have. 
This may take a few hours (about 2-3 hrs for a 100GB file).<br />You can use the Unix &ldquo;split&rdquo; command for this.<br />For example:<br />my $number_split = int($number_of_entries_in_your_dataset / $max_processors);<br /># note: split -l counts lines, not records, so a multi-line record can end up cut across two files<br />system("split", "-l", $number_split, "your_file.fasta", "file_name") == 0 or die "split failed: $?";</p><p><strong>step 2</strong>: open the directory containing your split files and start Parallel::ForkManager.<br /><strong>For example:</strong><br />opendir(DIRECTORY, $split_files_directory) or die $!; ### open the directory<br />my $super_fork = Parallel::ForkManager-&gt;new($max_processors);<br />while (my $file = readdir(DIRECTORY)) { ### read the directory<br />if ($file =~ /^\./) { next; } ### skip hidden entries, including . and ..<br />print $file, "\n";<br />########## Start fork ##########<br />my $pid = $super_fork-&gt;start and next;<br /><strong>Whatever you want to do with the split file;</strong><br /><strong>analyze my piece of $file;</strong><br />######### end fork ###############<br />$super_fork-&gt;finish;<br />}<br />$super_fork-&gt;wait_all_children;<br />closedir(DIRECTORY);</p></blockquote><p>So basically each processor will be busy with its own piece of data (split file), and you have created 8 processes running at one time without interfering with each other. Again, I do not suggest writing output from each child process to one file (for the reasons above). Write the output from each fork to a separate file and concatenate the files at the end. That&rsquo;s it, you have just increased your program&rsquo;s speed by up to 8 times! Isn&rsquo;t it easy?</p><p><strong>Note:</strong><br />You may worry about concatenating the output each child generates, since it does take some time (remember, 100GB). I think you can use the MySQL LOAD DATA LOCAL INFILE command to load all the files into a single table (this should take about 3 hrs for a 100GB dataset) and then export the whole table into one file. 
This should be faster than just concatenating them with the &ldquo;cat&rdquo; command (correct me if I am wrong).</p><p>Or a much simpler way is to use pipes:</p><p>cat output_dir/* | my_pipe or my_pipe &lt;(file1) final_file;</p><p>That&rsquo;s it, guys! Enjoy programming and please do comment. I am not a computer scientist, so forgive me for any mistakes, and if you find any, please report them. Thank you.</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/4551/au-kbc-lab</guid>
  <pubDate>Sun, 15 Sep 2013 09:33:59 -0500</pubDate>
  <link></link>
  <title><![CDATA[AU-KBC Lab]]></title>
  <description><![CDATA[
<p>Conducts a Clinical Trial Management course in collaboration with Apollo Hospitals. Major bioinformatics research areas include drug discovery, functional genomics, comparative genomics and data mining.</p>

<p>More @ http://www.au-kbc.org/</p>
]]></description>
</item>

</channel>
</rss>