<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/29601?offset=440</link>
	<atom:link href="https://bioinformaticsonline.com/related/29601?offset=440" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/28051/convert-ensembl-gtf-to-annotation-table-geneid-genesymbol-genewisechrlocation-geneclass-strand-raw</guid>
	<pubDate>Fri, 24 Jun 2016 18:08:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/28051/convert-ensembl-gtf-to-annotation-table-geneid-genesymbol-genewisechrlocation-geneclass-strand-raw</link>
	<title><![CDATA[Convert EnsEMBL GTF to Annotation table (Geneid, GeneSymbol, GeneWiseChrLocation, GeneClass, Strand) Raw]]></title>
	<description><![CDATA[<p><strong>Bash Script source:</strong></p><p>https://gist.github.com/santhilalsubhash/367befcf5216be4b1fd9</p><p>&nbsp;</p><p><strong>Information</strong>:</p><p>This script converts an EnsEMBL GTF file (Ex:&nbsp;<a href="https://gist.githubusercontent.com/santhilalsubhash/1e7cca357e52a181dc25/raw/cfb803e07900a2baefbb6534f1299fd30cb57a29/sample.GTF">https://gist.githubusercontent.com/santhilalsubhash/1e7cca357e52a181dc25/raw/cfb803e07900a2baefbb6534f1299fd30cb57a29/sample.GTF</a>) to annotation table format. It generates two files:<br />1) Transcript-wise chromosome location with information about transcripts (Ex:&nbsp;<a href="https://gist.githubusercontent.com/santhilalsubhash/c7dec516e0338503a4b6/raw/de0af1a39f0005c4ce7321c5ae57fc8b4a14c7f4/sample.GTF_enst_annotation.txt">https://gist.githubusercontent.com/santhilalsubhash/c7dec516e0338503a4b6/raw/de0af1a39f0005c4ce7321c5ae57fc8b4a14c7f4/sample.GTF_enst_annotation.txt</a>)<br />2) Gene-wise chromosome location with information about genes (Ex:&nbsp;<a href="https://gist.githubusercontent.com/santhilalsubhash/c92006c5080f0333bec2/raw/d16e0b2440d73b09b486d3c9751cdb248a73fa0b/sample.GTF_ensg_annotation.txt">https://gist.githubusercontent.com/santhilalsubhash/c92006c5080f0333bec2/raw/d16e0b2440d73b09b486d3c9751cdb248a73fa0b/sample.GTF_ensg_annotation.txt</a>)</p><p>Note: You can download GTF files from&nbsp;<a href="http://www.ensembl.org/info/data/ftp/index.html">http://www.ensembl.org/info/data/ftp/index.html</a></p>]]></description>
	<dc:creator>EagleEye</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/13025/the-5-reasons-to-mistakes-at-bioinformatics-work</guid>
	<pubDate>Thu, 24 Jul 2014 02:51:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/13025/the-5-reasons-to-mistakes-at-bioinformatics-work</link>
	<title><![CDATA[The 5 reasons to mistakes at bioinformatics work !!!]]></title>
	<description><![CDATA[<p>When you're just starting out with biological programming, it's easy to run into complex problems that make you wonder how anyone has ever managed to write a program. Some problems trip up nearly every bioinformatician, from understanding the biological question to designing the program. Some mistakes are so common that even experienced biological programmers make them. Here are a few observations from my 8 years in bioinformatics, most of them snarky. Each of these reasons will make your work take longer than expected and compel you to postpone your project deadline.</p><p><strong>1. Stupid for biologist:</strong> Biology is so complex that it will make a bioinformatician feel stupid. There are no universal fixed rules; it can surprise you at any time. So be nice to the biologists who ask questions and help resolve your biological puzzles. Sometimes you will have no idea what the hell you were doing either.<br /><br /><strong>2. Puzzling why:</strong> Do not hesitate to ask questions. Especially at the beginning of a project, you will have to ask a lot of them. Instead of puzzling it out at the end, check and clear up your doubts right away, even for a single error. An unresolved doubt can lead to a wrong conclusion.<br /><br /><strong>3. Running marathon:</strong> The documentation of most biological software is incomplete. In other words, it is never more than 95 percent complete. Sometimes a single problem can halt your entire project for months. Compiling and running pipelines is tedious because almost all of their components are interdependent and need proper configuration. I faced the same kind of problem with Evolver :( &hellip; <br /><br /><strong>4. Folders missing:</strong> The pipelines generate lots of data, and we keep them in several folders for future use. 
But sometimes we delete them by mistake and have to resort to recovery&hellip;<br /><br /><strong>5. Digging deeper:</strong> Digging deeper is fruitful, but sometimes it can be catastrophic. You may get frustrated or lose direction. So keep a biologist with you for rescue&hellip; and sometimes an expert computer programmer to handle your server. Remember, the server will always go down when you need it the most.<br /><br />The most common frustrated line: Why do we do this again?</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/37581/comparativegenomics-exercise2</guid>
	<pubDate>Wed, 22 Aug 2018 22:10:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/37581/comparativegenomics-exercise2</link>
	<title><![CDATA[ComparativeGenomics Exercise2]]></title>
	<description><![CDATA[<p>COMPARATIVE MICROBIAL GENOMICS ANALYSIS WORKSHOP&nbsp; @&nbsp;cbs.dtu.dk</p><p>Free Bioinformatics workbench https://www.mn.uio.no/ifi/english/research/networks/clsi/earlier_seminars/2012/tammivesth_osloseminarfinal.pdf</p>]]></description>
	<dc:creator>Neel</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/37581" length="139956" type="application/pdf" />
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/13337/phd-opportunity-at-universite-de-liege-belgium</guid>
  <pubDate>Sat, 02 Aug 2014 01:12:43 -0500</pubDate>
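  <link>https://bioinformaticsonline.com/opportunity/view/13337/phd-opportunity-at-universite-de-liege-belgium</link>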
  <title><![CDATA[PhD opportunity at Université de Liège - Belgium]]></title>
  <description><![CDATA[
<p>PhD opportunity at Université de Liège - Belgium</p>

<p>The Bioinformatics and Systems Biology Unit of Université de Liège (Belgium) is looking for a highly motivated master's student with programming skills for a PhD thesis project (4 years, fully funded). The goal is to design computational tools that use literature, genomic and structural data to infer regulatory and metabolic networks.</p>

<p>Applicants are invited to send their resume and a recommendation letter to Prof. Patrick Meyer (more details at www.biosys.ulg.ac.be).</p>

<p>For more information: www.biosys.ulg.ac.be</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</guid>
	<pubDate>Fri, 19 Oct 2018 07:25:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</link>
	<title><![CDATA[BASE: a practical de novo assembler for large genomes using long NGS reads]]></title>
	<description><![CDATA[<p><span>BASE is a new&nbsp;</span><em>de novo</em><span>&nbsp;assembler. It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have a high probability of appearing uniquely in the genome. These seeds form the basis for BASE to build extension trees and then to use reverse validation to remove branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of the reads sharing the seeds. These consensus sequences are then extended to contigs.</span></p><p>Address of the bookmark: <a href="https://github.com/dhlbh/BASE" rel="nofollow">https://github.com/dhlbh/BASE</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/13523/megadock-40</guid>
	<pubDate>Thu, 07 Aug 2014 18:08:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/13523/megadock-40</link>
	<title><![CDATA[MEGADOCK 4.0]]></title>
	<description><![CDATA[<p>An ultra&ndash;high-performance protein&ndash;protein docking software for heterogeneous supercomputers</p>
<p id="p-4"><strong>Summary:</strong> The application of protein&ndash;protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of over 97% strong scaling.</p>
<p id="p-5"><strong>Availability and Implementation:</strong> MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at: <a href="http://www.bi.cs.titech.ac.jp/megadock">http://www.bi.cs.titech.ac.jp/megadock</a>.</p>
<p id="p-6"><strong>Contact:</strong> <a href="mailto:akiyama@cs.titech.ac.jp">akiyama@cs.titech.ac.jp</a></p><p>Address of the bookmark: <a href="http://bioinformatics.oxfordjournals.org/content/early/2014/08/06/bioinformatics.btu532.short" rel="nofollow">http://bioinformatics.oxfordjournals.org/content/early/2014/08/06/bioinformatics.btu532.short</a></p>]]></description>
	<dc:creator>Suleman Khan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/38618/canu-genome-assembly-parameters</guid>
	<pubDate>Mon, 07 Jan 2019 08:40:37 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/38618/canu-genome-assembly-parameters</link>
	<title><![CDATA[CANU genome assembly parameters !]]></title>
	<description><![CDATA[<p>Choose the appropriate parameters for Canu and run it. The assembly will take about an hour. You can use two cores (parameter&nbsp;<code>-maxThreads=2</code>). Since we compute on a single Amazon server, disable the option to compute on a cluster with&nbsp;<code>useGrid=false</code>. For your own project, discuss these settings with a local computing guru. Parameters in square brackets&nbsp;<code>[]</code>&nbsp;are optional; the symbol&nbsp;<code>|</code>&nbsp;stands for "or".</p><pre><code>usage:   canu [-correct | -trim | -assemble | -trim-assemble] \
              [-s ] \
               -p  \
               -d  \
               genomeSize=[g|m|k] \
               -maxThreads=2 \
               useGrid=false \
              [other-options] \
               read_file.fastq.gz
</code></pre><p>A default&nbsp;<code>Canu</code>&nbsp;run usually produces a high-quality assembly; an example of a command that was used for testing can be found below. However, there are still a lot of parameters that can be tweaked. For example, depending on whether we want to assemble haplotypes separately or smash them together, we can alter the error correction process.</p><pre><code>canu -p test_asmbl \
     -d asm_test3 \
     genomeSize=2m \
     -maxThreads=2 useGrid=false \
     -pacbio-raw ~/pacbio/dna/sample_reads.fastq.gz</code></pre><p>There is a brilliant&nbsp;<a href="http://canu.readthedocs.io/en/latest/faq.html#what-parameters-can-i-tweak">section in the documentation</a>&nbsp;about parameter tweaking.</p><p>The output directory will contain many files. The most interesting ones are:</p><ul>
<li><code>*.correctedReads.fasta.gz</code>&nbsp;: file containing the input sequences after correction, trimmed and split based on consensus evidence.</li>
<li><code>*.trimmedReads.fastq</code>&nbsp;: file containing the sequences after correction and final trimming</li>
<li><code>*.layout</code>&nbsp;: file containing information about read inclusion in the final assembly</li>
<li><code>*.gfa</code>&nbsp;: file containing the assembly graph produced by Canu</li>
<li><code>*.contigs.fasta</code>&nbsp;: file containing everything that could be assembled and is part of the primary assembly</li>
</ul><p>The basic statistics of the assembly can be read from the reports generated by the assembler, or calculated using standard UNIX command-line tools.</p><p>More at&nbsp;https://canu.readthedocs.io/en/latest/faq.html</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39674/simka-and-simkamin-are-comparative-metagenomics-method-dedicated-to-ngs-datasets</guid>
	<pubDate>Sat, 06 Jul 2019 13:56:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39674/simka-and-simkamin-are-comparative-metagenomics-method-dedicated-to-ngs-datasets</link>
	<title><![CDATA[Simka and SimkaMin are comparative metagenomics method dedicated to NGS datasets]]></title>
	<description><![CDATA[<p>Simka is a de novo comparative metagenomics tool. Simka represents each dataset as a k-mer spectrum and computes several classical ecological distances between them.</p>
<p>Developer:&nbsp;<a href="http://people.rennes.inria.fr/Gaetan.Benoit/">Ga&euml;tan Benoit</a>, PhD, former member of the&nbsp;<a href="http://team.inria.fr/genscale/">Genscale</a>&nbsp;team at Inria.</p>
<p>Contact: claire dot lemaitre at inria dot fr</p>
<p><span>Simka and SimkaMin are comparative metagenomics methods dedicated to NGS datasets.&nbsp;</span><span><a href="https://gatb.inria.fr/software/simka/">https://gatb.inria.fr/software/simka/</a></span></p><p>Address of the bookmark: <a href="https://github.com/GATB/simka" rel="nofollow">https://github.com/GATB/simka</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/14186/pybedtools</guid>
	<pubDate>Wed, 20 Aug 2014 01:03:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/14186/pybedtools</link>
	<title><![CDATA[pybedtools]]></title>
	<description><![CDATA[<p>pybedtools is a Python wrapper for Aaron Quinlan's BEDTools programs (https://github.com/arq5x/bedtools), which are widely used for genomic interval manipulation or "genome algebra". pybedtools extends BEDTools by offering feature-level manipulations from within Python. See full online documentation, including installation instructions, at http://pythonhosted.org/pybedtools/.</p><p>More at http://pythonhosted.org/pybedtools/</p><p>A powerful toolset for genome arithmetic. http://code.google.com/p/bedtools/</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40604/gapfinisher-a-reliable-gap-filling-pipeline-for-sspace-longread-scaffolder-output</guid>
	<pubDate>Fri, 24 Jan 2020 06:04:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40604/gapfinisher-a-reliable-gap-filling-pipeline-for-sspace-longread-scaffolder-output</link>
	<title><![CDATA[gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output]]></title>
	<description><![CDATA[<p><span>gapFinisher is based on the controlled use of a previously published gap filling tool, FGAP, and works on all standard Linux/UNIX command lines. The authors compare the performance of gapFinisher against two other published gap filling tools, PBJelly and GMcloser.</span></p>
<p><span>gapFinisher can fill gaps in draft genomes quickly and reliably.</span></p><p>Address of the bookmark: <a href="https://github.com/kammoji/gapFinisher" rel="nofollow">https://github.com/kammoji/gapFinisher</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>