<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37590?offset=1080</link>
	<atom:link href="https://bioinformaticsonline.com/related/37590?offset=1080" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36197/bioinformatics-oneliner</guid>
	<pubDate>Tue, 10 Apr 2018 04:13:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36197/bioinformatics-oneliner</link>
	<title><![CDATA[Bioinformatics OneLiner]]></title>
	<description><![CDATA[<p>To remove all line ends (\n) from a Unix text file:</p><pre>sed ':a;N;$!ba;s/\n//g' filename.txt &gt; newfilename_oneline.txt</pre><p>To get average for a column of numbers (here the second column $2):</p><pre>awk '{ sum += $2; n++ } END { if (n &gt; 0) print sum / n; }'</pre><p>To get sequence length for all sequences in a fasta file:</p><pre>awk '/^&gt;/ {if (seqlen){print seqlen}; print ;seqlen=0;next; } { seqlen = seqlen +length($0)}END{print seqlen}' \<br />filename.fasta</pre><p>To copy (move, rename, etc) files based on their list in a text file:</p><pre>cat file_list.txt | while read line; do cp "$line" complete_dataset/"$line"; done</pre><p>To split bam files into sets with mapped and unmapped reads:</p><pre>samtools view -F4 sample.bam &gt; sample.mapped.sam<br />samtools view -f4 sample.bam &gt; sample.unmapped.sam</pre><p>To gzip all your fastq files using gnu parallel and gzip:</p><pre>parallel gzip ::: *.fastq</pre><p>To gzip all your fastq files using pigz:</p><pre>pigz *.fastq</pre><p>To count all sequences in a fasta file:</p><pre>grep "^&gt;" yourfile.fasta -c</pre><p>To count all sequences in all fasta files in your current directory:</p><pre>for a in *.fasta; do ls $a; grep "^&gt;" -c $a; done</pre><p>To keep only one copy of duplicated lines:</p><pre>awk '!seen[$0]++'</pre><p>To sum assembly size from SPAdes contigs.fasta or scaffolds.fasta file:</p><pre>grep "^&gt;" scaffolds.fasta | cut -f 4 -d '_' | paste -sd+ | bc</pre><p>To remove everything after the first space at each line, e.g. 
to simplify fasta headers:</p><pre>cut -d' ' -f1 &lt; your_file</pre><p>To count reads in all .fastq.gz files in your current folder (fast, using gnu parallel):</p><pre>parallel "echo {} &amp;&amp; gunzip -c {} | wc -l | awk '{d=\$1; print d/4;}'" ::: *.gz</pre><p>To count reads in all .fastq.gz files in your current folder:</p><pre>zcat *.gz | echo $((`wc -l`/4))</pre><p>To count reads in all .fastq files in your current folder:</p><pre>cat *.fastq | echo $((`wc -l`/4))</pre><p>To count base pairs in all .fastq.gz files in your current folder:</p><pre>zcat *.fastq.gz | paste - - - - | cut -f 2 | tr -d '\n' | wc -c</pre><p>To split a multifasta file into many fasta files:</p><pre>awk '/^&gt;/ {OUT=substr($0,2) ".fa"}; {print &gt;&gt; OUT; close(OUT)}' Input_File</pre><p>To convert Illumina FASTQ 1.3 to 1.8:</p><pre>sed -e '4~4y/@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghi/!"#$%&amp;'\''()*+,-.\/0123456789:;&lt;=&gt;?@ABCDEFGHIJ/' f.fastq</pre><p>To convert FASTQ to FASTA:</p><pre>sed -n '1~4s/^@/&gt;/p;2~4p'</pre><p>To get fastq read length distribution:</p><pre>cat reads.fastq | awk '{if(NR%4==2) print length($1)}' | sort | uniq -c</pre><p>To deinterleave an interleaved fastq file:</p><pre>cat myf.fq | paste - - - - - - - - | tee &gt;(cut -f 1-4 | tr "\t" "\n" &gt; myfile_1.fq) | cut -f 5-8 | \<br />tr "\t" "\n" &gt; myf2.fq</pre><p>To filter and sort contig identifiers from SPAdes assembly (e.g. 
here length &gt;= 4000 and coverage &gt;= 100):</p><pre>grep "^&gt;" scaffolds.fasta | sed s"/_/ /"g | awk '{ if ($4 &gt;= 4000 &amp;&amp; $6 &gt;= 100) print $0 }' | sort -k 4 -n | \<br />sed s"/ /_/"g</pre><p>To append something to all headers of your fasta files:</p><pre>sed 's/&gt;.*/&amp;YOURSTRING/' filename.fasta &gt; new_filename.fasta</pre><p>To replace/squeeze multiple adjacent spaces by only one space:</p><pre>tr -s " " &lt; file</pre><p>To filter fastq based on length (here larger than or equal to 21, but smaller than or equal to 25):</p><pre>cat your.fastq | paste - - - - | awk -F'\t' 'length($2) &gt;= 21 &amp;&amp; length($2) &lt;= 25' | sed 's/\t/\n/g' &gt; filtered.fastq</pre><p>To print difference between the last and first row in 5th column:</p><pre>awk 'NR==1 {first=$5} {last=$5} END {print last-first}' myfile.txt</pre><p>To sample only the first 200 bases from all sequences in a multifasta file (e.g. from assembly scaffolds.fasta file here):</p><pre>awk '/^&gt;/{ seqlen=0; print; next; } seqlen &lt; 200 { if (seqlen + length($0) &gt; 200) $0 = substr($0, 1, 200-seqlen);\<br /> seqlen += length($0); print }' scaffolds.fasta &gt; 200bp_scaffolds.fasta</pre><p>To pipe a compressed fasta file directly into makeblastdb:</p><pre>gunzip -c fasta.gz | makeblastdb -in -</pre><p>To remove sequences with duplicate fasta headers from a fasta file:</p><pre>awk '/^&gt;/{f=!d[$1];d[$1]=1}f' in.fasta &gt; out.fasta</pre>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/2337/clinical-genomics-informatics-europe-at-lisbon-portugal</guid>
  <pubDate>Wed, 14 Aug 2013 09:58:34 -0500</pubDate>
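  <link>https://bioinformaticsonline.com/opportunity/view/2337/clinical-genomics-informatics-europe-at-lisbon-portugal</link>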
  <title><![CDATA[Clinical Genomics & Informatics Europe at Lisbon, Portugal]]></title>
  <description><![CDATA[
<p>Bio-IT World and Cambridge Healthtech Institute's fifth international Clinical Genomics &amp; Informatics Europe conference will feature four main tracks on Clinical Exome Sequencing, High Scale Computing, Genome Informatics, and RNA-Seq and Transcriptome Analysis, as well as two pre-conference symposia on Clinical Epigenetics and Quantitative Digital Detection Technologies. The conference will tackle the huge amounts of sequencing data produced by new technologies that have introduced significant challenges for bioinformatics, both in terms of the analysis and interpretation of data and clinical implementation of novel variants. Members of the international community will come together to look at the science and informatics required to utilize next generation sequencing for the molecular diagnosis of complex diseases.</p>

<p>Dates: 04 Dec 2013 - 06 Dec 2013</p>

<p>More at: http://www.clinicalgenomicsinformatics.com/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36384/binding-site-prediction-in-protein</guid>
	<pubDate>Wed, 25 Apr 2018 04:35:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36384/binding-site-prediction-in-protein</link>
	<title><![CDATA[Binding Site Prediction in Protein !]]></title>
	<description><![CDATA[<p><span>The interaction between proteins and other molecules is fundamental to all biological functions. This section lists tools that assist in predicting interaction sites on protein surfaces and tools for predicting the structure of the intermolecular complex formed between two or more molecules (docking).</span></p><h4>Pocket Identification</h4><p><a href="http://sts.bioengr.uic.edu/castp/" target="_blank">CASTp</a></p><div style="text-align: justify;">Automatic identification of pockets and cavities in protein structures, and quantitation of their volumes using Delaunay triangulation. Also available as a PyMOL plugin.</div><p><a href="http://www.bioinformatics.leeds.ac.uk/pocketfinder/" target="_blank">Pocket-Finder</a></p><div style="text-align: justify;">Automatic identification of pockets and cavities in protein structures, and quantitation of their volumes.</div><p><a href="http://gecco.org.chemie.uni-frankfurt.de/pocketpicker/index.html" target="_blank">PocketPicker</a></p><div style="text-align: justify;">Grid-based technique for the analysis of protein pockets. PocketPicker is available as a plugin for&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/pymol.htm">PyMOL</a>.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><h4>Binding Site Prediction</h4>
<p><a href="http://consurf.tau.ac.il/" target="_blank">ConSurf</a></p>
</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Identification of functional regions in proteins by surface-mapping of phylogenetic information</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://www-cryst.bioc.cam.ac.uk/~crescendo/crescendo.php" target="_blank">CRESCENDO</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Identification protein interaction sites. It uses sequence conservation patterns in homologous proteins to distinguish between residues that are conserved due to structural restraints from those due to functional restraints.&nbsp;&nbsp;</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><strong>Ligand Binding Sites</strong></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://www.sbg.bio.ic.ac.uk/~3dligandsite/" target="_blank">3DLigandSite</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">The server utilizes protein-structure prediction to provide structural models of the binding site. Ligands bound to structures are superimposed onto the model and use to predict the binding site.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">F<a href="http://cssb.biology.gatech.edu/skolnick/files/FINDSITE/" target="_blank">INDSITE</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">A threading-based method for ligand-binding site prediction and functional annotation based on binding-site similarity across superimposed groups of threading templates.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">
<p><a href="http://scoppi.biotec.tu-dresden.de/pocket/" target="_blank">LIGSITE<sup>csc</sup></a></p>
<div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Prediction of binding site by pocket identification using the Connolly surface and degree of conservation</div>
</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://metapocket.eml.org/" target="_blank">metaPocket</a>A meta server for ligand-binding site prediction. metaPocket use&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#ligsite">LIGSITE<sup>csc</sup></a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#pass">PASS</a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#qsite">Q-SiteFinder</a>&nbsp;and&nbsp;<a href="http://www.biochem.ucl.ac.uk/~roman/surfnet/surfnet.html" target="_blank">SURFNET</a></div>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/2423/cancers-origins-revealed</guid>
	<pubDate>Thu, 15 Aug 2013 13:06:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/2423/cancers-origins-revealed</link>
	<title><![CDATA[Cancer's origins revealed]]></title>
	<description><![CDATA[<p>Researchers have provided the first comprehensive compendium of mutational processes that drive tumour development. Together, these mutational processes explain most mutations found in 30 of the most common cancer types. This new understanding of cancer development could help to treat and prevent a wide range of cancers.<br /><br />More at &gt;&gt; http://www.sanger.ac.uk/about/press/2013/130814.html</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/37610/applied-statistics-for-bioinformatics-using-r</guid>
	<pubDate>Thu, 30 Aug 2018 03:45:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/37610/applied-statistics-for-bioinformatics-using-r</link>
	<title><![CDATA[Applied Statistics for Bioinformatics using R]]></title>
	<description><![CDATA[<p>The purpose of this book is to give an introduction to statistics for solving problems in bioinformatics. Statistics provides procedures to explore and visualize data as well as to test biological hypotheses. The book is introductory, explaining and programming elementary statistical concepts, thereby bridging the gap between high school level and the specialized statistical literature.</p>]]></description>
	<dc:creator>Neel</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/37610" length="1368378" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/2464/computer-theory-genetics-george-chao-at-tedxumnsalon</guid>
	<pubDate>Thu, 15 Aug 2013 22:08:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/2464/computer-theory-genetics-george-chao-at-tedxumnsalon</link>
	<title><![CDATA[Computer Theory & Genetics: George Chao at TEDxUMNSalon]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/7_GL17oiak8" frameborder="0" allowfullscreen></iframe>George Chao is an undergraduate senior studying Genetics and Computer Science at the University of Minnesota. Having started genetics research as soon as he entered the university, he has worked in labs spanning multiple disciplines, as well as in Japan. His research has included developmental genetics in Drosophila, computational techniques for analyzing protein interactions, and helping to develop algorithms that analyze motion capture data of patients with neck pain. During this time, George steadily developed a fascination with bioinformatics, the use of computational techniques to learn from genetic data. He would like to pursue a research career applying bioinformatics in various fields.

----

The individuals involved with TEDxUMN have a passion for bringing together the great thinkers at the University of Minnesota and giving them the opportunity to share their ideas worth spreading and to discuss our shared future. We provide these great people the opportunity to share these ideas on a global stage and with an incredibly diverse audience. We believe in the power of ideas to change attitudes, lives and ultimately the world.

Check out TEDxUMN at http://www.TEDxUMN.com/

In the spirit of ideas worth spreading, TEDx is a program of local, self-organized events that bring people together to share a TED-like experience. At a TEDx event, TEDTalks video and live speakers combine to spark deep discussion and connection in a small group. These local, self-organized events are branded TEDx, where x = independently organized TED event. The TED Conference provides general guidance for the TEDx program, but individual TEDx events are self-organized.* (*Subject to certain rules and regulations)]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38487/betsy-a-new-backward-chaining-expert-system-for-automated-development-of-pipelines-in-bioinformatics</guid>
	<pubDate>Mon, 17 Dec 2018 18:46:51 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38487/betsy-a-new-backward-chaining-expert-system-for-automated-development-of-pipelines-in-bioinformatics</link>
	<title><![CDATA[BETSY: A new backward-chaining expert system for automated development of pipelines in Bioinformatics]]></title>
	<description><![CDATA[<p>BETSY provides a command-line interface and is available at&nbsp;<a href="https://github.com/jefftc/changlab">https://github.com/jefftc/changlab</a>. A user first searches the knowledge base for the desired output; BETSY then develops an initial workflow to produce that data, which the user examines. The user can then adjust the parameters and the preprocessing algorithm, and normalize the data depending on the task.</p>
<p>Currently, BETSY includes modules for microarray and next-generation sequencing data [4], such as expression analysis, classification, peak calling, and visualization.</p><p>Address of the bookmark: <a href="https://github.com/jefftc/changlab" rel="nofollow">https://github.com/jefftc/changlab</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/2680/4-positions-in-high-throughput-computational-metagenomics-and-systems-biology-of-natural-products</guid>
  <pubDate>Tue, 20 Aug 2013 08:42:29 -0500</pubDate>
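  <link>https://bioinformaticsonline.com/opportunity/view/2680/4-positions-in-high-throughput-computational-metagenomics-and-systems-biology-of-natural-products</link>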
  <title><![CDATA[4 positions in high throughput computational metagenomics and systems biology of natural products]]></title>
  <description><![CDATA[
<p>The Research and Innovation Centre at the Fondazione Edmund Mach (CRI-FEM) is a major international research institution with strong and expanding research interests in Fruit Genomics, Quality Health and Nutrition of Agricultural Products, Agro-ecosystems Sustainability, Biodiversity and Molecular Ecology.</p>

<p>CRI-FEM hosts GMPF, an International PhD Program in Genomics and Molecular Physiology of Fruit Crops and Fox-Lab, an international initiative in forest and wood research.<br />4 positions in high throughput computational metagenomics and systems biology of natural products - deadline September 30th, 2013</p>

<p>To support interdisciplinary research, CRI-FEM has established the Computational Biology Centre (CBC).</p>

<p>The mission of CBC is to develop systems-level integrative approaches connecting genotype to phenotype with a special focus on genome-wide analyses and next generation sequencing technologies. </p>

<p>CRI-FEM is seeking to attract 4 high calibre scientists in the areas of high throughput computational metagenomics and systems biology of natural products.</p>

<p>The 4 positions are listed below:</p>

<p>http://www.fmach.it/eng/Servizi-Generali/Lavora-con-noi/Annunci-lavoro-e-borse-di-studio/Details-of-the-5-positions-in-high-throughput-computational-metagenomics-and-systems-biology-of-natural-products-deadline-September-30th-2013/Post-doc-in-Metagenomics-screening-and-characterization-of-bioactive-microbial-compounds-130_CRI_MSC</p>

<p>http://www.fmach.it/eng/Servizi-Generali/Lavora-con-noi/Annunci-lavoro-e-borse-di-studio/Details-of-the-5-positions-in-high-throughput-computational-metagenomics-and-systems-biology-of-natural-products-deadline-September-30th-2013/Post-doc-in-Modeling-transcriptional-control-programs-at-a-genome-wide-scale-131_CRI_TCP</p>

<p>http://www.fmach.it/eng/Servizi-Generali/Lavora-con-noi/Annunci-lavoro-e-borse-di-studio/Details-of-the-5-positions-in-high-throughput-computational-metagenomics-and-systems-biology-of-natural-products-deadline-September-30th-2013/Technologist-in-Purification-of-plant-bioactive-molecules-from-complex-matrixes-132_CRI_PBM</p>

<p>http://www.fmach.it/eng/Servizi-Generali/Lavora-con-noi/Annunci-lavoro-e-borse-di-studio/Details-of-the-5-positions-in-high-throughput-computational-metagenomics-and-systems-biology-of-natural-products-deadline-September-30th-2013/Researcher-in-Methods-for-algorithmic-and-integrative-genomics-for-metagenomics-134_CRI_AIG</p>

<p>For more information on the CBC or informal inquiries on the advertised positions please contact Dr Duccio Cavalieri (e-mail duccio.cavalieri@fmach.it).</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/39603/tenure-track-position-in-bioinformatics-at-institute-of-neurobiology-unam-queretaro-mexico</guid>
  <pubDate>Mon, 10 Jun 2019 00:48:54 -0500</pubDate>
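  <link>https://bioinformaticsonline.com/opportunity/view/39603/tenure-track-position-in-bioinformatics-at-institute-of-neurobiology-unam-queretaro-mexico</link>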
  <title><![CDATA[Tenure Track position in Bioinformatics at Institute of Neurobiology, UNAM, Querétaro, México]]></title>
  <description><![CDATA[
<p>The Institute of Neurobiology UNAM (www.inb.unam.mx) offers a tenure-track position at the level of Assistant Professor (Investigador Asociado C) to develop an original research program in Bioinformatics with applications to neuroscience and to establish multidisciplinary collaboration with other members of the Institute. Applicants are expected to have a doctorate degree, postdoctoral experience related to bioinformatics or genome biology, and a strong track record of peer-reviewed publications. No previous experience in neuroscience is required.</p>

<p>Interested applicants must submit CV and addresses of three references to ataulfo@unam.mx.</p>

<p>Tenure Track position in Genomic Sciences  </p>

<p>Laboratorio Internacional de Investigación sobre el Genoma Humano, UNAM Juriquilla, Querétaro, México </p>

<p>The International Laboratory for Human Genome Research, LIIGH-UNAM (www.liigh.unam.mx) offers a tenure-track position at the level of Assistant Professor (Investigador Asociado C) to perform research, teaching and formation of human resources in the area of: “Genomics of Mendelian Diseases” </p>

<p>Applicants are expected to have a doctorate degree, postdoctoral experience related to the above mentioned area and a strong track record of peer-reviewed publications. Interested applicants must submit CV, email addresses of three references, and a three-page project to Dr. Rafael Palacios, Coordinator of LIIGH-UNAM (palacios@liigh.unam.mx) before June 21, 2019.</p>

<p>Tenure Track position in Genomic Sciences </p>

<p>Laboratorio Internacional de Investigación sobre el Genoma Humano, UNAM Juriquilla, Querétaro, México </p>

<p>The International Laboratory for Human Genome Research, LIIGH-UNAM (www.liigh.unam.mx) offers a tenure-track position at the level of Assistant Professor (Investigador Asociado C) to perform research, teaching and formation of human resources in the area of: “Statistic Population Genomics and its Impact in Complex Diseases” </p>

<p>Applicants are expected to have a doctorate degree, postdoctoral experience related to the above mentioned area and a strong track record of peer-reviewed publications. Interested applicants must submit CV, email addresses of three references, and a three-page statement of research interests to Dr. Rafael Palacios, Coordinator of LIIGH-UNAM (palacios@liigh.unam.mx) before June 21, 2019</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/2759/dynamic-programming-alignment</guid>
	<pubDate>Thu, 22 Aug 2013 09:38:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/2759/dynamic-programming-alignment</link>
	<title><![CDATA[Dynamic Programming Alignment]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/EWJnDMKBEv0" frameborder="0" allowfullscreen></iframe>Lecture 9, Chem. C100, Spring 2013, UCLA]]></description>
	
</item>

</channel>
</rss>