<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/27216?offset=710</link>
	<atom:link href="https://bioinformaticsonline.com/related/27216?offset=710" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/22388/perl-one-liner-basics</guid>
	<pubDate>Sun, 24 May 2015 09:28:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/22388/perl-one-liner-basics</link>
	<title><![CDATA[Perl One liner basics !!]]></title>
	<description><![CDATA[<p>Perl has a ton of command line switches (see perldoc perlrun), but I'm just going to cover the ones you'll commonly need to debug code. The most important switch is -e, for execute (or maybe "engage" :) ). The -e switch takes a quoted string of Perl code and executes it. For example:<br /><br />$ perl -e 'print "Hello, World!\n"'<br />Hello, World!<br /><br />It's important that you use single-quotes to quote the code for -e, so that the shell does not expand Perl variables such as $_ before Perl sees them. This usually means you can't use single-quotes within the one-liner code. If you're using Windows cmd.exe or PowerShell, you must use double-quotes instead.<br /><br />I'm always forgetting what Perl's predefined special variables do, and I often test them at the command line with a one-liner to see what they contain. For instance, do you remember what $^O is?<br /><br />$ perl -e 'print "$^O\n"'<br />linux<br /><br />It's the operating system name. With that cleared up, let's see what else we can do. If you're using a relatively new Perl (5.10.0 or higher), you can use the -E switch instead of -e. This turns on some of Perl's newer features, like say, which prints a string and appends a newline to it. This saves typing and makes the code cleaner:<br /><br />$ perl -E 'say "$^O"'<br />linux<br /><br />Pretty handy! say is a nifty feature that you'll use again and again.</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40604/gapfinisher-a-reliable-gap-filling-pipeline-for-sspace-longread-scaffolder-output</guid>
	<pubDate>Fri, 24 Jan 2020 06:04:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40604/gapfinisher-a-reliable-gap-filling-pipeline-for-sspace-longread-scaffolder-output</link>
	<title><![CDATA[gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output]]></title>
	<description><![CDATA[<p><span>gapFinisher is based on the controlled use of the previously published gap filling tool FGAP and works on all standard Linux/UNIX command lines. The authors compare the performance of gapFinisher against two other published gap filling tools, PBJelly and GMcloser. </span></p>
<p><span>gapFinisher can fill gaps in draft genomes quickly and reliably.</span></p><p>Address of the bookmark: <a href="https://github.com/kammoji/gapFinisher" rel="nofollow">https://github.com/kammoji/gapFinisher</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41730/parliament2-runs-a-combination-of-tools-to-generate-structural-variant-calls-on-whole-genome-sequencing-data</guid>
	<pubDate>Thu, 28 May 2020 21:57:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41730/parliament2-runs-a-combination-of-tools-to-generate-structural-variant-calls-on-whole-genome-sequencing-data</link>
	<title><![CDATA[Parliament2: Runs a combination of tools to generate structural variant calls on whole-genome sequencing data]]></title>
	<description><![CDATA[<p>Parliament2 identifies structural variants in a given sample relative to a reference genome. These structural variants are large events that are called as deletions of a region, insertions of a sequence into a region, duplications of a region, inversions of a region, or translocations between two regions in the genome.</p>
<p>Parliament2 runs a combination of tools to generate structural variant calls on whole-genome sequencing data. It can run the following callers: Breakdancer, Breakseq2, CNVnator, Delly2, Manta, and Lumpy. Because of synergies in how the programs use computational resources, these are all run in parallel. Parliament2 will produce the outputs of each of the tools for subsequent investigation.</p><p>Address of the bookmark: <a href="https://github.com/dnanexus/parliament2" rel="nofollow">https://github.com/dnanexus/parliament2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/22416/rosenberg-lab</guid>
  <pubDate>Wed, 27 May 2015 17:52:24 -0500</pubDate>
  <link></link>
  <title><![CDATA[Rosenberg lab]]></title>
  <description><![CDATA[
<p>Research in the lab focuses on mathematical, statistical, and computational problems in evolutionary biology and human genetics. Long-term interests of the lab include topics such as:</p>

<ul>
<li>Human genetic variation</li>
<li>Inference of human evolutionary history from genetic markers</li>
<li>Statistical analysis of population-genetic data</li>
<li>Mathematical models of gene genealogies</li>
<li>Theoretical population genetics</li>
<li>Combinatorics of evolutionary trees</li>
<li>The relationship between gene trees and species trees</li>
<li>The role of human evolutionary genetics in the search for genes that contribute to disease susceptibility</li>
</ul>
<p>More at https://web.stanford.edu/group/rosenberglab/index.html</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/43227/project-associate-i-project-associate-ii-senior-project-associate-igib</guid>
  <pubDate>Thu, 05 Aug 2021 16:11:32 -0500</pubDate>
  <link></link>
  <title><![CDATA[Project Associate-I | Project Associate-II | Senior Project Associate @ IGIB]]></title>
  <description><![CDATA[
<p>Experience in Next Generation Sequencing (NGS) applications and an interest in genomics/clinical/translational applications, OR good computational programming skills and a deep interest in working at the interface of genomics and clinical applications.</p>

<p>Project Scientist-I <br />Experimental/computational analysis experience in high-throughput genomics/clinical applications.</p>

<p>Project Manager <br />Experience in handling large biological projects involving high-throughput genomics/clinical applications.</p>

<p>Scientific Administrative Assistant <br />Lab Work. </p>

<p>More at https://vinodscaria.genomes.in/positionsopen</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/22437/jrf-bioinformatics-icar-national-research-centre-for-orchids-pakyong</guid>
  <pubDate>Thu, 28 May 2015 19:33:19 -0500</pubDate>
  <link></link>
  <title><![CDATA[JRF Bioinformatics @ ICAR - National Research Centre for Orchids  Pakyong]]></title>
  <description><![CDATA[
<p>ICAR - National Research Centre for Orchids</p>

<p>Pakyong</p>

<p>F.No:NRCO/Admn/DBT /136 /</p>

<p>Walk-in interviews will be held at ICAR - National Research Centre for Orchids, Pakyong 737106, Sikkim, for the post of 01 (one) Junior Research Fellow under the project ‘DBT’s Twinning programme for the NE’ titled “Assessment of chemical and genetic divergence of some fragrant orchids of north-east India for sustainable improvement of community livelihood”, indicated below. The appointment will be on a contractual basis and the incumbent shall not have any regular appointment in ICAR.</p>

<p>‘DBT’s Twinning programme for the NE’ titled “Assessment of chemical and genetic divergence of some fragrant orchids of north-east India for sustainable improvement of community livelihood”</p>

<p>Junior Research Fellow (One post)</p>

<p>Essential Qualification : a. MSc (with NET qualification) / M.Tech degree (with or without NET) with minimum 55% marks in Biotechnology/ Bioinformatics/ Molecular Biology or any other related field.</p>

<p>Desirable Qualification: Computer skills (Linux, Perl, Java, MySQL) with experience in advanced molecular biology techniques</p>

<p>2nd June 2015</p>

<p>Advertisement: www.nrcorchids.nic.in/Employments/Vacancy%20-%20JRF.pdf</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBduk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/22570/frequent-words-problem-solution-by-perl</guid>
	<pubDate>Tue, 09 Jun 2015 23:38:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/22570/frequent-words-problem-solution-by-perl</link>
	<title><![CDATA[Frequent words problem solution by Perl]]></title>
	<description><![CDATA[<div><p>Solved with Perl: <a href="http://rosalind.info/problems/1a/">http://rosalind.info/problems/1a/</a></p><p>#Find the most frequent k-mers in a string.<br />#Given: A DNA string Text and an integer k.<br />#Return: All most frequent k-mers in Text (in any order).<br /><br />use strict;<br />use warnings;<br /><br />my $string="ACGTTGCATGTCGCATGATGCATGAGAGCT";<br />my $kmer=4;<br />my %myHash; #k-mer =&gt; number of occurrences<br />my $max=0;<br /><br />#Slide a window of length $kmer over the string and count each k-mer<br />for (my $aa=0; $aa&lt;=(length($string)-$kmer); $aa++) {<br />&nbsp;&nbsp; &nbsp;my $myStr=substr($string, $aa, $kmer);<br />&nbsp;&nbsp; &nbsp;#print "$myStr\n";<br />&nbsp;&nbsp; &nbsp;my $km=kmerMatch($string, $myStr, $kmer);<br />&nbsp;&nbsp; &nbsp;if ($km &gt; $max) { $max = $km;}<br />&nbsp;&nbsp; &nbsp;#print "$km\t$myStr\n";<br />&nbsp;&nbsp; &nbsp;$myHash{$myStr}=$km;<br />}<br /><br />#Print all k-mers whose count equals the maximum<br />foreach my $name (keys %myHash){<br />&nbsp;&nbsp;&nbsp; print "$name " if $myHash{$name} == $max;<br />}<br />print "\n";<br /><br />sub kmerMatch { #Count exact matches of a k-mer with a sliding window<br />my ($string, $myStr, $kmer)=@_;<br />my $count=0;<br />for (my $aa=0; $aa&lt;=(length($string)-$kmer); $aa++) {<br />&nbsp;&nbsp; &nbsp;my $myWin=substr($string, $aa, $kmer);<br />&nbsp;&nbsp; &nbsp;if ($myWin eq $myStr) {<br />&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;#print "$myWin eq $myStr\n";<br />&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;$count++;<br />&nbsp;&nbsp; &nbsp;}<br />}<br />return $count;<br />}</p></div>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38199/pacasus-correction-of-palindromes-in-long-reads-from-pacbio-and-nanopore</guid>
	<pubDate>Mon, 12 Nov 2018 05:26:48 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38199/pacasus-correction-of-palindromes-in-long-reads-from-pacbio-and-nanopore</link>
	<title><![CDATA[Pacasus: Correction of palindromes in long reads from PacBio and Nanopore]]></title>
	<description><![CDATA[<p>Tool for detecting and cleaning PacBio / Nanopore long reads after whole genome amplification. Check the poster from the Revolutionizing Next-Generation Sequencing (2nd edition) conference in the source folder:&nbsp;<a href="https://github.com/swarris/Pacasus/blob/master/vib2017.pdf">https://github.com/swarris/Pacasus/blob/master/vib2017.pdf</a>.</p>
<p>The preprint version can be found at&nbsp;<a href="http://www.biorxiv.org/content/early/2017/08/09/173872">http://www.biorxiv.org/content/early/2017/08/09/173872</a></p>
<p>It uses the pyPaSWAS framework for sequence alignment (<a href="https://github.com/swarris/pyPaSWAS">https://github.com/swarris/pyPaSWAS</a>)</p><p>Address of the bookmark: <a href="https://github.com/swarris/Pacasus" rel="nofollow">https://github.com/swarris/Pacasus</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/22572/clump-finding-problem-solved-with-perl</guid>
	<pubDate>Wed, 10 Jun 2015 00:17:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/22572/clump-finding-problem-solved-with-perl</link>
	<title><![CDATA[Clump Finding Problem Solved with Perl]]></title>
	<description><![CDATA[<p>The question is at http://rosalind.info/problems/1d/</p><p>The script has been moved to&nbsp;http://bioinformaticsonline.com/snippets/view/34633/clump-finding-problem-solved-with-perl</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>