<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/31564?offset=960</link>
	<atom:link href="https://bioinformaticsonline.com/related/31564?offset=960" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But if trimming discards one read of a pair (for example, because it falls below the minimum length), its mate is orphaned and the pairing information is lost. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBDuk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/13842/swabs-to-genomes-a-comprehensive-workflow</guid>
	<pubDate>Sun, 10 Aug 2014 03:01:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/13842/swabs-to-genomes-a-comprehensive-workflow</link>
	<title><![CDATA[Swabs to Genomes: A Comprehensive Workflow]]></title>
	<description><![CDATA[<p>The sequencing, assembly, and basic analysis of microbial genomes, once a painstaking and expensive undertaking, has become almost trivial for research labs with access to standard molecular biology and computational tools. However, there are a wide variety of options available for DNA library preparation and sequencing, and inexperience with bioinformatics can pose a significant barrier to entry for many who may be interested in microbial genomics. The objective of the present study was to design, test, troubleshoot, and publish a simple, comprehensive workflow from the collection of an environmental sample (a swab) to a published microbial genome; empowering even a lab or classroom with limited resources and bioinformatics experience to perform it.</p><p>Address of the bookmark: <a href="https://peerj.com/preprints/453.pdf" rel="nofollow">https://peerj.com/preprints/453.pdf</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/37411/my-commonly-used-commands-in-bioinformatics</guid>
	<pubDate>Thu, 26 Jul 2018 04:58:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/37411/my-commonly-used-commands-in-bioinformatics</link>
	<title><![CDATA[My commonly used commands in Bioinformatics]]></title>
	<description><![CDATA[<p>FYI, I've found it useful to use MUMmer to extract the specific changes that Racon makes, so I can evaluate them individually:</p><pre><code>minimap -t 24 assembly.fasta long_reads.fastq.gz | racon -t 24 long_reads.fastq.gz - assembly.fasta racon_assembly.fasta
nucmer -p nucmer assembly.fasta racon_assembly.fasta
show-snps -C -T -r nucmer.delta
</code></pre><p>This reports Racon's changes in a table. You can exclude indels with the&nbsp;<code>-I</code>&nbsp;option in&nbsp;<code>show-snps</code>.&nbsp;</p><p>This process (Racon -&gt; MUMmer -&gt; SNP table) solves the problem I originally raised in this issue. So as far as I'm concerned, you can close this issue (or keep it open if you still want to implement some kind of variant table).</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/14050/assistant-professor-in-bioinformatics-at-indian-institute-of-technology-delhi</guid>
  <pubDate>Fri, 15 Aug 2014 06:16:06 -0500</pubDate>
  <link></link>
  <title><![CDATA[Assistant Professor in Bioinformatics at Indian Institute of Technology Delhi]]></title>
  <description><![CDATA[
<p>Indian Institute of Technology Delhi, Hauz Khas, New Delhi – 110016</p>

<p>ROLLING ADVERTISEMENT NO. 01/2014(E-1)<br />ADVERTISEMENT FOR THE POSITIONS OF ASSISTANT PROFESSOR CANDIDATES CAN APPLY ANY TIME DURING THE YEAR.</p>

<p>IIT Delhi invites applications from qualified Indian Nationals, Persons of Indian Origin (PIOs) and Overseas Citizens of India (OCIs) for the following positions in the various Departments/Centres/Schools (in the fields mentioned along with them):<br />Post: Assistant Professor and Assistant Professor (on Contract)<br />Pay Band: Rs.15600-39100 (PB-3) (Minimum pay of Rs.30000/-) + AGP Rs.8000/-</p>

<p>The following norms will be followed for fixing the basic pay + AGP for Assistant Professors appointed on contract with a Ph.D. but with experience of 3 years or less:<br />Type Qualification &amp; Experience on the date of joining<br />Assistant Professor (Contract) PB3 (Rs. 15,600-39,100).</p>

<p>MINIMUM QUALIFICATIONS AND EXPERIENCE:<br />Ph.D. with First class at the preceding degree or equivalent in the appropriate branch, with a very good academic record throughout. A minimum of three years of industrial/research/teaching experience, excluding the experience gained while pursuing the Ph.D. Candidates should preferably be below 35 years of age for men and 38 years for women (to be relaxed by 5 years for persons with physical disability and SC/ST candidates, and by 3 years for OBC-NCL candidates).</p>

<p>Qualified persons include:<br />(a) Indian Nationals,<br />(b) Foreign Nationals who are “Persons of Indian Origin” (PIO) or Overseas Citizens of India (OCI), in whose case, if selected, permission will be sought from the Govt. of India before he/she can join IIT Delhi, or<br />(c) Other Foreign Nationals, in whose case, if selected, appointment will be on a contract basis for up to 5 (five) years subject to permission from the Govt. of India before he/she can join IIT Delhi.<br />(d) The Institute specifically encourages applicants from the SC/ST/OBC categories, as well as persons with disability, to apply for these positions.</p>

<p>AMAR NATH &amp; SHASHI KHOSLA SCHOOL OF INFORMATION TECHNOLOGY:<br />Computational Neuroscience, Medical Applications of Information Technologies, Computational &amp; Systems Biology, Machine to Machine (M2M) Technologies, Embedded Systems &amp; Sensors, Computer Security.<br />KUSUMA SCHOOL OF BIOLOGICAL SCIENCES:<br />In-silico Biology Applications, Systems Biology, Infection Biology, Neurodegeneration. </p>

<p>More at http://www.iitd.ac.in/sites/default/files/jobs/faculty/spl-areas-rolling-advt.pdf</p>

<p>http://www.iitd.ac.in/content/faculty-positions</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38063/referee-genome-assembly-quality-scores</guid>
	<pubDate>Sun, 04 Nov 2018 16:44:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38063/referee-genome-assembly-quality-scores</link>
	<title><![CDATA[Referee: Genome assembly quality scores]]></title>
	<description><![CDATA[<p>Modern genome sequencing technologies provide a succinct measure of quality at each position in every read; however, all of this information is lost in the assembly process. Referee summarizes the quality information from the reads that map to a site in an assembled genome to calculate a quality score for each position in the genome assembly.</p>
<p>We accomplish this by first calculating genotype likelihoods for every site. For a given site in a diploid genome, there are 10 possible genotypes (AA, AC, AG, AT, CC, CG, CT, GG, GT, TT). Referee takes as input the genotype likelihoods calculated for all 10 genotypes given the called reference base at each position.</p>
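<p>As a rough illustration of the genotype-likelihood step described above, the following Python sketch enumerates the 10 unordered diploid genotypes and scores each one against a pileup of base calls. It assumes a common textbook error model (each read is drawn from either allele with probability 1/2, and an erroneous call is equally likely to be any of the other three bases); this is not necessarily the model Referee uses, and the names <code>log_likelihoods</code>, <code>phred_to_error</code>, and the example pileup are hypothetical.</p>
<pre><code>import math
from itertools import combinations_with_replacement

BASES = "ACGT"
# The 10 unordered diploid genotypes: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT
GENOTYPES = ["".join(pair) for pair in combinations_with_replacement(BASES, 2)]


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def log_likelihoods(pileup):
    """Return {genotype: log10 likelihood} for a pileup of (base, phred) pairs.

    Assumed model (illustrative only): a read samples either allele with
    probability 1/2; a miscalled base is any of the 3 other bases equally.
    """
    result = {}
    for genotype in GENOTYPES:
        total = 0.0
        for base, qual in pileup:
            err = phred_to_error(qual)
            # Probability of observing `base` given each allele of the genotype.
            per_allele = [(1 - err) if base == allele else err / 3
                          for allele in genotype]
            total += math.log10(sum(per_allele) / 2)
        result[genotype] = total
    return result


# Made-up reads covering a single site: (called base, Phred quality)
pileup = [("A", 30), ("A", 35), ("C", 20), ("A", 40)]
scores = log_likelihoods(pileup)
best = max(scores, key=scores.get)
print(len(GENOTYPES), "genotypes; most likely:", best)
</code></pre>
<p>A per-site quality score could then be derived by comparing the likelihoods of genotypes that contain the called reference base with those that do not; the exact scoring rule is described in the Referee documentation.</p>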
<h3>Referee is a program to calculate a quality score for every position in a genome assembly. This allows for easy filtering of low quality sites for any downstream analysis.</h3>
<p>https://github.com/gwct/referee</p><p>Address of the bookmark: <a href="https://gwct.github.io/referee/#" rel="nofollow">https://gwct.github.io/referee/#</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/14272/lecturersenior-lecturer-level-bc-in-bioinformatics</guid>
  <pubDate>Fri, 22 Aug 2014 12:45:52 -0500</pubDate>
  <link></link>
  <title><![CDATA[Lecturer/Senior Lecturer (Level B/C) in Bioinformatics]]></title>
  <description><![CDATA[
<p>Lecturer/Senior Lecturer (Level B/C) in Synthetic Biology, Research Fellow (Level B) in Synthetic Biology &amp; Lecturer/Senior Lecturer (Level B/C) in Bioinformatics</p>

<p>Apply now Job no: 494553<br />Work type: Continuing full time<br />Vacancy type: External Vacancy, Internal Vacancy<br />Categories: Academic - Teaching and Research</p>

<p>The Faculty of Science is launching a new and innovative branch of biological science at Macquarie University – Synthetic Biology. Synthetic biology combines engineering principles with molecular biological approaches to design and construct biological devices and systems. Recent highlights in this field include the design and synthesis of a functional bacterial genome and a yeast chromosome, and generation of synthetic bacterial cells. The rational synthesis of "designer" organisms yields important insights into how organisms work and has the potential to revolutionise biotechnological applications in areas such as bioenergy and biomanufacturing.</p>

<p>Find more at http://jobs.mq.edu.au/cw/en/job/494553/lecturersenior-lecturer-level-bc-in-synthetic-biology-research-fellow-level-b-in-synthetic-biology-lecturersenior-lecturer-level-bc-in-bioinformatics</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39726/jackalope-a-swift-versatile-phylogenomic-and-high-throughput-sequencing-simulator</guid>
	<pubDate>Fri, 26 Jul 2019 00:58:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39726/jackalope-a-swift-versatile-phylogenomic-and-high-throughput-sequencing-simulator</link>
	<title><![CDATA[jackalope: A swift, versatile phylogenomic and high-throughput sequencing simulator]]></title>
	<description><![CDATA[<p><code>jackalope</code> simply and efficiently simulates (i) variants from reference genomes and (ii) reads from both Illumina and Pacific Biosciences (PacBio) platforms. It can either read reference genomes from FASTA files or simulate new ones. Genomic variants can be simulated using summary statistics, phylogenies, Variant Call Format (VCF) files, and coalescent simulations, the last of which can include selection, recombination, and demographic fluctuations. <code>jackalope</code> can simulate single, paired-end, or mate-pair Illumina reads, as well as PacBio reads. These simulations include sequencing errors, mapping qualities, multiplexing, and optical/PCR duplicates. All outputs can be written to standard file formats.</p>
<p><span>A swift, versatile phylogenomic and high-throughput sequencing simulator </span> <span><a href="https://jackalope.lucasnell.com">https://jackalope.lucasnell.com</a></span></p><p>Address of the bookmark: <a href="https://github.com/lucasnell/jackalope" rel="nofollow">https://github.com/lucasnell/jackalope</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/14758/phd-opportunity-at-universite-de-liege-belgium</guid>
  <pubDate>Mon, 01 Sep 2014 17:16:22 -0500</pubDate>
  <link></link>
  <title><![CDATA[PhD opportunity at Université de Liège - Belgium]]></title>
  <description><![CDATA[
<p>The Bioinformatics and Systems Biology Unit of Université de Liège (Belgium) is looking for a highly motivated Master's student with programming skills for a PhD thesis project (4 years, fully funded) with the goal of designing computational tools that use literature, genomic and structural data to infer regulatory and metabolic networks.</p>

<p>Applicants are invited to send their resume and a recommendation letter to Prof. Patrick Meyer (more details at www.biosys.ulg.ac.be).</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40699/kevler-reference-free-variant-discovery-in-large-eukaryotic-genomes</guid>
	<pubDate>Tue, 28 Jan 2020 03:21:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40699/kevler-reference-free-variant-discovery-in-large-eukaryotic-genomes</link>
	<title><![CDATA[kevlar: Reference-free variant discovery in large eukaryotic genomes]]></title>
	<description><![CDATA[<p><span>Welcome to&nbsp;</span><span>kevlar</span><span>, software for predicting&nbsp;</span><em>de novo</em><span>&nbsp;genetic variants without mapping reads to a reference genome! kevlar's&nbsp;</span><em>k</em><span>-mer abundance based method calls single nucleotide variants (SNVs), multinucleotide variants (MNVs), insertion/deletion variants (indels), and structural variants (SVs) simultaneously with a single simple model.&nbsp;</span></p>
<p><span>More at&nbsp;<a href="https://kevlar.readthedocs.io/en/latest/">https://kevlar.readthedocs.io/en/latest/</a></span></p>
<p><span><a href="https://www.cell.com/iscience/pdf/S2589-0042(19)30259-7.pdf">https://www.cell.com/iscience/pdf/S2589-0042(19)30259-7.pdf</a></span></p><p>Address of the bookmark: <a href="https://github.com/kevlar-dev/kevlar" rel="nofollow">https://github.com/kevlar-dev/kevlar</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/14905/internship-in-computational-biology</guid>
  <pubDate>Thu, 04 Sep 2014 04:19:40 -0500</pubDate>
  <link></link>
  <title><![CDATA[Internship in Computational Biology]]></title>
  <description><![CDATA[
<p>We are looking for a motivated and autonomous intern to study gene expression in hybrid organisms. The student will work on natural hybrids of two or three different species of fungal endosymbionts of grasses. The purpose of this project is to build software that identifies the genomic origin of expressed genes. To do that, the intern will analyze expression data (from RNA-seq) to find SNPs in the sequenced mRNAs that reveal which parental genome each expressed gene comes from. The data will be saved in a database using the standard BioSQL schema.</p>

<p>This job will allow the intern to become more familiar with new biological and bioinformatics tools like next generation sequencing, RNA-Seq data analysis and comparative genomics.</p>

<p>To apply for this position, send the following documents (in PDF format) to Dr Pierre-Yves Dupont (email p.y.dupont@massey.ac.nz):</p>

<p>1. A short cover letter.<br />2. A curriculum vitae, with transcript details.<br />3. The names and contact details of two referees willing to provide a confidential letter of recommendation upon request.</p>

<p>Informal enquiries are welcome. Formal applications are due by Sunday 2nd December 2012.</p>

<p>Requirements:<br />This position requires a good understanding of genetic problems, a good command of at least one scripting language (Perl, Python...), and a basic knowledge of MySQL or any other relational database management system. Knowledge of biological programming libraries (BioPython, BioPerl, BioRuby...), Java, C++ or any compiled language is an asset but not required. An undergraduate or Master's degree is required.</p>

<p>Contact Information:<br />Dr. Pierre-Yves Dupont<br />Institute of Molecular BioSciences<br />Massey University<br />Private Bag 11 222<br />Palmerston North 4442<br />NEW ZEALAND</p>

<p>http://massey.genomicus.com/<br />p.y.dupont@massey.ac.nz</p>

<p>Information about the Institute of Molecular BioSciences (http://imbs.massey.ac.nz/) and the Computational Biology Research Group (http://massey.genomicus.com/) is available online. For more information about the position, you can contact Dr Pierre-Yves Dupont (email p.y.dupont@massey.ac.nz).</p>
]]></description>
</item>

</channel>
</rss>