<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/42946?offset=110</link>
	<atom:link href="https://bioinformaticsonline.com/related/42946?offset=110" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But when trimming leaves one read of a pair too short to keep, discarding it orphans its mate and the pairing information is lost. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBDuk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38012/cosine-non-seeding-method-for-mapping-long-noisy-sequences</guid>
	<pubDate>Fri, 26 Oct 2018 00:41:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38012/cosine-non-seeding-method-for-mapping-long-noisy-sequences</link>
	<title><![CDATA[COSINE: non-seeding method for mapping long noisy sequences]]></title>
	<description><![CDATA[<p><span>Third-generation sequencing (TGS) technologies are highly promising, but their long, noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors.</span></p><p>Address of the bookmark: <a href="https://github.com/SUwonglab/COSINE" rel="nofollow">https://github.com/SUwonglab/COSINE</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35057/ectools-long-read-correction-and-other-correction-tools</guid>
	<pubDate>Fri, 05 Jan 2018 04:02:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35057/ectools-long-read-correction-and-other-correction-tools</link>
	<title><![CDATA[ECTOOLS: Long Read Correction and other Correction tools]]></title>
	<description><![CDATA[<p>Long Read Correction and other Correction tools</p>
<p>This package is a loose collection of scripts. Instructions for running the correction routine and descriptions of the other scripts are given in the repository README.</p>
<p>Contact: gurtowsk@cshl.edu</p>
<p>In short, the correction algorithm takes as input the unitigs from a short-read assembly and uses them to correct long-read data. More background information on the algorithm can be found at:<br>http://schatzlab.cshl.edu/presentations/2013-06-18.PBUserMeeting.pdf</p><p>Address of the bookmark: <a href="https://github.com/jgurtowski/ectools" rel="nofollow">https://github.com/jgurtowski/ectools</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39640/flas-fast-and-high-throughput-algorithm-for-pacbio-long-read-self-correction</guid>
	<pubDate>Sat, 22 Jun 2019 12:16:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39640/flas-fast-and-high-throughput-algorithm-for-pacbio-long-read-self-correction</link>
	<title><![CDATA[FLAS: fast and high throughput algorithm for PacBio long read self-correction.]]></title>
	<description><![CDATA[<p><span>FLAS is a wrapper algorithm for MECAT that achieves high-throughput long-read self-correction while keeping MECAT's fast speed. FLAS finds additional alignments from MECAT prealigned long reads to improve the correction throughput, and removes misalignments to improve accuracy.</span></p><p>Address of the bookmark: <a href="https://github.com/baoe/flas" rel="nofollow">https://github.com/baoe/flas</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26303/maker</guid>
	<pubDate>Sun, 07 Feb 2016 15:59:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26303/maker</link>
	<title><![CDATA[MAKER]]></title>
	<description><![CDATA[<p>MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab initio gene predictions, and automatically synthesizes these data into gene annotations with evidence-based quality values.</p>
<p>More at http://www.yandell-lab.org/software/maker.html</p><p>Address of the bookmark: <a href="http://www.yandell-lab.org/software/maker.html" rel="nofollow">http://www.yandell-lab.org/software/maker.html</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27432/gkno</guid>
	<pubDate>Fri, 20 May 2016 18:56:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27432/gkno</link>
	<title><![CDATA[GKNO]]></title>
	<description><![CDATA[<p><span>gkno opens the world of complex bioinformatic analysis to people of all levels of computational expertise. This site contains documentation, tutorials and information on all the tools that comprise gkno.</span></p>
<p><span>http://gkno.me/how-to/install.html</span></p>
<p><span>http://gkno.me/software.html</span></p><p>Address of the bookmark: <a href="http://gkno.me/" rel="nofollow">http://gkno.me/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32730/ncbi-prokaryotic-genome-annotation-pipeline</guid>
	<pubDate>Tue, 16 May 2017 08:56:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32730/ncbi-prokaryotic-genome-annotation-pipeline</link>
	<title><![CDATA[NCBI Prokaryotic Genome Annotation Pipeline]]></title>
	<description><![CDATA[<p>NCBI Prokaryotic Genome Annotation Pipeline is designed to annotate bacterial and archaeal genomes (chromosomes and plasmids).</p>
<p>Genome annotation is a multi-level process that includes prediction of protein-coding genes, as well as other functional genome units such as structural RNAs, tRNAs, small RNAs, pseudogenes, control regions, direct and inverted repeats, insertion sequences, transposons and other mobile elements.</p>
<p>NCBI has developed an automatic prokaryotic genome annotation pipeline that combines&nbsp;<em>ab initio</em>&nbsp;gene prediction algorithms with homology-based methods. The first version of the NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP;&nbsp;<a href="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;dopt=Abstract&amp;list_uids=18416670">see PubMed article</a>), developed in 2005, has been replaced with an upgraded version that is capable of processing a larger data volume. You can find a more detailed description of the new version of&nbsp;the pipeline in the&nbsp;<a href="https://www.ncbi.nlm.nih.gov/books/NBK174280/">NCBI Handbook chapter</a>. NCBI's annotation pipeline depends on several internal databases and is not currently available for download or use outside of the NCBI environment.</p>
<p>https://www.ncbi.nlm.nih.gov/genome/annotation_prok/</p><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/genome/annotation_prok/" rel="nofollow">https://www.ncbi.nlm.nih.gov/genome/annotation_prok/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38758/roary-the-pan-genome-pipeline</guid>
	<pubDate>Tue, 22 Jan 2019 05:52:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38758/roary-the-pan-genome-pipeline</link>
	<title><![CDATA[Roary: the Pan Genome Pipeline]]></title>
	<description><![CDATA[<p><span>Roary is a high-speed, standalone pan-genome pipeline that takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan-genome. Using a standard desktop PC, it can analyse datasets with thousands of samples, something which is computationally infeasible with existing methods, without compromising the quality of the results. 128 samples can be analysed in under 1 hour using 1 GB of RAM and a single processor. To perform this analysis using existing methods would take weeks and hundreds of GB of RAM. Roary is not intended for metagenomics or for comparing extremely diverse sets of genomes.</span></p><p>Address of the bookmark: <a href="https://sanger-pathogens.github.io/Roary/" rel="nofollow">https://sanger-pathogens.github.io/Roary/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44539/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</guid>
	<pubDate>Wed, 15 May 2024 14:36:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44539/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</link>
	<title><![CDATA[Bactopia: a Flexible Pipeline for Complete Analysis of Bacterial Genomes]]></title>
	<description><![CDATA[<p dir="auto">Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!</p>
<p dir="auto">Bactopia can be split into two main parts:&nbsp;<a href="https://bactopia.github.io/latest/beginners-guide/">Bactopia Analysis Pipeline</a>, and&nbsp;<a href="https://bactopia.github.io/latest/bactopia-tools/">Bactopia Tools</a>.</p>
<p dir="auto">Bactopia Analysis Pipeline is the main&nbsp;<em>per-isolate</em>&nbsp;workflow in Bactopia. Built with&nbsp;<a href="https://www.nextflow.io/">Nextflow</a>, input FASTQs (local or available from SRA/ENA) are put through numerous analyses including: quality control, assembly, annotation, minmer sketch queries, sequence typing, and more.</p>
<p dir="auto"><a href="https://github.com/bactopia/bactopia/blob/master/data/bactopia-workflow.png" target="_blank"><img src="https://github.com/bactopia/bactopia/raw/master/data/bactopia-workflow.png" alt="Bactopia Overview" style="border: 0px;"></a></p>
<p dir="auto">Bactopia Tools are a set of independent workflows fo</p><p>Address of the bookmark: <a href="https://github.com/bactopia/bactopia" rel="nofollow">https://github.com/bactopia/bactopia</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/44741/bioinformatician-in-pipeline-development</guid>
  <pubDate>Tue, 17 Dec 2024 23:43:54 -0600</pubDate>
  <link>https://bioinformaticsonline.com/opportunity/view/44741/bioinformatician-in-pipeline-development</link>
  <title><![CDATA[Bioinformatician in pipeline development]]></title>
  <description><![CDATA[
<p>Are you interested in working with pipeline development in bioinformatics, with the support of competent and friendly colleagues in an international environment? Are you looking for an employer that invests in sustainable employeeship and offers safe, favourable working conditions? We welcome you to apply for a position as Bioinformatician in pipeline development at Uppsala University.</p>

<p>National Bioinformatics Infrastructure Sweden (NBIS) (nbis.se) plays an important role in advancing life science research in Sweden by providing expert support and developing cutting-edge bioinformatics infrastructure. Operating as a truly national initiative, NBIS employs more than 120 bioinformaticians, system developers, and data stewards across multiple locations in Sweden. It serves as the bioinformatics platform at SciLifeLab, a national resource that facilitates research in molecular biosciences by offering access to state-of-the-art technologies and technical expertise. With strong ties to data-producing facilities and ongoing collaborations with leading research groups, NBIS is ideally positioned to support world-class bioinformatics analyses. Furthermore, NBIS is the Swedish node in ELIXIR, the European infrastructure for biological information.</p>

<p>NBIS is seeking an experienced bioinformatician to support both Swedish and international projects. As part of our dynamic team, you will work closely with researchers to process large-scale biological data and contribute to advancing our data analysis infrastructure. Strong problem-solving skills, attention to detail, and the ability to troubleshoot complex bioinformatics pipelines are essential for success in this role. Flexibility and a willingness to learn are also important, as NBIS continually adapts to meet the evolving needs of the Swedish research community.</p>

<p>More at https://www.uu.se/en/about-uu/join-us/jobs-and-vacancies/job-details?query=778701</p>
]]></description>
</item>

</channel>
</rss>