<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37602?offset=170</link>
	<atom:link href="https://bioinformaticsonline.com/related/37602?offset=170" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40212/kalign-fast-multiple-sequence-alignment-program-for-biological-sequences</guid>
	<pubDate>Fri, 01 Nov 2019 00:20:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40212/kalign-fast-multiple-sequence-alignment-program-for-biological-sequences</link>
	<title><![CDATA[Kalign: fast multiple sequence alignment program for biological sequences.]]></title>
	<description><![CDATA[<p><span>Kalign is a fast multiple sequence alignment program for biological sequences.</span></p>
<p>Align sequences and output the alignment in MSF format:</p>
<pre><code>kalign -i BB11001.tfa -f msf -o out.msf
</code></pre>
<p>Align sequences and output the alignment in Clustal format:</p>
<pre><code>kalign -i BB11001.tfa -f clu -o out.clu
</code></pre>
<p>Re-align sequences in an existing alignment:</p>
<pre><code>kalign -i BB11001.msf -o out.afa
</code></pre>
<p>Reformat existing alignment:</p>
<pre><code>kalign -i BB11001.msf -r afa -o out.afa</code></pre><p>Address of the bookmark: <a href="https://github.com/TimoLassmann/kalign" rel="nofollow">https://github.com/TimoLassmann/kalign</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41959/rna-bloom-a-fast-and-memory-efficient-de-novo-transcript-sequence-assembler</guid>
	<pubDate>Thu, 09 Jul 2020 03:13:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41959/rna-bloom-a-fast-and-memory-efficient-de-novo-transcript-sequence-assembler</link>
	<title><![CDATA[RNA-Bloom: a fast and memory-efficient de novo transcript sequence assembler]]></title>
	<description><![CDATA[<p><strong>RNA-Bloom</strong> is a fast and memory-efficient <em>de novo</em> transcript sequence assembler. It is designed for the following sequencing data types:</p>
<ul>
<li>single-end/paired-end bulk RNA-seq (strand-specific/agnostic)</li>
<li>paired-end single-cell RNA-seq (strand-specific/agnostic)</li>
<li>nanopore RNA-seq (PCR cDNA/direct cDNA/direct RNA)</li>
</ul>
<p>Written by Ka Ming Nip</p><p>Address of the bookmark: <a href="https://github.com/bcgsc/RNA-Bloom" rel="nofollow">https://github.com/bcgsc/RNA-Bloom</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43639/fastv-detect-virus</guid>
	<pubDate>Sat, 11 Dec 2021 08:04:10 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43639/fastv-detect-virus</link>
	<title><![CDATA[fastv - detect virus]]></title>
	<description><![CDATA[<p><span>fastv is an ultra-fast tool for identifying SARS-CoV-2 and other microbes in sequencing data. It detects microbial sequences in FASTQ data, generates JSON reports, and visualizes the results in HTML reports. The tool can be used to detect viral infectious diseases such as COVID-19, and it supports both short reads (Illumina, BGI, etc.) and long reads (ONT, PacBio, etc.).</span></p><p>Address of the bookmark: <a href="https://github.com/OpenGene/fastv" rel="nofollow">https://github.com/OpenGene/fastv</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44641/heliano-a-fast-and-accurate-tool-for-detection-of-helitron-like-elements</guid>
	<pubDate>Tue, 13 Aug 2024 07:16:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44641/heliano-a-fast-and-accurate-tool-for-detection-of-helitron-like-elements</link>
	<title><![CDATA[HELIANO: A fast and accurate tool for detection of Helitron-like elements]]></title>
	<description><![CDATA[<p><span>Helitron-like elements (HLE1 and HLE2) are DNA transposons. They have been found in diverse species and seem to play significant roles in the evolution of host genomes. Although known for over twenty years, Helitron sequences are still challenging to identify. Here, we propose HELIANO (Helitron-like elements annotator) as an efficient solution for detecting Helitron-like elements.</span></p>
<p>https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae679/7730539?login=true</p><p>Address of the bookmark: <a href="https://github.com/Zhenlisme/heliano/" rel="nofollow">https://github.com/Zhenlisme/heliano/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26972/understanding-fastqc-output</guid>
	<pubDate>Fri, 15 Apr 2016 05:47:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26972/understanding-fastqc-output</link>
	<title><![CDATA[Understanding Fastqc Output]]></title>
	<description><![CDATA[<p>A guide to understanding the following FastQC tables and graphs:</p>
<ol>
<li>Duplication level</li>
<li>K-mer profile</li>
<li>Per base GC content</li>
<li>Per base N content</li>
<li>Per base quality</li>
<li>Per base sequence content</li>
<li>Per sequence GC content</li>
<li>Per sequence quality</li>
<li>Sequence length distribution</li>
</ol>
<p>More at http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/</p><p>Address of the bookmark: <a href="http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/" rel="nofollow">http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBDuk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36736/checkmassessing-the-quality-of-microbial-genomes-recovered-from-isolates-single-cells-and-metagenomes</guid>
	<pubDate>Wed, 23 May 2018 04:39:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36736/checkmassessing-the-quality-of-microbial-genomes-recovered-from-isolates-single-cells-and-metagenomes</link>
	<title><![CDATA[CheckM: Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes]]></title>
	<description><![CDATA[<p><span>CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage. Assessment of genome quality can also be examined using plots depicting key genomic characteristics (e.g., GC, coding density) which highlight sequences outside the expected distributions of a typical genome. CheckM also provides tools for identifying genome bins that are likely candidates for merging based on marker set compatibility, similarity in genomic characteristics, and proximity within a reference genome tree.</span></p><p>Address of the bookmark: <a href="http://ecogenomics.github.io/CheckM/" rel="nofollow">http://ecogenomics.github.io/CheckM/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41669/filtlong-quality-filtering-tool-for-long-reads</guid>
	<pubDate>Wed, 13 May 2020 10:23:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41669/filtlong-quality-filtering-tool-for-long-reads</link>
	<title><![CDATA[Filtlong: quality filtering tool for long reads]]></title>
	<description><![CDATA[<p>Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.</p>
<p>Filtlong builds into a stand-alone executable:</p>
<pre><code>git clone https://github.com/rrwick/Filtlong.git
cd Filtlong
make -j
bin/filtlong -h
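
# Example filtering run: a sketch based on the options described in the Filtlong README.
# input.fastq.gz and output.fastq.gz are placeholder file names, and the thresholds
# are illustrative only. This discards reads shorter than 1 kbp, then keeps the
# best 90% of the remaining reads (measured in bases), judged on length and identity.
bin/filtlong --min_length 1000 --keep_percent 90 input.fastq.gz | gzip > output.fastq.gz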
</code></pre><p>Address of the bookmark: <a href="https://github.com/rrwick/Filtlong" rel="nofollow">https://github.com/rrwick/Filtlong</a></p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/933/world-of-omics</guid>
	<pubDate>Tue, 16 Jul 2013 17:11:48 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/933/world-of-omics</link>
	<title><![CDATA[World of Omics]]></title>
	<description><![CDATA[<p>How many variants of "omics" techniques are presently in use?</p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/4100/should-you-get-sequenced-not-all-bad-genes-predict-disease</guid>
	<pubDate>Thu, 29 Aug 2013 15:10:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/4100/should-you-get-sequenced-not-all-bad-genes-predict-disease</link>
	<title><![CDATA[Should you get sequenced? Not all bad genes predict disease]]></title>
	<description><![CDATA[<p><span>&ldquo;What we really don&rsquo;t know yet is whether the predictive aspects of the genome are going to turn out to be beneficial or potentially harmful&rdquo;</span></p>
<p><span><span>&ldquo;As we roll out genomic medicine we are fighting against this society-wide misconception that having the bad gene means you&rsquo;re going to get the disease. That&rsquo;s only true in a very few cases.&rdquo;</span></span></p>
<p><span><span><strong>Source</strong>: Today Health</span></span></p><p>Address of the bookmark: <a href="http://www.today.com/health/should-you-get-sequenced-not-all-bad-genes-predict-disease-8C11017154" rel="nofollow">http://www.today.com/health/should-you-get-sequenced-not-all-bad-genes-predict-disease-8C11017154</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>