<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44476?offset=130</link>
	<atom:link href="https://bioinformaticsonline.com/related/44476?offset=130" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43804/agora-algorithm-for-gene-order-reconstruction-in-ancestors</guid>
	<pubDate>Mon, 28 Feb 2022 23:26:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43804/agora-algorithm-for-gene-order-reconstruction-in-ancestors</link>
	<title><![CDATA[AGORA: Algorithm for Gene Order Reconstruction in Ancestors]]></title>
	<description><![CDATA[<p dir="auto">AGORA stands for &ldquo;Algorithm for Gene Order Reconstruction in Ancestors&rdquo; and was developed by Matthieu Muffato in the DYOGEN Laboratory at the &Eacute;cole normale sup&eacute;rieure in Paris in 2008.</p>
<div>
<pre><code>    // | |     //   ) )  //   ) ) //   ) )  // | |
   //__| |    //        //   / / //___/ /  //__| |
  / ___  |   //  ____  //   / / / ___ (   / ___  |
 //    | |  //    / / //   / / //   | |  //    | |
//     | | ((____/ / ((___/ / //    | | //     | |
</code></pre>
</div>
<p dir="auto">AGORA is used to generate ancestral genomes for the&nbsp;<a href="https://www.genomicus.biologie.ens.fr/genomicus">Genomicus</a>&nbsp;online server for gene order comparison, and has been in constant use in the group since.</p><p>Address of the bookmark: <a href="https://github.com/DyogenIBENS/Agora" rel="nofollow">https://github.com/DyogenIBENS/Agora</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/33306/ancestral-sequence-reconstruction-asr-or-ancestral-genesequence-reconstructionresurrection-tools-to-study-molecular-evolution</guid>
	<pubDate>Tue, 30 May 2017 04:20:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/33306/ancestral-sequence-reconstruction-asr-or-ancestral-genesequence-reconstructionresurrection-tools-to-study-molecular-evolution</link>
	<title><![CDATA[Ancestral sequence reconstruction (ASR) or ancestral gene/sequence reconstruction/resurrection tools to study molecular evolution]]></title>
	<description><![CDATA[<p><span><strong>Ancestral sequence reconstruction</strong><span>&nbsp;(</span><strong>ASR</strong><span>) &ndash; also known as&nbsp;</span><strong>ancestral gene</strong><span>/</span><strong>sequence reconstruction</strong><span>/</span><strong>resurrection</strong><span>&nbsp;&ndash; is a technique used in the study of&nbsp;</span>molecular evolution<span>. The method consists of the synthesis of an ancestral&nbsp;</span>gene<span>&nbsp;and expression of the corresponding ancestral&nbsp;</span>protein<span>.&nbsp;</span><sup id="cite_ref-thornton_1-0"><a href="https://en.wikipedia.org/wiki/Ancestral_sequence_reconstruction#cite_note-thornton-1"></a></sup><span>The idea of protein 'resurrection' was suggested in 1963 by Pauling and Zuckerkandl.</span><sup id="cite_ref-2"><a href="https://en.wikipedia.org/wiki/Ancestral_sequence_reconstruction#cite_note-2"></a></sup><span>&nbsp;Some early efforts were made in the 1980s and 1990s, led by the laboratory of&nbsp;</span>Steven A. 
Benner<span>, showing the potential of this technique &ndash; one that only started to be fulfilled in the post-genomic era.</span><sup id="cite_ref-3"><a href="https://en.wikipedia.org/wiki/Ancestral_sequence_reconstruction#cite_note-3"></a></sup><span>&nbsp;Thanks to improved algorithms and better sequencing and synthesis techniques, the method was developed further in the early 2000s to allow the resurrection of a greater variety of genes, including much more ancient ones.</span><sup id="cite_ref-4"><a href="https://en.wikipedia.org/wiki/Ancestral_sequence_reconstruction#cite_note-4"></a></sup><span>&nbsp;Over the last decade, ancestral protein resurrection has developed as a strategy to reveal the mechanisms and dynamics of protein evolution.&nbsp;</span></span></p><p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/ASR_phylogeny.png/510px-ASR_phylogeny.png" alt="image" width="610" height="435" style="border: 0px; border: 0px;"></p><p><span>The following is a list of&nbsp;</span><strong style="font-size: 12.8px;">ancestral sequence reconstruction</strong><span>&nbsp;(</span><strong style="font-size: 12.8px;">ASR</strong><span>) tools:&nbsp;</span></p><p><a href="http://www.bx.psu.edu/miller_lab/car/" target="_blank" title="To inferCars official website"><span>inferCars</span></a></p><p><span><span><span><span><span>Reconstructs contiguous regions of an ancestral genome. Given information about adjacencies between conserved segments in each modern species, the goal is to infer segment order in the ancestral genome. To get a clean and precise statement of the problem, it is formalized using graph theory. The algorithm identifies a most parsimonious scenario for the history of each individual adjacency, although the whole-genome prediction is not guaranteed to optimize traditional measures like the number of breakpoints. 
Weights on the graph edges model the reliability of each adjacency.</span></span></span></span></span></p><p><span><span><a href="http://paleogenomics.irmacs.sfu.ca/ANGES/" target="_blank" title="To ANGES official website">ANGES</a>:&nbsp;</span><a href="http://paleogenomics.irmacs.sfu.ca/ANGES/" target="_blank" title="To ANGES official website">reconstructing ANcestral GEnomeS maps</a></span></p><p><span><span><span><span><span><span>A suite of Python programs for reconstructing ancestral genome maps by comparing the organization of related extant genomes. ANGES can reconstruct ancestral genome maps for multichromosomal linear genomes and unichromosomal circular genomes. It implements methods inspired by techniques developed to compute physical maps of extant genomes.</span></span></span></span></span></span></p><p><a href="http://virulence.molgen.mpg.de/cocos/" target="_blank" title="To Cocos official website"><span>Cocos</span></a></p><p><span><span><span><span><span><span><span>Constructs phylogenies of multi-domain proteins. Given a species tree and domain phylogenies, the procedure infers the composition of ancestral multi-domain proteins. Cocos implements and extends an algorithmic approach suggested by Behzadi and Vingron in an easy-to-use program. Such a method could also be applied to the reconstruction of partial homologous units such as bacterial operons or protein complexes.</span></span></span></span></span></span></span></p><p><a href="https://github.com/msrosenberg/MySSP" target="_blank" title="To MySSP official website"><span>MySSP</span></a></p><p><span><span><span><span><span><span><span><span>Constructs an initial DNA sequence at the root of the tree and simulates evolution across the tree using a variety of common models of DNA evolution. MySSP is a program for the simulation of DNA sequence evolution across a phylogenetic tree. 
It is designed for large-scale studies, including simulation of multiple replicates, and outputs sequences in NEXUS, MEGA, or FASTA format. MySSP has a fairly simple graphical user interface (GUI) for basic use, but also has a specialized batch script interpreter to allow for more complicated or large-scale simulations.</span></span></span></span></span></span></span></span></p><p><span><span><a href="http://www.cs.cmu.edu/~ckingsf/software/parana/" target="_blank" title="To PARANA official website">PARANA</a>:&nbsp;</span><a href="http://www.cs.cmu.edu/~ckingsf/software/parana/" target="_blank" title="To PARANA official website">Parsimonious Ancestral Reconstruction And Network Analysis</a></span></p><p><span><span><span><span><span><span><span><span><span>Performs parsimony-based inference of ancestral biological networks. Given multiple extant networks and phylogenetic information relating extant nodes, PARANA finds a parsimonious set of ancestral interaction events (edge gains and losses) which explain the extant networks. The framework adopted by PARANA is able to represent network evolution under models that support gene duplication and loss and independent interaction gain and loss. The method works on both directed and undirected networks and can incorporate asymmetric interaction gain and loss costs. 
In contrast to previous approaches, PARANA does not require knowing the relative ordering of unrelated duplication events and thus works on phylogenetic trees even where branch lengths are not provided.</span></span></span></span></span></span></span></span></span></p><p><span><span><a href="http://www-labs.iro.umontreal.ca/~mabrouk/" target="_blank" title="To GapAdj official website">GapAdj</a>:&nbsp;</span><a href="http://www-labs.iro.umontreal.ca/~mabrouk/" target="_blank" title="To GapAdj official website">Gapped Adjacencies</a></span></p><p><span><span><span><span><span><span><span><span><span><span>A synteny-based method that is flexible enough to handle a model of evolution involving whole genome duplication events, in addition to rearrangements, gene insertions, and losses. Ancestral relationships between markers are defined in terms of Gapped Adjacencies, i.e. pairs of markers separated by up to a given number of markers. It improves on a previous method restricted to direct adjacencies, which achieved high accuracy for adjacency prediction but had the drawback of being overly conservative, i.e. of generating a large number of contiguous ancestral regions (CARs).</span></span></span></span></span></span></span></span></span></span></p><p><a href="http://ancestors.bioinfo.uqam.ca/"><span><span><span><span><span><span><span><span><span><span>Ancestors</span></span></span></span></span></span></span></span></span></span></a></p><p><span><span><span><span><span><span><span><span><span><span><span>A web server allowing one to easily and quickly perform the last three steps of the ancestral genome reconstruction procedure. Ancestors implements several alignment algorithms, an indel maximum likelihood solver and a context-dependent maximum likelihood substitution inference algorithm. 
The results presented by the server include the posterior probabilities for the last two steps of the ancestral genome reconstruction and the expected error rate of each ancestral base prediction.</span></span></span></span></span></span></span></span></span></span></span></p><p><a href="http://bioinfo.lifl.fr/procars/" target="_blank" title="To ProCARs official website"><span>ProCARs</span></a></p><p>Reconstructs ancestral gene orders as contiguous ancestral regions (CARs) with a progressive homology-based method. ProCARs takes as input a phylogenetic tree (branch lengths are not needed) with a marked ancestor and a block file. The method iteratively detects and assembles ancestral adjacencies, while allowing some micro-rearrangements of synteny blocks at the extremities of the progressively assembled CARs. It starts with a set of blocks as the initial set of CARs and iteratively detects the potential ancestral adjacencies between extremities of CARs, building up the CARs progressively by adding, at each step, the new non-conflicting adjacencies that induce the least homoplasy. The species tree is used, in some additional internal steps, to compute a score for the remaining conflicting adjacencies, and to detect other reliable adjacencies, in order to reach completely assembled ancestral genomes.</p><p><a href="http://fastml.tau.ac.il/" target="_blank" title="To FastML official website"><span>FastML</span></a></p><p>A user-friendly tool for the reconstruction of ancestral sequences. FastML implements various novel features that differentiate it from existing tools: (i) FastML uses an indel-coding method, in which each gap, possibly spanning multiple sites, is coded as binary data. FastML then reconstructs ancestral indel states assuming a continuous time Markov process. 
FastML provides the most likely ancestral sequences, integrating both indels and characters; (ii) FastML accounts for uncertainty in ancestral states: it provides not only the posterior probabilities for each character and indel at each sequence position, but also a sample of ancestral sequences from this posterior distribution, and a list of the k-most likely ancestral sequences; (iii) FastML implements a large array of evolutionary models, which makes it generic and applicable to nucleotide, protein and codon sequences; and (iv) a graphical representation of the results is provided, including, for example, a graphical logo of the inferred ancestral sequences.</p><p><a href="http://rth.dk/resources/maxAlike/" target="_blank" title="To maxAlike official website"><span>maxAlike</span></a></p><p>Reconstructs a genomic sequence for a specific taxon based on sequence homologs in other species. The input is a multiple sequence alignment and a phylogenetic tree that also contains the target species. For this target species, the algorithm computes nucleotide probabilities at each sequence position. Consensus sequences are then reconstructed based on a certain confidence level.</p><p><span><span><a href="http://www.geneorder.org/server.php" target="_blank" title="To MLGO official website">MLGO</a>:&nbsp;</span><a href="http://www.geneorder.org/server.php" target="_blank" title="To MLGO official website">Maximum Likelihood for Gene Order Analysis</a></span></p><p>A web tool for the reconstruction of phylogeny and/or ancestral genomes from gene-order data. MLGO was designed for analysis of large-scale genomic changes including not only rearrangements but also gene insertions, deletions and duplications. MLGO can be used to infer a phylogeny from genome rearrangement and gene order data, and can also estimate ancestral genomes, given an input tree. 
MLGO takes advantage of binary encoding of gene-order data, supports a fairly general model of genomic evolution (rearrangements plus duplications, insertions, and losses of genomic regions), and fits into the maximum-likelihood framework.</p><p>Image reference: Wikipedia</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36849/glean-an-unsupervised-learning-system-to-integrate-disparate-sources-of-gene-structure-evidence</guid>
	<pubDate>Sat, 02 Jun 2018 07:38:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36849/glean-an-unsupervised-learning-system-to-integrate-disparate-sources-of-gene-structure-evidence</link>
	<title><![CDATA[GLEAN: an unsupervised learning system to integrate disparate sources of gene structure evidence]]></title>
	<description><![CDATA[<p><span>GLEAN is an unsupervised learning system that integrates disparate sources of gene structure evidence (gene model predictions, EST/protein genomic sequence alignments, SAGE/peptide tags, etc.) to produce a consensus gene prediction, without prior training.</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/glean-gene/" rel="nofollow">https://sourceforge.net/projects/glean-gene/</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38541/geneoverlap-an-r-package-to-test-and-visualize-gene-overlaps</guid>
	<pubDate>Thu, 27 Dec 2018 19:45:52 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38541/geneoverlap-an-r-package-to-test-and-visualize-gene-overlaps</link>
	<title><![CDATA[GeneOverlap: An R package to test and visualize gene overlaps]]></title>
	<description><![CDATA[<p>Overlapping gene lists can reveal biological meaning and may lead to novel hypotheses. For example, histone modification is an important cellular mechanism that can pack and re-pack chromatin. By making the chromatin structure denser or looser, it can turn gene expression off or on. Tri-methylation on lysine 4 of histone H3 (H3K4me3) is associated with gene activation, and its genome-wide enrichment can be mapped using ChIP-seq experiments. Because of its activating role, if we overlap the genes that are bound by H3K4me3 with the genes that are highly expressed, we should expect a positive association. Similarly, we can overlap the gene lists of different histone modifications with those of various expression groups to establish each histone modification&rsquo;s role in gene regulation.</p><p>Address of the bookmark: <a href="https://bioconductor.org/packages/release/bioc/vignettes/GeneOverlap/inst/doc/GeneOverlap.pdf" rel="nofollow">https://bioconductor.org/packages/release/bioc/vignettes/GeneOverlap/inst/doc/GeneOverlap.pdf</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41969/shadowcaster-a-hybrid-approach-for-the-detection-of-horizontal-gene-transfer-events-in-prokaryotes</guid>
	<pubDate>Tue, 14 Jul 2020 06:42:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41969/shadowcaster-a-hybrid-approach-for-the-detection-of-horizontal-gene-transfer-events-in-prokaryotes</link>
	<title><![CDATA[ShadowCaster: a hybrid approach for the detection of horizontal gene transfer events in prokaryotes]]></title>
	<description><![CDATA[<p><span>ShadowCaster implements an evolutionary model to calculate Bayesian likelihoods for each &lsquo;alien gene&rsquo;, i.e. a gene whose sequence composition is unusual relative to the host genome background, in order to detect horizontal gene transfer (HGT) events in prokaryotes.</span></p>
<p><a href="https://www.mdpi.com/2073-4425/11/7/756/htm">https://www.mdpi.com/2073-4425/11/7/756/htm</a></p>
<p><a href="https://shadowcaster.readthedocs.io/en/latest/">https://shadowcaster.readthedocs.io/en/latest/</a></p>
<p><a href="https://github.com/dani2s/ShadowCaster_testData">https://github.com/dani2s/ShadowCaster_testData</a></p><p>Address of the bookmark: <a href="https://github.com/dani2s/ShadowCaster" rel="nofollow">https://github.com/dani2s/ShadowCaster</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44327/homologizer-phylogenetic-phasing-of-gene-copies-into-polyploid-subgenomes</guid>
	<pubDate>Sat, 03 Jun 2023 19:19:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44327/homologizer-phylogenetic-phasing-of-gene-copies-into-polyploid-subgenomes</link>
	<title><![CDATA[homologizer: Phylogenetic phasing of gene copies into polyploid subgenomes]]></title>
	<description><![CDATA[<p dir="auto">This tutorial describes the usage of&nbsp;<code>homologizer</code>&nbsp;to phase gene copies into polyploid subgenomes. The tutorial is an abbreviated version of a soon-to-be published paper in Methods in Molecular Biology. Please see that paper for many more details and practical considerations for running&nbsp;<code>homologizer</code>&nbsp;analyses. If you use&nbsp;<code>homologizer</code>, please cite the paper in which we first describe the method:</p>
<ul dir="auto">
<li>Freyman, W.A., Johnson, M.G., and C.J. Rothfels. 2022. Homologizer: phylogenetic phasing of gene copies into polyploid subgenomes.&nbsp;<em>bioRxiv</em>&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2020.10.22.351486v4">2020.10.22.351486v4</a></li>
</ul>
<p dir="auto"><code>homologizer</code>&nbsp;is implemented in&nbsp;<code>RevBayes</code>. Please see&nbsp;<a href="http://revbayes.com/">http://revbayes.com</a>&nbsp;to download and install&nbsp;<code>RevBayes</code>. For users without previous&nbsp;<code>RevBayes</code>&nbsp;experience, we recommend the tutorials at&nbsp;<a href="http://revbayes.com/">http://revbayes.com</a>.</p><p>Address of the bookmark: <a href="https://github.com/wf8/homologizer" rel="nofollow">https://github.com/wf8/homologizer</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/9673/now-time-is-come-to-revolutionize-amino-acid-sequencing-by-nanopore-technology</guid>
	<pubDate>Mon, 07 Apr 2014 08:01:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/9673/now-time-is-come-to-revolutionize-amino-acid-sequencing-by-nanopore-technology</link>
	<title><![CDATA[The time has come to revolutionize amino acid sequencing with nanopore technology]]></title>
	<description><![CDATA[<p>Amino acid sequencing by Nanopore recognition tunneling method</p><p>Address of the bookmark: <a href="http://www.eurekalert.org/multimedia/pub/71198.php" rel="nofollow">http://www.eurekalert.org/multimedia/pub/71198.php</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</guid>
	<pubDate>Tue, 22 Nov 2016 04:51:39 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29912/maq-mapping-and-assembly-with-quality</link>
	<title><![CDATA[Maq: Mapping and Assembly with Quality]]></title>
	<description><![CDATA[<p><strong>Maq</strong>&nbsp;stands for&nbsp;<em>Mapping and Assembly with Quality</em>. It builds assemblies by mapping short reads to reference sequences. Maq is a project hosted by&nbsp;<a href="http://sourceforge.net/">SourceForge.net</a>. The project page is available at&nbsp;<a href="http://sourceforge.net/projects/maq/">http://sourceforge.net/projects/maq/</a>. Maq was previously known as mapass2.</p>
<h2>Run Maq Now</h2>
<p>Follow these steps to try Maq. All you need is a reference sequence file in the FASTA format.</p>
<ol>
<li>Prepare a reference sequence (ref.fasta), preferably a bacterial genome.</li>
<li>Download maq, maq-data and maqview from the&nbsp;<a href="http://sourceforge.net/project/showfiles.php?group_id=191815">download page</a>.</li>
<li>Copy maq, maq.pl and maq_eval.pl to the $PATH or to the same directory.</li>
<li>Simulate diploid reference and read sequences, map reads, call variants and evaluate the results in one go:
<pre>maq.pl demo ref.fasta calib-30.dat
</pre>
where&nbsp;<em>calib-30.dat</em>&nbsp;is contained in maq-data.</li>
<li>View the alignment:
<pre>cd maqdemo/easyrun;
maqindex -i -c consensus.cns all.map;
maqview -c consensus.cns all.map</pre>
</li>
</ol>
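<p>The steps above can also be scripted. The following is a minimal sketch in Python (3.8 or later, for <code>shlex.join</code>) that only assembles and prints the command lines quoted in steps 4 and 5; it does not run Maq. The helper names <code>demo_command</code>, <code>view_commands</code> and <code>render</code> are hypothetical and are not part of Maq, and the file names are simply the ones used in the steps above.</p>
<pre>
```python
import shlex

# Hypothetical helpers (not part of Maq): they only assemble the command
# lines quoted in the steps above and do not execute anything.


def demo_command(reference="ref.fasta", calibration="calib-30.dat"):
    """Step 4: simulate reads, map them, call variants and evaluate."""
    return ["maq.pl", "demo", reference, calibration]


def view_commands(consensus="consensus.cns", alignment="all.map"):
    """Step 5: index the alignment, then open it in maqview."""
    return [
        ["maqindex", "-i", "-c", consensus, alignment],
        ["maqview", "-c", consensus, alignment],
    ]


def render(command):
    """Return a shell-safe, copy-pasteable string for one command."""
    return shlex.join(command)


if __name__ == "__main__":
    print(render(demo_command()))
    # Step 5 is run from the directory created by the demo.
    print("cd maqdemo/easyrun")
    for command in view_commands():
        print(render(command))
```
</pre>
<p>Keeping each command as a list of arguments makes it straightforward to pass it to a process runner later, should you prefer to execute the steps rather than print them.</p>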
<p><strong>Even for advanced maq users, running `maq.pl demo' is recommended. You may find something helpful.</strong></p><p>Address of the bookmark: <a href="http://maq.sourceforge.net" rel="nofollow">http://maq.sourceforge.net</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The &quot;Ifs&quot; and &quot;Buts&quot; of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBduk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37602/indexcov-fast-coverage-quality-control-for-whole-genome-sequencing</guid>
	<pubDate>Wed, 29 Aug 2018 09:20:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37602/indexcov-fast-coverage-quality-control-for-whole-genome-sequencing</link>
	<title><![CDATA[Indexcov: fast coverage quality control for whole-genome sequencing]]></title>
	<description><![CDATA[<p><em>indexcov</em><span>&nbsp;is an efficient estimator of whole-genome sequencing coverage that can rapidly identify samples with aberrant coverage profiles, reveal large-scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample.&nbsp;</span><em>Indexcov</em><span>&nbsp;is available at&nbsp;</span><a href="https://github.com/brentp/goleft" target="_blank">https://github.com/brentp/goleft</a><span>&nbsp;under the MIT license.</span></p><p>Address of the bookmark: <a href="https://github.com/brentp/goleft" rel="nofollow">https://github.com/brentp/goleft</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>