<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44234?offset=120</link>
	<atom:link href="https://bioinformaticsonline.com/related/44234?offset=120" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36597/gappadder-a-sensitive-approach-for-closing-gaps-on-draft-genomes-with-short-sequence-reads</guid>
	<pubDate>Mon, 14 May 2018 05:25:48 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36597/gappadder-a-sensitive-approach-for-closing-gaps-on-draft-genomes-with-short-sequence-reads</link>
	<title><![CDATA[GAPPadder: A Sensitive Approach for Closing Gaps on Draft Genomes with Short Sequence Reads]]></title>
	<description><![CDATA[<p><span>This software is provided &ldquo;as is&rdquo; without warranty of any kind. In no event shall the author be held responsible for any damage resulting from the use of this software. The program package, including source codes, executables, and this documentation, is distributed free of charge. If you use this program in a publication, please cite the following reference:</span><br><span>Chong Chu, Xin Li, and Yufeng Wu. "GAPPadder: A Sensitive Approach for Closing Gaps on Draft Genomes with Short Sequence Reads." bioRxiv (2017): 125534.</span></p><p>Address of the bookmark: <a href="https://github.com/Reedwarbler/GAPPadder" rel="nofollow">https://github.com/Reedwarbler/GAPPadder</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40940/consed-a-finishing-package-bam-file-viewer-assembly-editor-autofinish-autoreport-autoedit-and-align-reads-to-reference-sequence</guid>
	<pubDate>Fri, 07 Feb 2020 07:16:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40940/consed-a-finishing-package-bam-file-viewer-assembly-editor-autofinish-autoreport-autoedit-and-align-reads-to-reference-sequence</link>
	<title><![CDATA[Consed--A Finishing Package (BAM File Viewer, Assembly Editor, Autofinish, Autoreport, Autoedit, and Align Reads To Reference Sequence)]]></title>
	<description><![CDATA[<ul>
<li>Supports Illumina, 454, other Next-Gen and Sanger Reads and allows mixtures of these read types</li>
<li>Consed includes BamScape, which can view BAM files with unlimited numbers of reads. BamScape can bring up Consed to edit reads and the reference sequence in targeted regions.</li>
<li>Consed is compatible with Newbler, Cross_match, Phrap, MIRA, Velvet and PCAP output.</li>
<li>Quickly takes the user to each variant site for viewing (also available as an automated report)</li>
<li>Overview of assembly can help detect and fix misassemblies</li>
<li>Editing time reduced by the program's ability to pin-point problem areas</li>
<li>Editing is guided by error probabilities</li>
</ul><p>Address of the bookmark: <a href="http://www.phrap.org/consed/consed.html" rel="nofollow">http://www.phrap.org/consed/consed.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBDuk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
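<p>As a rough illustration of how a quality threshold and a minimum read length interact, here is a minimal Python sketch of 3'-end quality trimming. The function name <code>trim_read</code>, the parameter names, and the default values (Phred 20, 36 bases) are hypothetical choices for this example, and Phred+33 quality encoding is assumed. It is not how Trimmomatic, Cutadapt, or BBDuk implement trimming.</p>

```python
def trim_read(seq, qual, min_q=20, min_len=36, offset=33):
    """Trim low-quality bases from the 3' end of one read.

    Returns (seq, qual) after trimming, or None if the read ends up
    shorter than min_len. Assumes Phred+offset encoded qualities.
    """
    end = len(seq)
    # Walk back from the 3' end while the Phred score is below min_q.
    while end > 0 and min_q > ord(qual[end - 1]) - offset:
        end -= 1
    # Discard reads that are too short to be useful after trimming.
    if min_len > end:
        return None
    return seq[:end], qual[:end]
```

<p>Raising <code>min_q</code> or <code>min_len</code> in a sketch like this discards more data, which is the over-trimming trade-off described above.</p>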
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26587/last</guid>
	<pubDate>Wed, 09 Mar 2016 14:27:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26587/last</link>
	<title><![CDATA[LAST]]></title>
	<description><![CDATA[<p style="text-align: center;"><img src="http://last.cbrc.jp/lastwebfig.png" alt="sketch of  similar regions in sequences" style="border: 0px;"></p>
<p>LAST can:</p>
<ul>
<li>Handle <strong>big</strong> sequence data, e.g.:
<ul>
<li>Compare two vertebrate genomes</li>
<li>Align billions of DNA reads to a genome</li>
</ul>
</li>
<li>Indicate the <a href="http://lastweb.cbrc.jp/about.html">reliability</a> of each aligned column.</li>
<li>Use sequence quality data <a href="http://nar.oxfordjournals.org/content/38/7/e100.abstract">properly</a>.</li>
<li>Compare DNA to proteins, with frameshifts.</li>
<li>Compare PSSMs to sequences</li>
<li>Calculate the likelihood of chance similarities between random sequences.</li>
<li>Do split and spliced alignment.</li>
<li><a href="http://last.cbrc.jp/doc/last-train.html">Train</a> alignment parameters for unusual kinds of sequence (e.g. nanopore).</li>
</ul><p>Address of the bookmark: <a href="http://last.cbrc.jp/" rel="nofollow">http://last.cbrc.jp/</a></p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38310/sisrs-site-identification-from-short-read-sequences</guid>
	<pubDate>Wed, 28 Nov 2018 08:56:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38310/sisrs-site-identification-from-short-read-sequences</link>
	<title><![CDATA[SISRS: Site Identification from Short Read Sequences]]></title>
	<description><![CDATA[<p>SISRS takes next-gen sequence data, such as Illumina HiSeq reads, as input. Data must be sorted into folders by taxon (e.g. species or genus). Paired reads in fastq format must be marked with _R1 and _R2 in the (otherwise identical) filenames. Paired and unpaired reads must have a fastq file extension.</p><p>Address of the bookmark: <a href="https://github.com/rachelss/SISRS" rel="nofollow">https://github.com/rachelss/SISRS</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43799/kast</guid>
	<pubDate>Wed, 23 Feb 2022 08:28:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43799/kast</link>
	<title><![CDATA[KAST]]></title>
	<description><![CDATA[<p><span>Perform Alignment-free k-tuple frequency comparisons from sequences. This can be in the form of two input files (e.g. a reference and a query) or a single file for pairwise comparisons to be made.</span></p><p>Address of the bookmark: <a href="https://github.com/martinjvickers/KAST" rel="nofollow">https://github.com/martinjvickers/KAST</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34734/smash-an-alignment-free-tool-to-find-and-visualise-rearrangements-between-pairs-of-dna-sequences</guid>
	<pubDate>Thu, 21 Dec 2017 08:26:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34734/smash-an-alignment-free-tool-to-find-and-visualise-rearrangements-between-pairs-of-dna-sequences</link>
	<title><![CDATA[SMASH: An alignment-free tool to find and visualise rearrangements between pairs of DNA sequences]]></title>
	<description><![CDATA[<p style="text-align: justify;"><span>SMASH is a completely alignment-free method to find and visualise rearrangements between pairs of DNA sequences</span>. The detection is based on&nbsp;<span>relative compression</span>, namely using a finite-context model (FCM), also known as a Markov model, of high context order (typically 20). The method is implemented in a tool (also called SMASH). For visualization, SMASH outputs an SVG image, with an ideogram output architecture, where the patterns are represented with several HSV values (only the value component varies). The following image, which shows the information maps between human and chimpanzee for several chromosomes, is an example:</p>
<p><a href="https://github.com/pratas/smash/blob/master/imgs/HC.png" target="_blank"><img src="https://github.com/pratas/smash/raw/master/imgs/HC.png" alt="ScreenShot" style="border: 0px;"></a></p>
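<p>To give a feel for relative compression with a finite-context model, here is a minimal Python sketch. It counts order-<code>k</code> contexts in a reference sequence and estimates how many bits per base a target sequence would cost under those counts. The function name, the small default order <code>k=3</code> (chosen only to keep the example readable), the additive smoothing <code>alpha</code>, and the four-symbol alphabet are assumptions made for illustration. This is not the SMASH implementation.</p>

```python
import math
from collections import defaultdict

def relative_bits_per_base(reference, target, k=3, alpha=1.0):
    """Average bits per base to encode target using only counts from reference."""
    # Count which symbol follows each length-k context in the reference.
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(reference) - k):
        counts[reference[i:i + k]][reference[i + k]] += 1

    total_bits, n = 0.0, 0
    for i in range(len(target) - k):
        ctx, sym = target[i:i + k], target[i + k]
        seen = counts.get(ctx, {})
        # Smoothed probability of the symbol over a 4-letter DNA alphabet.
        p = (seen.get(sym, 0) + alpha) / (sum(seen.values()) + 4 * alpha)
        total_bits -= math.log2(p)
        n += 1
    return total_bits / n if n else 0.0
```

<p>Under these assumptions, a target region that costs few bits per base is well predicted by the reference, while a region with no matching contexts costs about 2 bits per base.</p>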
<p>Address of the bookmark: <a href="https://github.com/pratas/smash" rel="nofollow">https://github.com/pratas/smash</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37796/grsr-a-tool-for-deriving-genome-rearrangement-scenarios-from-multiple-unichromosomal-genome-sequences</guid>
	<pubDate>Fri, 28 Sep 2018 09:35:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37796/grsr-a-tool-for-deriving-genome-rearrangement-scenarios-from-multiple-unichromosomal-genome-sequences</link>
	<title><![CDATA[GRSR: a tool for deriving genome rearrangement scenarios from multiple unichromosomal genome sequences]]></title>
	<description><![CDATA[<p>GRSR is a tool for deriving genome rearrangement scenarios for multiple unichromosomal genomes. It performs the following steps:</p>
<ul>
<li>Step 1. Run Mugsy to get multiple sequence alignment results.</li>
<li>Steps 2 &amp; 3. Extract the coordinates of core blocks, construct synteny blocks, and generate signed permutations.</li>
<li>Step 4. Generate pairwise genome rearrangement scenarios and find repeats at the breakpoints of each rearrangement event.</li>
</ul>
<p>Address of the bookmark: <a href="https://github.com/DanwangJessica/GRSR" rel="nofollow">https://github.com/DanwangJessica/GRSR</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38452/silix-implements-an-ultra-efficient-algorithm-for-the-clustering-of-homologous-sequences</guid>
	<pubDate>Wed, 12 Dec 2018 09:22:41 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38452/silix-implements-an-ultra-efficient-algorithm-for-the-clustering-of-homologous-sequences</link>
	<title><![CDATA[SiLiX: implements an ultra-efficient algorithm for the clustering of homologous sequences]]></title>
	<description><![CDATA[<p>The software package SiLiX implements<strong>&nbsp;an ultra-efficient algorithm for the clustering of homologous sequences</strong>, based on single transitive links (<em>single linkage</em>) with alignment coverage constraints.</p>
<p>SiLiX adopts a graph-theoretical framework to interpret similarity pairs as edges of a network. A very efficient algorithm, based on the&nbsp;<em>Disjoint Sets Data Structure</em>, allows the computation of sequence families with&nbsp;<strong>low time and space requirements</strong>.</p>
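<p>The idea of single-linkage clustering with a disjoint-sets structure can be sketched in a few lines of Python. The function name <code>cluster</code>, the <code>(seq_a, seq_b, coverage)</code> input format, and the 0.8 coverage threshold are hypothetical choices for this example and are not taken from SiLiX itself.</p>

```python
def cluster(pairs, min_coverage=0.8):
    """Single-linkage families from (seq_a, seq_b, coverage) similarity pairs."""
    parent = {}

    def find(x):
        # Register unseen sequences as their own root, then follow parents.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b, coverage in pairs:
        root_a, root_b = find(a), find(b)
        # A pair is an edge only if it meets the coverage constraint.
        if coverage >= min_coverage and root_a != root_b:
            parent[root_a] = root_b

    # Group every sequence under the root of its set.
    families = {}
    for x in parent:
        families.setdefault(find(x), set()).add(x)
    return sorted(sorted(f) for f in families.values())
```

<p>For example, the pairs <code>("s1", "s2", 0.9)</code> and <code>("s2", "s3", 0.85)</code> place s1, s2, and s3 in one family through the transitive link, while a pair below the threshold leaves its sequences in separate families.</p>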
<p><strong>A parallel version</strong>&nbsp;of SiLiX, based on MPI, is also available in this package and has been shown to be scalable, so it allows the study of&nbsp;<strong>very large datasets</strong>.</p>
<p>SiLiX is already included in the analysis pipeline for&nbsp;<a href="http://pbil.univ-lyon1.fr/databases/hogenom/acceuil.php">HOGENOM</a>.</p><p>Address of the bookmark: <a href="http://lbbe.univ-lyon1.fr/SiLiX?lang=fr" rel="nofollow">http://lbbe.univ-lyon1.fr/SiLiX?lang=fr</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39624/cogent-a-tool-for-reconstructing-the-coding-genome-using-high-quality-full-length-transcriptome-sequences</guid>
	<pubDate>Tue, 18 Jun 2019 05:33:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39624/cogent-a-tool-for-reconstructing-the-coding-genome-using-high-quality-full-length-transcriptome-sequences</link>
	<title><![CDATA[Cogent: a tool for reconstructing the coding genome using high-quality full-length transcriptome sequences.]]></title>
	<description><![CDATA[<div id="yui_3_14_1_1_1560853173251_3865">Cogent is a tool that identifies gene&nbsp;families and reconstructs the coding genome using high-quality transcriptome data without a reference genome, and can be used to check&nbsp;assemblies&nbsp;for the presence of&nbsp;these known coding sequences.</div>
<div>&nbsp;</div>
<div>
<p>Cogent is a tool for reconstructing the coding genome using high-quality full-length transcriptome sequences. It is designed to be used on&nbsp;<a href="https://github.com/PacificBiosciences/cDNA_primer/wiki">Iso-Seq data</a>&nbsp;and in cases where there is no reference genome or the reference genome is highly incomplete.</p>
<p>See a&nbsp;<a href="https://www.dropbox.com/s/mn6hwhguh0pqceu/20160106_Cogent_developers_conference_slides_Cuttlefish.pdf?dl=0">recent presentation</a>&nbsp;on Cogent being applied to the Cuttlefish Iso-Seq data.</p>
<p><a href="https://www.dropbox.com/s/kz0gi7qg0w82k9a/20161026_Cogent_manuscript_forGitHub.pdf?dl=0">Cogent preliminary draft paper (updated 2016Dec version)</a>,&nbsp;<a href="https://www.dropbox.com/s/37412o8glvnfhf9/20161026_Cogent_ManuscriptPlusSupplement_forGitHub.pdf?dl=0">Supplementary</a></p>
<p>Please see&nbsp;<a href="https://github.com/Magdoll/Cogent/wiki">wiki</a>&nbsp;for details on usage.</p>
</div><p>Address of the bookmark: <a href="https://github.com/Magdoll/Cogent" rel="nofollow">https://github.com/Magdoll/Cogent</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>