<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37225?offset=220</link>
	<atom:link href="https://bioinformaticsonline.com/related/37225?offset=220" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41730/parliament2-runs-a-combination-of-tools-to-generate-structural-variant-calls-on-whole-genome-sequencing-data</guid>
	<pubDate>Thu, 28 May 2020 21:57:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41730/parliament2-runs-a-combination-of-tools-to-generate-structural-variant-calls-on-whole-genome-sequencing-data</link>
	<title><![CDATA[Parliament2: Runs a combination of tools to generate structural variant calls on whole-genome sequencing data]]></title>
	<description><![CDATA[<p>Parliament2 identifies structural variants in a given sample relative to a reference genome. These structural variants are large events that are called as deletions of a region, insertions of a sequence into a region, duplications of a region, inversions of a region, or translocations between two regions in the genome.</p>
<p>Parliament2 runs a combination of tools to generate structural variant calls on whole-genome sequencing data. It can run the following callers: Breakdancer, Breakseq2, CNVnator, Delly2, Manta, and Lumpy. Because of synergies in how the programs use computational resources, these are all run in parallel. Parliament2 will produce the outputs of each of the tools for subsequent investigation.</p><p>Address of the bookmark: <a href="https://github.com/dnanexus/parliament2" rel="nofollow">https://github.com/dnanexus/parliament2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/43227/project-associate-i-project-associate-ii-senior-project-associate-igib</guid>
  <pubDate>Thu, 05 Aug 2021 16:11:32 -0500</pubDate>
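  <link>https://bioinformaticsonline.com/opportunity/view/43227/project-associate-i-project-associate-ii-senior-project-associate-igib</link>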
  <title><![CDATA[Project Associate-I | Project Associate-II | Senior Project Associate @ IGIB]]></title>
  <description><![CDATA[
<p>Experience in Next Generation Sequencing (NGS) applications and an interest in genomics, clinical, or translational applications, OR good computational programming skills and a deep interest in working at the interface of genomics and clinical applications.</p>

<p>Project Scientist-I <br />Experimental or computational analysis experience in high-throughput genomics or clinical applications.</p>

<p>Project Manager <br />Experience in handling large biological projects involving high-throughput genomics or clinical applications.</p>

<p>Scientific Administrative Assistant <br />Lab Work. </p>

<p>More at https://vinodscaria.genomes.in/positionsopen</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBDuk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37674/qualimap2-evaluating-next-generation-sequencing-alignment-data</guid>
	<pubDate>Tue, 11 Sep 2018 04:44:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37674/qualimap2-evaluating-next-generation-sequencing-alignment-data</link>
	<title><![CDATA[Qualimap2: Evaluating next generation sequencing alignment data]]></title>
	<description><![CDATA[<p><strong>Qualimap 2</strong><span>&nbsp;is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.&nbsp;</span><br><br><span>Supported types of experiments include:</span></p>
<ul>
<li>Whole-genome sequencing</li>
<li>Whole-exome sequencing</li>
<li>RNA-seq (special mode available)</li>
<li>ChIP-seq</li>
</ul><p>Address of the bookmark: <a href="http://qualimap.bioinfo.cipf.es/" rel="nofollow">http://qualimap.bioinfo.cipf.es/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39726/jackalope-a-swift-versatile-phylogenomic-and-high-throughput-sequencing-simulator</guid>
	<pubDate>Fri, 26 Jul 2019 00:58:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39726/jackalope-a-swift-versatile-phylogenomic-and-high-throughput-sequencing-simulator</link>
	<title><![CDATA[jackalope: A swift, versatile phylogenomic and high-throughput sequencing simulator]]></title>
	<description><![CDATA[<p><code>jackalope</code> simply and efficiently simulates (i) variants from reference genomes and (ii) reads from both Illumina and Pacific Biosciences (PacBio) platforms. It can either read reference genomes from FASTA files or simulate new ones. Genomic variants can be simulated using summary statistics, phylogenies, Variant Call Format (VCF) files, and coalescent simulations, the latter of which can include selection, recombination, and demographic fluctuations. <code>jackalope</code> can simulate single, paired-end, or mate-pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing, and optical/PCR duplicates. All outputs can be written to standard file formats.</p>
<p><span>A swift, versatile phylogenomic and high-throughput sequencing simulator </span> <span><a href="https://jackalope.lucasnell.com">https://jackalope.lucasnell.com</a></span></p><p>Address of the bookmark: <a href="https://github.com/lucasnell/jackalope" rel="nofollow">https://github.com/lucasnell/jackalope</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40715/mutatrix-a-population-genome-simulator-which-generates-simulated-genomes</guid>
	<pubDate>Tue, 28 Jan 2020 04:06:58 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40715/mutatrix-a-population-genome-simulator-which-generates-simulated-genomes</link>
	<title><![CDATA[mutatrix: a population genome simulator which generates simulated genomes.]]></title>
	<description><![CDATA[<p><span>Genome simulation across a population with zeta-distributed allele frequencies, SNPs, insertions, deletions, and multi-nucleotide polymorphisms.</span></p>
<p><span>More at&nbsp;<a href="https://github.com/ekg/mutatrix">https://github.com/ekg/mutatrix</a></span></p>
<pre>./mutatrix -S sample -P test/ -p 2 -n 10 reference.fasta</pre><p>Address of the bookmark: <a href="https://github.com/ekg/mutatrix" rel="nofollow">https://github.com/ekg/mutatrix</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/18738/surrogate-variable-analysis-sva</guid>
	<pubDate>Thu, 30 Oct 2014 08:01:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/18738/surrogate-variable-analysis-sva</link>
	<title><![CDATA[Surrogate Variable Analysis (SVA)]]></title>
	<description><![CDATA[<p>The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways:</p><p>(1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS),</p><p>(2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and</p><p>(3) removing batch effects with known control probes (Leek 2014 bioRxiv).</p><p>Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility (see Leek and Storey 2007 PLoS Genetics, 2008 PNAS, or Leek et al. 2011 Nat. Reviews Genetics).</p><p>More at http://www.bioconductor.org/packages/release/bioc/html/sva.html</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40889/rcorrector-efficient-and-accurate-error-correction-for-illumina-rna-seq-reads</guid>
	<pubDate>Tue, 04 Feb 2020 23:23:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40889/rcorrector-efficient-and-accurate-error-correction-for-illumina-rna-seq-reads</link>
	<title><![CDATA[Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads]]></title>
	<description><![CDATA[<p><span>Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from&nbsp;</span><a href="https://github.com/mourisl/Rcorrector/" target="_blank">https://github.com/mourisl/Rcorrector/</a><span>.</span></p>
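<p>As a rough illustration, a paired-end invocation could be assembled from the options documented in the usage text that follows. This is only a sketch: the helper name <code>build_rcorrector_command</code> and the FASTQ file names are hypothetical, and only the flags shown in the usage text (-1, -2, -k, -t, -od, -trim) are used. It builds the argument list without running anything.</p>

```python
import shlex


def build_rcorrector_command(left_reads, right_reads, kmer_length=23,
                             threads=1, out_dir="./", trim=False):
    """Assemble a paired-end run_rcorrector.pl call as an argument list.

    Hypothetical helper: only flags listed in the usage text are emitted.
    """
    # The usage text caps the kmer length at 32.
    if kmer_length > 32:
        raise ValueError("kmer_length must not exceed 32")
    command = [
        "perl", "run_rcorrector.pl",
        "-1", ",".join(left_reads),    # first mates, comma separated
        "-2", ",".join(right_reads),   # second mates, comma separated
        "-k", str(kmer_length),
        "-t", str(threads),
        "-od", out_dir,
    ]
    if trim:
        command.append("-trim")        # optional: allow trimming
    return command


# Hypothetical input files, used only for illustration.
command = build_rcorrector_command(["sample_R1.fq"], ["sample_R2.fq"], threads=4)
print(shlex.join(command))
```

<p>The resulting list could be handed to <code>subprocess.run</code> once the script path and input files match the local setup.</p>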
<pre><code>Usage: perl run_rcorrector.pl [OPTIONS]
OPTIONS:
	Required
	-s seq_files: comma separated files for single-end data sets
	-1 seq_files_left: comma separated files for the first mate in the paired-end data sets
	-2 seq_files_right: comma separated files for the second mate in the paired-end data sets
	-i seq_files_interleaved: comma separated files for interleaved paired-end data sets
	Optional
	-k INT: kmer_length (&lt;=32, default: 23)
	-od STRING: output_file_directory (default: ./)
	-t INT: number of threads to use (default: 1)
	-trim : allow trimming (default: false)
	-maxcorK INT: the maximum number of correction within k-bp window (default: 4)
	-wk FLOAT: the proportion of kmers that are used to estimate weak kmer count threshold, lower for more divergent genome (default: 0.95)
	-ek INT: expected number of kmers; does not affect the correctness of program but affects the memory usage (default: 100000000)
	-stdout: output the corrected reads to stdout (default: not used)
	-verbose: output some correction information to stdout (default: not used)
	-stage INT: start from which stage (default: 0)
		0-start from beginning (storing kmers in bloom filter);
		1-start from counting kmers that showed up in bloom filter;
		2-start from dumping kmer counts into a jf_dump file;
		3-start from error correction.</code></pre><p>Address of the bookmark: <a href="https://github.com/mourisl/Rcorrector/" rel="nofollow">https://github.com/mourisl/Rcorrector/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/9032/encode-sequencing-data-freely-available-to-download-and-use-for-academic-means</guid>
	<pubDate>Thu, 13 Mar 2014 18:18:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/9032/encode-sequencing-data-freely-available-to-download-and-use-for-academic-means</link>
	<title><![CDATA[Encode sequencing data freely available to download and use for academic means]]></title>
	<description><![CDATA[<p>In <span style="text-decoration: underline;"><strong>ENCODE</strong></span>,&nbsp;<span>regulatory elements are investigated via DNA hypersensitivity assays, assays of DNA methylation, and chromatin immunoprecipitation (ChIP) of proteins that interact with DNA, including modified histones and transcription factors, followed by sequencing (ChIP-Seq).</span></p>
<p><span>More information:</span></p>
<p><span>https://genome.ucsc.edu/ENCODE/pilot.html</span></p>
<p>Address of the bookmark: <a href="https://genome.ucsc.edu/ENCODE/" rel="nofollow">https://genome.ucsc.edu/ENCODE/</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/23582/integrative-rna-and-chip-seq-analysis-of-regulatory-t-cells</guid>
	<pubDate>Tue, 04 Aug 2015 05:03:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/23582/integrative-rna-and-chip-seq-analysis-of-regulatory-t-cells</link>
	<title><![CDATA[Integrative RNA and ChIP-Seq analysis of regulatory T-cells]]></title>
	<description><![CDATA[<p><a href="http://www.strand-ngs.com/learn/white-papers#rna-chip" target="_blank" title="strand ngs white paper">Integrative RNA and ChIP-Seq analysis of regulatory T-cells</a><span>, a Strand NGS application note, describes how integrated multi-omics functionality in Strand NGS was used to find the regulatory role of FoxP3 in T-regulatory and T-helper cells. Learn how the gene expression profiles from RNA-Seq and FoxP3 DNA-protein binding sites from ChIP-Seq are integrated. For more information,&nbsp;</span><a href="http://www.strand-ngs.com/contact/sales" target="_blank" title="strand ngs contact">please write to us</a>.</p>]]></description>
	<dc:creator>Strand</dc:creator>
</item>

</channel>
</rss>