<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/38623?offset=40</link>
	<atom:link href="https://bioinformaticsonline.com/related/38623?offset=40" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44707/rna-seq-analysis-a-guide-for-bioinformaticians</guid>
	<pubDate>Sat, 07 Dec 2024 22:22:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44707/rna-seq-analysis-a-guide-for-bioinformaticians</link>
	<title><![CDATA[RNA-Seq Analysis: A Guide for Bioinformaticians]]></title>
	<description><![CDATA[<p>RNA sequencing (RNA-Seq) has revolutionized transcriptomics, offering unprecedented insights into gene expression, splicing, and transcript diversity. For bioinformaticians, RNA-Seq analysis is a gateway to exploring the complexity of RNA biology and its implications in health and disease. This blog post provides an overview of RNA-Seq analysis, key computational steps, and tools for bioinformaticians eager to delve into this powerful technique.</p><h3>What is RNA-Seq?</h3><p>RNA-Seq is a next-generation sequencing (NGS) technology used to study the transcriptome&mdash;the complete set of RNA molecules in a cell. It quantifies gene expression, detects novel transcripts, and captures alternative splicing events with high sensitivity and resolution.</p><h3>Workflow for RNA-Seq Analysis</h3><p>RNA-Seq analysis involves several stages, each requiring computational tools and expertise.</p><h4>1. <strong>Experimental Design and Data Acquisition</strong></h4><p>Before diving into analysis, bioinformaticians should consider:</p><ul>
<li><strong>Biological Replicates</strong>: Ensure statistical power to detect meaningful differences.</li>
<li><strong>Sequencing Depth</strong>: Align sequencing depth to study objectives (e.g., higher depth for low-abundance transcripts).</li>
<li><strong>Paired-End vs. Single-End</strong>: Paired-end sequencing provides more detailed information on transcript structure.</li>
</ul><p>Once sequencing is complete, raw data is provided in FASTQ format, which stores the sequence reads together with their per-base quality scores.</p><h4>2. <strong>Quality Control and Preprocessing</strong></h4><p>Quality control (QC) checks that the raw reads are fit for analysis. Tools such as <strong>FastQC</strong> evaluate metrics like base quality, GC content, and adapter contamination.</p><p><strong>Preprocessing Steps</strong>:</p><ul>
<li><strong>Trimming</strong>: Tools like <strong>Trimmomatic</strong> or <strong>Cutadapt</strong> remove low-quality bases and adapter sequences.</li>
<li><strong>Filtering</strong>: Discard reads below a certain quality threshold or length.</li>
</ul><h4>3. <strong>Read Alignment</strong></h4><p>Reads are mapped to a reference genome or transcriptome to determine their origin. Alignment tools include:</p><ul>
<li><strong>HISAT2</strong>: Handles large genomes efficiently and supports spliced alignments.</li>
<li><strong>STAR</strong>: High-speed aligner optimized for RNA-Seq.</li>
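<li><strong>Bowtie2</strong>: A short-read aligner that is not splice-aware, so for RNA-Seq it is better suited to alignment against a transcriptome than against a genome.</li>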
</ul><p><strong>Output</strong>: A SAM/BAM file containing aligned reads.</p><h4>4. <strong>Transcript Assembly and Quantification</strong></h4><p>This step involves identifying transcripts and quantifying their expression levels. Tools used include:</p><ul>
<li><strong>StringTie</strong>: Assembles and quantifies transcripts from aligned reads.</li>
<li><strong>Salmon/Kallisto</strong>: Quantify transcripts directly from reads using pseudo-alignment (Kallisto) or selective alignment (Salmon), which is much faster than full alignment to a genome.</li>
</ul><p>Expression levels are typically measured as TPM (transcripts per million) or FPKM (fragments per kilobase of transcript per million mapped reads).</p><h4>5. <strong>Differential Expression Analysis</strong></h4><p>To identify genes with altered expression between conditions, bioinformaticians use tools such as:</p><ul>
<li><strong>DESeq2</strong>: Normalizes for library size and models count variability with a negative binomial distribution.</li>
<li><strong>edgeR</strong>: Handles overdispersed count data efficiently.</li>
<li><strong>Limma-voom</strong>: Converts counts to log-CPM values with precision weights (voom) so that limma's linear models can be applied to RNA-Seq data.</li>
</ul><p>The output includes a list of differentially expressed genes (DEGs) with statistical significance and fold-change values.</p><h4>6. <strong>Functional Annotation and Pathway Analysis</strong></h4><p>Understanding the biological significance of DEGs involves:</p><ul>
<li><strong>Gene Ontology (GO) Analysis</strong>: Tools like <strong>DAVID</strong> or <strong>clusterProfiler</strong> test DEGs for over-represented GO terms, grouping genes by biological function.</li>
<li><strong>Pathway Enrichment Analysis</strong>: Identifies pathways enriched in DEGs using pathway databases such as <strong>KEGG</strong> and <strong>Reactome</strong>, or ranking-based methods such as <strong>GSEA</strong>.</li>
</ul><h4>7. <strong>Visualization</strong></h4><p>Visualizing results enhances interpretability. Common visualizations include:</p><ul>
<li><strong>Heatmaps</strong>: Show expression patterns across samples (e.g., <strong>pheatmap</strong>).</li>
<li><strong>Volcano Plots</strong>: Highlight significant DEGs (e.g., <strong>ggplot2</strong>).</li>
<li><strong>PCA/UMAP</strong>: Assess sample clustering and variability (e.g., <strong>DESeq2</strong>'s plotPCA for bulk data, or <strong>Seurat</strong> for single-cell data).</li>
</ul><h3>Challenges in RNA-Seq Analysis</h3><ol>
<li><strong>Batch Effects</strong>: Technical variability can confound biological signals. Combat this with normalization techniques or batch-correction tools like <strong>ComBat</strong>.</li>
<li><strong>Low-Quality Samples</strong>: Poor-quality RNA impacts downstream analyses.</li>
<li><strong>Computational Complexity</strong>: RNA-Seq generates massive datasets, requiring robust computing resources and optimized pipelines.</li>
</ol><h3>Key Tools and Resources</h3><ul>
<li><strong>Bioconductor</strong>: A treasure trove of R packages for RNA-Seq analysis.</li>
<li><strong>Galaxy</strong>: A web-based platform for running RNA-Seq workflows.</li>
<li><strong>Nextflow/Snakemake</strong>: Workflow management tools to streamline analyses.</li>
</ul><h3>Applications of RNA-Seq</h3><p>RNA-Seq is used in diverse research areas, including:</p><ul>
<li><strong>Cancer Transcriptomics</strong>: Identifying tumor-specific expression profiles.</li>
<li><strong>Developmental Biology</strong>: Studying dynamic transcriptome changes.</li>
<li><strong>Drug Discovery</strong>: Screening genes modulated by therapeutic compounds.</li>
</ul><h3>Conclusion</h3><p>RNA-Seq analysis is a cornerstone of modern transcriptomics, offering bioinformaticians a versatile toolkit for unraveling gene expression and regulation. Mastering RNA-Seq workflows and tools empowers researchers to transform raw sequencing data into biological discoveries.</p><p>Whether you&rsquo;re investigating disease mechanisms, exploring cellular pathways, or developing new therapeutics, RNA-Seq is a powerful ally in your bioinformatics arsenal.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35437/dupradar-package</guid>
	<pubDate>Sun, 04 Feb 2018 14:28:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35437/dupradar-package</link>
	<title><![CDATA[dupRadar package]]></title>
	<description><![CDATA[<p><span>The&nbsp;</span><em>dupRadar</em><span>&nbsp;package gives insight into the duplication problem by plotting each gene's duplication rate against its expression level, so that failed experiments can be identified at a glance.</span></p><p>Address of the bookmark: <a href="https://bioconductor.org/packages/3.7/bioc/vignettes/dupRadar/inst/doc/dupRadar.html" rel="nofollow">https://bioconductor.org/packages/3.7/bioc/vignettes/dupRadar/inst/doc/dupRadar.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35918/scubat-scaffolding-contigs-using-blat-and-transcripts</guid>
	<pubDate>Tue, 13 Mar 2018 06:52:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35918/scubat-scaffolding-contigs-using-blat-and-transcripts</link>
	<title><![CDATA[SCUBAT: Scaffolding Contigs Using Blat And Transcripts]]></title>
	<description><![CDATA[<p><span>SCUBAT (Scaffolding Contigs Using BLAT And Transcripts) uses any set of transcripts to identify cases where a transcript is split over multiple genome fragments and attempts to use this information to scaffold the genome.</span></p><p>Address of the bookmark: <a href="https://github.com/elswob/SCUBAT" rel="nofollow">https://github.com/elswob/SCUBAT</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34292/automatic-filtering-trimming-error-removing-and-quality-control-for-fastq-data</guid>
	<pubDate>Mon, 13 Nov 2017 05:10:23 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34292/automatic-filtering-trimming-error-removing-and-quality-control-for-fastq-data</link>
	<title><![CDATA[Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data]]></title>
	<description><![CDATA[<p><span>Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data</span><br><code>AfterQC</code><span>&nbsp;goes through all fastq files in a folder and outputs three folders:&nbsp;</span><span>good</span><span>,&nbsp;</span><span>bad</span><span>&nbsp;and&nbsp;</span><span>QC</span><span>, which contain the good reads, the bad reads and the QC results of each fastq file/pair.</span><br><span>Currently it supports processing data from HiSeq 2000/2500/3000/4000, NextSeq 500/550, MiniSeq and other&nbsp;</span><a href="http://support.illumina.com/help/SequencingAnalysisWorkflow/Content/Vault/Informatics/Sequencing_Analysis/CASAVA/swSEQ_mCA_FASTQFiles.htm">Illumina 1.8 or newer formats</a>.</p><p>Address of the bookmark: <a href="https://github.com/OpenGene/AfterQC" rel="nofollow">https://github.com/OpenGene/AfterQC</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34504/minion-gc-an-r-script-to-do-some-qc-on-minion-data</guid>
	<pubDate>Sun, 03 Dec 2017 15:19:18 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34504/minion-gc-an-r-script-to-do-some-qc-on-minion-data</link>
	<title><![CDATA[MinION_QC: An R script to do some QC on MinION data]]></title>
	<description><![CDATA[<p><span>Other tools focus on getting data out of the fastq or fast5 files, which is slow and computationally intensive. The benefit of this script's approach is that it works on a single, small .txt summary file, so it is a lot quicker than most other tools: it takes about a minute to analyse a 4GB flowcell on my laptop.</span></p>
<p>Address of the bookmark: <a href="https://github.com/roblanf/minion_qc" rel="nofollow">https://github.com/roblanf/minion_qc</a></p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36518/mix-combining-multiple-assemblies-from-ngs-data</guid>
	<pubDate>Tue, 08 May 2018 04:58:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36518/mix-combining-multiple-assemblies-from-ngs-data</link>
	<title><![CDATA[MIX: Combining multiple assemblies from NGS data]]></title>
	<description><![CDATA[<p>Mix is a tool that combines two or more draft assemblies without relying on a reference genome, with the goal of reducing contig fragmentation and thus speeding up genome finishing. The algorithm builds an extension graph in which vertices represent the extremities of contigs and edges represent alignments between those extremities. These alignment edges are used for contig extension. The output assembly corresponds to a path in the extension graph that maximizes the cumulative contig length.</p>
<p>The Mix algorithm, approach and results were published in BMC Bioinformatics:&nbsp;<a href="http://www.biomedcentral.com/1471-2105/14/S15/S16">http://www.biomedcentral.com/1471-2105/14/S15/S16</a>.</p><p>Address of the bookmark: <a href="https://github.com/cbib/MIX" rel="nofollow">https://github.com/cbib/MIX</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37498/nextsv-a-meta-caller-for-structural-variants-from-low-coverage-long-read-sequencing-data</guid>
	<pubDate>Mon, 06 Aug 2018 17:24:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37498/nextsv-a-meta-caller-for-structural-variants-from-low-coverage-long-read-sequencing-data</link>
	<title><![CDATA[NextSV: a meta-caller for structural variants from low-coverage long-read sequencing data]]></title>
	<description><![CDATA[<p>NextSV is a meta SV caller and computational pipeline for calling structural variants (SVs) from low-coverage long-read sequencing data. NextSV integrates three aligners and three SV callers and generates two integrated call sets (sensitive and stringent) for different analysis purposes. The output of NextSV is in ANNOVAR-compatible BED format, so users can easily perform downstream annotation using ANNOVAR and disease gene discovery using Phenolyzer.</p>
<p>Address of the bookmark: <a href="https://github.com/Nextomics/NextSV" rel="nofollow">https://github.com/Nextomics/NextSV</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38172/bamview-a-free-interactive-display-of-read-alignments-in-bam-data-files</guid>
	<pubDate>Fri, 09 Nov 2018 13:43:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38172/bamview-a-free-interactive-display-of-read-alignments-in-bam-data-files</link>
	<title><![CDATA[BamView: a free interactive display of read alignments in BAM data files]]></title>
	<description><![CDATA[<p>To run the application on UNIX from the downloaded jar file, run:</p>
<p><tt>java -mx512m -jar BamView.jar</tt></p>
<p>Extra command-line options are listed when '-h' is used:</p>
<p><tt>java -jar BamView.jar -h</tt></p>
<p>BAM files can be specified on the command line with the '-a' option:</p>
<p><tt>java -mx512m -jar BamView.jar -a pathToFile/sorted.bam</tt></p>
<p>If a BAM filename is not given on the command line, BamView will prompt for a file to be entered. The BAM index file should have the same name as the BAM file but with a '.bai' suffix. Multiple BAM files can be loaded and overlaid in the viewer. To make this easier, BamView will read in files that contain a list of filenames.</p>
<p>Address of the bookmark: <a href="http://bamview.sourceforge.net/" rel="nofollow">http://bamview.sourceforge.net/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38598/zenbu-a-collaborative-omics-data-integration-and-interactive-visualization-system</guid>
	<pubDate>Fri, 04 Jan 2019 13:35:26 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38598/zenbu-a-collaborative-omics-data-integration-and-interactive-visualization-system</link>
	<title><![CDATA[ZENBU: a collaborative, omics data integration and interactive visualization system]]></title>
	<description><![CDATA[<p><span>ZENBU</span><span>&nbsp;</span><span>is a data integration, data analysis, and visualization system enhanced for RNAseq, ChipSeq, CAGE and other types of next-generation-sequence-tag (NGS) based data. ZENBU allows for novel data exploration through "on-demand" data processing and interactive linked visualizations, and can generate many views from the same primary sequence alignment data, which users can upload from BAM, BED, GFF and tab-text files.&nbsp;<br>Please check our&nbsp;<a href="http://fantom.gsc.riken.jp/zenbu/wiki">documentation wiki</a>&nbsp;for details on how to use the system.</span></p><p>Address of the bookmark: <a href="http://fantom.gsc.riken.jp/zenbu/" rel="nofollow">http://fantom.gsc.riken.jp/zenbu/</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40525/heatmaply-popular-graphical-method-for-visualizing-high-dimensional-data</guid>
	<pubDate>Sat, 11 Jan 2020 07:34:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40525/heatmaply-popular-graphical-method-for-visualizing-high-dimensional-data</link>
	<title><![CDATA[heatmaply: popular graphical method for visualizing high-dimensional data]]></title>
	<description><![CDATA[<p>heatmaply is based on the ggplot2 and plotly.js engines. It produces heatmaps similar to those of d3heatmap, with the advantage of speed (plotly.js can handle larger matrices) and the ability to zoom from the dendrogram.</p>
<p>heatmaply also provides an interface based around the&nbsp;<a href="https://cran.r-project.org/package=plotly">plotly R package</a>. This interface can be used by choosing&nbsp;<code>plot_method = "plotly"</code>&nbsp;instead of the default&nbsp;<code>plot_method = "ggplot"</code>. This interface can provide smaller objects and faster rendering to disk in many cases and provides otherwise almost identical features.</p>
<p>Documentation for this package is also available as a&nbsp;<a href="https://cran.r-project.org/package=pkgdown">pkgdown</a>&nbsp;site:&nbsp;<a href="http://talgalili.github.io/heatmaply/">http://talgalili.github.io/heatmaply/</a></p><p>Address of the bookmark: <a href="http://talgalili.github.io/heatmaply/articles/heatmaply.html" rel="nofollow">http://talgalili.github.io/heatmaply/articles/heatmaply.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>