<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/41006?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/related/41006?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44707/rna-seq-analysis-a-guide-for-bioinformaticians</guid>
	<pubDate>Sat, 07 Dec 2024 22:22:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44707/rna-seq-analysis-a-guide-for-bioinformaticians</link>
	<title><![CDATA[RNA-Seq Analysis: A Guide for Bioinformaticians]]></title>
	<description><![CDATA[<p>RNA sequencing (RNA-Seq) has revolutionized transcriptomics, offering unprecedented insights into gene expression, splicing, and transcript diversity. For bioinformaticians, RNA-Seq analysis is a gateway to exploring the complexity of RNA biology and its implications in health and disease. This blog post provides an overview of RNA-Seq analysis, key computational steps, and tools for bioinformaticians eager to delve into this powerful technique.</p><h3>What is RNA-Seq?</h3><p>RNA-Seq is a next-generation sequencing (NGS) technology used to study the transcriptome&mdash;the complete set of RNA molecules in a cell. It quantifies gene expression, detects novel transcripts, and captures alternative splicing events with high sensitivity and resolution.</p><h3>Workflow for RNA-Seq Analysis</h3><p>RNA-Seq analysis involves several stages, each requiring computational tools and expertise.</p><h4>1. <strong>Experimental Design and Data Acquisition</strong></h4><p>Before diving into analysis, bioinformaticians should consider:</p><ul>
<li><strong>Biological Replicates</strong>: Ensure statistical power to detect meaningful differences.</li>
<li><strong>Sequencing Depth</strong>: Match sequencing depth to the study objectives (e.g., greater depth to detect low-abundance transcripts).</li>
<li><strong>Paired-End vs. Single-End</strong>: Paired-end sequencing provides more detailed information on transcript structure.</li>
</ul><p>Once sequencing is complete, raw data is provided in FASTQ format, containing sequence reads and quality scores.</p><h4>2. <strong>Quality Control and Preprocessing</strong></h4><p>Quality control (QC) ensures data integrity. Tools such as <strong>FastQC</strong> evaluate metrics like base quality, GC content, and adapter contamination.</p><p><strong>Preprocessing Steps</strong>:</p><ul>
<li><strong>Trimming</strong>: Tools like <strong>Trimmomatic</strong> or <strong>Cutadapt</strong> remove low-quality bases and adapter sequences.</li>
<li><strong>Filtering</strong>: Discard reads below a certain quality threshold or length.</li>
</ul><h4>3. <strong>Read Alignment</strong></h4><p>Reads are mapped to a reference genome or transcriptome to determine their origin. Alignment tools include:</p><ul>
<li><strong>HISAT2</strong>: Handles large genomes efficiently and supports spliced alignments.</li>
<li><strong>STAR</strong>: High-speed aligner optimized for RNA-Seq.</li>
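<li><strong>Bowtie2</strong>: General-purpose short-read aligner. It is not splice-aware, so for RNA-Seq it is typically used to align reads to a transcriptome rather than a genome.</li>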
</ul><p><strong>Output</strong>: A SAM/BAM file containing aligned reads.</p><h4>4. <strong>Transcript Assembly and Quantification</strong></h4><p>This step involves identifying transcripts and quantifying their expression levels. Tools used include:</p><ul>
<li><strong>StringTie</strong>: Assembles and quantifies transcripts from aligned reads.</li>
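<li><strong>Salmon/Kallisto</strong>: Skip full alignment and quantify transcripts directly from reads against a reference transcriptome (pseudo-alignment in Kallisto, quasi-mapping or selective alignment in Salmon) for rapid and accurate quantification.</li>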
</ul><p>Expression levels are typically measured as TPM (transcripts per million) or FPKM (fragments per kilobase of transcript per million mapped reads).</p><h4>5. <strong>Differential Expression Analysis</strong></h4><p>To identify genes with altered expression between conditions, bioinformaticians use tools such as:</p><ul>
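<li><strong>DESeq2</strong>: Models raw counts with a negative binomial distribution, normalizing for library size with size factors and shrinking per-gene dispersion estimates.</li>
<li><strong>edgeR</strong>: Fits negative binomial models to overdispersed count data efficiently.</li>
<li><strong>Limma-voom</strong>: Transforms counts to log-CPM with precision weights (voom) so that limma's linear models can be applied to RNA-Seq data.</li>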
</ul><p>The output includes a list of differentially expressed genes (DEGs) with statistical significance and fold-change values.</p><h4>6. <strong>Functional Annotation and Pathway Analysis</strong></h4><p>Understanding the biological significance of DEGs involves:</p><ul>
<li><strong>Gene Ontology (GO) Analysis</strong>: Tools like <strong>DAVID</strong> or <strong>clusterProfiler</strong> categorize genes based on their biological functions.</li>
<li><strong>Pathway Enrichment Analysis</strong>: Identifies pathways enriched in DEGs using pathway databases such as <strong>KEGG</strong> or <strong>Reactome</strong>, or rank-based methods such as <strong>GSEA</strong>.</li>
</ul><h4>7. <strong>Visualization</strong></h4><p>Visualizing results enhances interpretability. Common visualizations include:</p><ul>
<li><strong>Heatmaps</strong>: Show expression patterns across samples (e.g., <strong>pheatmap</strong>).</li>
<li><strong>Volcano Plots</strong>: Highlight significant DEGs (e.g., <strong>ggplot2</strong>).</li>
<li><strong>PCA/UMAP</strong>: Assess sample clustering and variability (e.g., <strong>plotPCA</strong> in DESeq2 for bulk data, or <strong>Seurat</strong> for single-cell data).</li>
</ul><h3>Challenges in RNA-Seq Analysis</h3><ol>
<li><strong>Batch Effects</strong>: Technical variability can confound biological signals. Combat this with normalization techniques or batch-correction tools like <strong>ComBat</strong>.</li>
<li><strong>Low-Quality Samples</strong>: Poor-quality RNA impacts downstream analyses.</li>
<li><strong>Computational Complexity</strong>: RNA-Seq generates massive datasets, requiring robust computing resources and optimized pipelines.</li>
</ol><h3>Key Tools and Resources</h3><ul>
<li><strong>Bioconductor</strong>: A treasure trove of R packages for RNA-Seq analysis.</li>
<li><strong>Galaxy</strong>: A web-based platform for running RNA-Seq workflows.</li>
<li><strong>Nextflow/Snakemake</strong>: Workflow management tools to streamline analyses.</li>
</ul><h3>Applications of RNA-Seq</h3><p>RNA-Seq is used in diverse research areas, including:</p><ul>
<li><strong>Cancer Transcriptomics</strong>: Identifying tumor-specific expression profiles.</li>
<li><strong>Developmental Biology</strong>: Studying dynamic transcriptome changes.</li>
<li><strong>Drug Discovery</strong>: Screening genes modulated by therapeutic compounds.</li>
</ul><h3>Conclusion</h3><p>RNA-Seq analysis is a cornerstone of modern transcriptomics, offering bioinformaticians a versatile toolkit for unraveling gene expression and regulation. Mastering RNA-Seq workflows and tools empowers researchers to transform raw sequencing data into biological discoveries.</p><p>Whether you&rsquo;re investigating disease mechanisms, exploring cellular pathways, or developing new therapeutics, RNA-Seq is a powerful ally in your bioinformatics arsenal.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33912/mesquite-a-modular-system-for-evolutionary-analysis</guid>
	<pubDate>Tue, 18 Jul 2017 07:42:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33912/mesquite-a-modular-system-for-evolutionary-analysis</link>
	<title><![CDATA[Mesquite: A modular system for evolutionary analysis]]></title>
	<description><![CDATA[<p><span>Mesquite is modular, extendible software for evolutionary biology, designed to help biologists organize and analyze comparative data about organisms. Its emphasis is on phylogenetic analysis, but some of its modules concern population genetics, while others do non-phylogenetic multivariate analysis. Because it is modular, the analyses available depend on the modules installed.</span></p>
<p><span>http://mesquiteproject.wikispaces.com/</span></p><p>Address of the bookmark: <a href="https://github.com/MesquiteProject/MesquiteCore/releases" rel="nofollow">https://github.com/MesquiteProject/MesquiteCore/releases</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34465/rnaseq-data-analysis-links</guid>
	<pubDate>Mon, 27 Nov 2017 16:28:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34465/rnaseq-data-analysis-links</link>
	<title><![CDATA[RNAseq data analysis links !]]></title>
	<description><![CDATA[<p>RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping.</p><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4728800/" target="_blank">A survey of best practices for RNA-seq data analysis</a></p><p><a href="http://www.bioconductor.org/help/workflows/rnaseqGene/" target="_blank">RNA-seq workflow: gene-level exploratory analysis and DE</a></p><p><a href="https://github.com/crazyhottommy/RNA-seq-analysis" target="_blank">RNAseq analysis notes from Tommy Tang</a></p><p><a href="http://web.stanford.edu/group/wonglab/doc/RNA-seq-talk-JSM2010.pdf" target="_blank">Analysis of RNA-Seq Data</a></p><p><a href="https://f1000research.com/articles/5-1408/v2" target="_blank">RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR</a></p><p><a href="http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html" target="_blank">Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks</a></p><p><a href="https://www.ebi.ac.uk/training/online/course/ebi-next-generation-sequencing-practical-course/rna-sequencing/rna-seq-analysis-transcriptome" target="_blank">EBI RNA-Seq exercise</a></p><p><a href="https://f1000research.com/articles/5-1574/v1" target="_blank">An open RNA-Seq data analysis pipeline tutorial with an example</a></p><p><a href="https://ycl6.gitbooks.io/rna-seq-data-analysis/rna-seq_analysis_workflow.html" target="_blank">RNA-Seq Analysis Workflow</a></p><p><a href="http://www.nature.com/nprot/journal/v11/n9/full/nprot.2016.095.html" target="_blank">Transcript-level expression analysis of RNA-seq experiments</a></p>]]></description>
	<dc:creator>Robert M Willioms</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/34916/bioinformatics-tools-developed-for-oxford-nanopore-data-analysis</guid>
	<pubDate>Wed, 27 Dec 2017 20:47:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/34916/bioinformatics-tools-developed-for-oxford-nanopore-data-analysis</link>
	<title><![CDATA[Bioinformatics tools developed for Oxford Nanopore data analysis !]]></title>
	<description><![CDATA[<p>MinION is the only portable real-time device for DNA and RNA sequencing. Each consumable flow cell can now generate 10&ndash;20 Gb of DNA sequence data. Ultra-long read lengths are possible (hundreds of kb) as you can choose your fragment length. One of the technical advantages of ONT data is the read length, which offers great prospects for genome assembly. Generally, assemblers are based on several different types of algorithms, such as greedy, overlap-layout-consensus (OLC), de Bruijn graph (DBG), and string graph.</p><p>List of analysis tools developed for Oxford Nanopore data</p><p>BWA<br />Fast nanopore data tuned alignment tool<br />https://github.com/lh3/bwa</p><p>GraphMap<br />Mapper for long and error-prone reads<br />https://github.com/isovic/graphmap</p><p>LAST<br />Nanopore tuned alignment tool<br />http://last.cbrc.jp/</p><p>LINKS<br />Software tool for long read scaffolding<br />https://github.com/warrenlr/LINKS/</p><p>marginAlign<br />Tools to align nanopore reads to a reference<br />https://github.com/benedictpaten/marginAlign</p><p>minoTour<br />Real time analysis tools<br />http://minotour.nottingham.ac.uk/</p><p>nanoCORR<br />Error-correction tool for nanopore sequence data<br />https://github.com/jgurtowski/nanocorr</p><p>NanoOK<br />Software for nanopore data, quality and error profiles<br />https://documentation.tgac.ac.uk/display/NANOOK/NanoOK</p><p>Nanopolish<br />Nanopore analysis and genome assembly software<br />https://github.com/jts/nanopolish</p><p>nanopore<br />Variant-detection tool for nanopore sequence data<br />https://github.com/mitenjain/nanopore</p><p>Nanocorrect<br />Error-correction tool for nanopore sequence data<br />https://github.com/jts/nanocorrect/</p><p>npReader<br />Real-time conversion and analysis of nanopore reads<br />https://github.com/mdcao/npReader</p><p>poRe<br />Tool for analyzing and visualizing nanopore data<br />https://sourceforge.net/p/rpore/wiki/Home/</p><p>PoreSeq<br />Error-correction and variant-calling software<br />https://github.com/tszalay/poreseq</p><p>Poretools<br />Nanopore sequence analysis and visualization software<br />https://github.com/arq5x/poretools</p><p>SSPACE-LongRead<br />Genome scaffolding tool<br />http://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE-longread</p><p>SMIS<br />Genome scaffolding tool<br />https://sourceforge.net/projects/phusion2/files/smis/</p><p>List of assemblers for Oxford Nanopore MinION long reads</p><p>LQS<br />DALIGNER, Celera OLC Nanocorrect, <br />Nanopolish corrector<br />https://github.com/jts/nanopolish</p><p>PBcR<br />HGAP or BLASR, Celera OLC<br />PBcR corrector<br />http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR</p><p>Canu<br />MHAP, Celera OLC<br />Canu corrector<br />https://github.com/marbl/canu</p><p>Falcon<br />String graph, Celera OLC<br />Falcon corrector<br />https://github.com/PacificBiosciences/falcon</p><p>Miniasm<br />OLC<br />https://github.com/lh3/miniasm</p><p>ra-integrate<br />OLC<br />https://github.com/mariokostelac/ra-integrate/</p><p>ALLPATHS-LG<br />de Bruijn graph<br />ALLPATHS-LG corrector<br />https://www.broadinstitute.org/software/allpaths-lg/blog/?page_id=12</p><p>SPAdes<br />de Bruijn graph<br />SPAdes corrector<br />http://bioinf.spbau.ru/spades</p>]]></description>
	<dc:creator>biogeek</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35543/genometools-the-versatile-open-source-genome-analysis-software</guid>
	<pubDate>Wed, 07 Feb 2018 10:44:18 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35543/genometools-the-versatile-open-source-genome-analysis-software</link>
	<title><![CDATA[GenomeTools: The versatile open source genome analysis software]]></title>
	<description><![CDATA[<p>The&nbsp;<em>GenomeTools</em>&nbsp;genome analysis system is a&nbsp;<a href="http://genometools.org/license.html">free</a>&nbsp;collection of bioinformatics&nbsp;<a href="http://genometools.org/tools.html">tools</a>&nbsp;(in the realm of genome informatics) combined into a single binary named&nbsp;<em>gt</em>. It is based on a C library named &ldquo;libgenometools&rdquo; which consists of several modules.</p>
<p>If you are interested in gene prediction, have a look at&nbsp;<a href="http://genomethreader.org/" title="GenomeThreader gene prediction        software"><em>GenomeThreader</em></a>.</p><p>Address of the bookmark: <a href="http://genometools.org/" rel="nofollow">http://genometools.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38462/egad-ultra-fast-functional-analysis-of-gene-networks</guid>
	<pubDate>Fri, 14 Dec 2018 04:10:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38462/egad-ultra-fast-functional-analysis-of-gene-networks</link>
	<title><![CDATA[EGAD: Ultra-fast functional analysis of gene networks]]></title>
	<description><![CDATA[<p><span>With the EGAD (Extending &lsquo;Guilt-by-Association&rsquo; by Degree) package, we present a series of highly efficient tools to calculate functional properties in networks based on the guilt-by-association principle. These allow rapid controlled comparisons and analyses. Two of the core features are: a function prediction algorithm which is fully vectorized (neighbor_voting), allowing network characterization across even thousands of functional groups to be accomplished in minutes in cross-validation, and an analytic determination of the optimal prior to guess candidate genes across multiple functional sets (calculate_multifunc, auc_multifunc).</span></p><p>Address of the bookmark: <a href="https://github.com/sarbal/EGAD" rel="nofollow">https://github.com/sarbal/EGAD</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/40882/troyanskaya-lab</guid>
  <pubDate>Tue, 04 Feb 2020 06:40:36 -0600</pubDate>
  <link>https://bioinformaticsonline.com/researchlabs/view/40882/troyanskaya-lab</link>
  <title><![CDATA[Troyanskaya Lab]]></title>
  <description><![CDATA[
<p>The goal of our research is to interpret and distill biological complexity through accurate analysis and modeling of molecular pathways, particularly those in which malfunctions lead to the manifestation of disease. We are inventing methods for systems-level pathway modeling through integrative analysis of genome-scale datasets. We apply these approaches in studying challenging biological problems, such as how pathways function in diverse cell types and how they change dynamically.</p>

<p>https://function.princeton.edu/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41559/dahak-benchmarking-and-containerization-of-tools-for-analysis-of-complex-non-clinical-metagenomes</guid>
	<pubDate>Thu, 09 Apr 2020 04:56:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41559/dahak-benchmarking-and-containerization-of-tools-for-analysis-of-complex-non-clinical-metagenomes</link>
	<title><![CDATA[Dahak: benchmarking and containerization of tools for analysis of complex non-clinical metagenomes.]]></title>
	<description><![CDATA[<p><span>Dahak is a software suite that integrates state-of-the-art open source tools for metagenomic analyses. Tools in the dahak software suite will perform various steps in metagenomic analysis workflows including data pre-processing, metagenome assembly, taxonomic and functional classification, genome binning, and gene assignment. We aim to deliver the analytical framework as a robust and reliable containerized workflow system, which will be free from dependency, installation, and execution problems typically associated with other open-source bioinformatics solutions. This will maximize the transparency, data provenance (i.e., the process of tracing the origins of data and its movement through the workflow), and reproducibility.</span></p>
<p><span>More at&nbsp;<a href="https://dahak-metagenomics.github.io/dahak/">https://dahak-metagenomics.github.io/dahak/</a></span></p><p>Address of the bookmark: <a href="https://github.com/dahak-metagenomics/dahak" rel="nofollow">https://github.com/dahak-metagenomics/dahak</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43766/genometools-the-versatile-open-source-genome-analysis-software</guid>
	<pubDate>Wed, 02 Feb 2022 04:00:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43766/genometools-the-versatile-open-source-genome-analysis-software</link>
	<title><![CDATA[GenomeTools: The versatile open source genome analysis software]]></title>
	<description><![CDATA[<p>The&nbsp;<em>GenomeTools</em>&nbsp;genome analysis system is a&nbsp;<a href="http://genometools.org/license.html">free</a>&nbsp;collection of bioinformatics&nbsp;<a href="http://genometools.org/tools.html">tools</a>&nbsp;(in the realm of genome informatics) combined into a single binary named&nbsp;<em>gt</em>. It is based on a C library named &ldquo;libgenometools&rdquo; which consists of several modules.</p>
<p><img src="http://genometools.org/images/annotation.png" alt="image" style="border: 0px;"></p>
<p>If you are interested in gene prediction, have a look at&nbsp;<a href="http://genomethreader.org/" title="GenomeThreader gene prediction        software"><em>GenomeThreader</em></a>.</p>
<p>http://genometools.org/pub/</p><p>Address of the bookmark: <a href="http://genometools.org/" rel="nofollow">http://genometools.org/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44257/calculate-the-significance-of-the-difference-between-two-trends</guid>
	<pubDate>Tue, 14 Mar 2023 05:41:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44257/calculate-the-significance-of-the-difference-between-two-trends</link>
	<title><![CDATA[Calculate the significance of the difference between two trends]]></title>
	<description><![CDATA[<div><div><div><div><div><div><div><div><div><div><p>To calculate the significance of the difference between two trends, you can use a statistical test such as a t-test or ANOVA (analysis of variance). Here are the general steps to follow:</p><ol>
<li>
<p>Define your null hypothesis (H0) and alternative hypothesis (H1). For example, H0 might be that there is no difference between the two trends, while H1 might be that there is a difference.</p>
</li>
<li>
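<p>Collect data on the two trends. The t-test and ANOVA assume that the observations are independent and approximately normally distributed, and the classic Student's t-test and ANOVA also assume equal variances across groups. If these assumptions do not hold, use Welch's t-test or a non-parametric alternative.</p>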
</li>
<li>
<p>Calculate the means and standard deviations of each trend.</p>
</li>
<li>
<p>Calculate the test statistic using a t-test or ANOVA. The test statistic will depend on the specific test you choose, but it will generally compare the difference in means between the two trends to the variability within each trend.</p>
</li>
<li>
<p>Determine the p-value associated with the test statistic. The p-value represents the probability of obtaining a test statistic at least as extreme as the one you calculated, assuming that the null hypothesis is true.</p>
</li>
<li>
<p>Compare the p-value to your chosen significance level (usually 0.05 or 0.01). If the p-value is less than or equal to the significance level, reject the null hypothesis and conclude that there is a significant difference between the two trends. If the p-value is greater than the significance level, fail to reject the null hypothesis and conclude that there is not enough evidence to support a significant difference.</p>
</li>
</ol><p>It's important to note that the specific details of each step will depend on the type of test you choose and the software you use to perform the analysis.</p><p>The most common methods for comparing means include:</p><table>
<thead>
<tr><th>Method</th><th>R function</th><th>Description</th></tr>
</thead>
<tbody>
<tr>
<td>T-test</td>
<td>t.test()</td>
<td>Compare two groups (parametric)</td>
</tr>
<tr>
<td>Wilcoxon test</td>
<td>wilcox.test()</td>
<td>Compare two groups (non-parametric)</td>
</tr>
<tr>
<td>ANOVA</td>
<td>aov() or anova()</td>
<td>Compare multiple groups (parametric)</td>
</tr>
<tr>
<td>Kruskal-Wallis</td>
<td>kruskal.test()</td>
<td>Compare multiple groups (non-parametric)</td>
</tr>
</tbody>
</table></div></div></div></div></div></div></div></div></div></div>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

</channel>
</rss>