<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43308?</link>
	<atom:link href="https://bioinformaticsonline.com/related/43308?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44803/basics-of-deseq2-differential-expression-made-simple</guid>
	<pubDate>Wed, 28 May 2025 06:47:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44803/basics-of-deseq2-differential-expression-made-simple</link>
	<title><![CDATA[Basics of DESeq2: Differential Expression Made Simple]]></title>
	<description><![CDATA[<p>DESeq2 is a widely used R package for identifying differentially expressed genes (DEGs) from RNA-seq data. Whether you're comparing treated vs untreated samples, disease vs healthy conditions, or wild-type vs mutant strains, DESeq2 helps you determine statistically which genes are significantly up- or down-regulated.</p>
<p><strong>What Does DESeq2 Do?</strong><br />DESeq2 analyzes count data, that is, the number of sequencing reads that map to each gene. It:</p>
<ul>
<li>Normalizes the data to account for differences in sequencing depth (library size).</li>
<li>Estimates the dispersion (variance) of each gene.</li>
<li>Fits a model to compare groups (e.g., control vs treated).</li>
<li>Calculates log2 fold changes and p-values to determine significance.</li>
</ul>
<p><strong>Installing DESeq2</strong></p>
<p>You can install DESeq2 via Bioconductor in R:</p>
<pre><code>if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("DESeq2")
</code></pre>
<p><strong>Inputs Needed</strong></p>
<ul>
<li>A count matrix: genes as rows, samples as columns (raw counts, not normalized).</li>
<li>A sample metadata table (also called colData): defines the condition/group for each sample.</li>
</ul>
<p>Example:</p>
<pre><code># Count matrix (rows = genes, columns = samples)
counts &lt;- read.csv("counts.csv", row.names = 1)

# Sample metadata: one row per sample, in the same order as the count columns
colData &lt;- data.frame(
  row.names = colnames(counts),
  # "control" is listed first so that it is the reference level
  condition = factor(c("control", "control", "treated", "treated"),
                     levels = c("control", "treated"))
)
</code></pre>
<p><strong>DESeq2 Workflow</strong></p>
<ol>
<li>Load the package
<pre><code>library(DESeq2)
</code></pre>
</li>
<li>Create a DESeqDataSet object
<pre><code>dds &lt;- DESeqDataSetFromMatrix(countData = counts,
                              colData = colData,
                              design = ~ condition)
</code></pre>
</li>
<li>Run the differential expression analysis
<pre><code>dds &lt;- DESeq(dds)
</code></pre>
</li>
<li>Get the results
<pre><code>res &lt;- results(dds)
head(res)
</code></pre>
</li>
</ol>
<p>This gives a table whose key columns are:</p>
<ul>
<li>log2FoldChange: how much expression changed between the groups</li>
<li>pvalue: statistical significance of the change</li>
<li>padj: p-value adjusted for multiple testing (Benjamini-Hochberg, controls the false discovery rate)</li>
</ul>
<p><strong>Visualization (Optional but Powerful)</strong></p>
<p>MA Plot</p>
<pre><code>plotMA(res, ylim = c(-2, 2))
</code></pre>
<p>Volcano Plot (custom)</p>
<pre><code>library(ggplot2)
# ggplot() needs a plain data frame, so convert the DESeqResults object first
res_df &lt;- as.data.frame(res)
# padj can be NA for some genes; treat those as not significant
res_df$significant &lt;- !is.na(res_df$padj) &amp; res_df$padj &lt; 0.05
ggplot(res_df, aes(x = log2FoldChange, y = -log10(padj), color = significant)) +
  geom_point() +
  theme_minimal()
</code></pre>
<p>Heatmap of Top Genes</p>
<pre><code>library(pheatmap)
# Row indices of the 20 genes with the smallest adjusted p-values
topgenes &lt;- head(order(res$padj), 20)
# Variance-stabilizing transformation for visualization
vsd &lt;- vst(dds, blind = FALSE)
pheatmap(assay(vsd)[topgenes, ])
</code></pre>
<p><strong>Tips for Best Results</strong></p>
<ul>
<li>Use raw counts (not normalized or TPM/RPKM values).</li>
<li>Have replicates: DESeq2 relies on variance estimates, so at least 3 per group is ideal.</li>
<li>Watch out for batch effects, and include them in your design if needed (e.g., ~ batch + condition).</li>
</ul>
<p><strong>Summary</strong></p>
<table>
<thead><tr><th>Step</th><th>Purpose</th></tr></thead>
<tbody>
<tr><td>DESeqDataSetFromMatrix()</td><td>Load your data into DESeq2</td></tr>
<tr><td>DESeq()</td><td>Run the differential expression analysis</td></tr>
<tr><td>results()</td><td>Extract the output (log fold change, p-values, etc.)</td></tr>
<tr><td>plotMA() / ggplot2 / pheatmap</td><td>Visualize the results</td></tr>
</tbody>
</table>
<p><strong>Final Thoughts</strong><br />DESeq2 is an essential tool for RNA-seq data analysis. It hides much of the complexity of the statistical modeling while still giving you control when you need it. Whether you're a bioinformatician or a wet-lab biologist, DESeq2 offers both ease of use and analytical power.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43447/rna-seq-workflow-gene-level-exploratory-analysis-and-differential-expression</guid>
	<pubDate>Sat, 09 Oct 2021 07:59:23 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43447/rna-seq-workflow-gene-level-exploratory-analysis-and-differential-expression</link>
	<title><![CDATA[RNA-seq workflow: gene-level exploratory analysis and differential expression]]></title>
	<description><![CDATA[<p><span>Here we walk through an end-to-end gene-level RNA-seq differential expression workflow using Bioconductor packages. We will start from the FASTQ files, show how these were quantified to the reference transcripts, and prepare gene-level count datasets for downstream analysis. We will perform exploratory data analysis (EDA) for quality assessment and to explore the relationship between samples, perform differential gene expression analysis, and visually explore the results.</span></p><p>Address of the bookmark: <a href="http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html" rel="nofollow">http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/14221/bioinformatician-at-work</guid>
	<pubDate>Wed, 20 Aug 2014 22:20:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/14221/bioinformatician-at-work</link>
	<title><![CDATA[Bioinformatician at work !!!]]></title>
	<description><![CDATA[<p>The busy life of a bioinformatician. Many more hands are needed to deal with the additional jobs :)</p>]]></description>
	<dc:creator>Neel</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/14221" length="1439785" type="image/png" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43999/tools-for-differential-expression-analysis</guid>
	<pubDate>Tue, 08 Nov 2022 03:40:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43999/tools-for-differential-expression-analysis</link>
	<title><![CDATA[Tools for Differential expression analysis]]></title>
	<description><![CDATA[<p><span>apeglm</span>&nbsp;-&nbsp;<a href="https://bioconductor.org/packages/release/bioc/html/apeglm.html" target="_blank">https://bioconductor.org/packages/release/bioc/html/apeglm.html</a></p>
<p><span>ashr</span>&nbsp;-&nbsp;<a href="https://github.com/stephens999/ashr" target="_blank">https://github.com/stephens999/ashr</a>,&nbsp;<a href="https://cran.r-project.org/web/packages/ashr/index.html" target="_blank">https://cran.r-project.org/web/packages/ashr/index.html</a></p>
<p><span>consensusDE</span>&nbsp;-&nbsp;<a href="https://bioconductor.org/packages/release/bioc/html/consensusDE.html" target="_blank">https://bioconductor.org/packages/release/bioc/html/consensusDE.html</a></p>
<p><span>DESeq2</span>&nbsp;-&nbsp;<a href="https://bioconductor.org/packages/release/bioc/html/DESeq2.html" target="_blank">https://bioconductor.org/packages/release/bioc/html/DESeq2.html</a></p>
<p><span>edgeR</span>&nbsp;-&nbsp;<a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" target="_blank">https://bioconductor.org/packages/release/bioc/html/edgeR.html</a></p>
<p><span>limma</span>&nbsp;-&nbsp;<a href="https://kasperdanielhansen.github.io/genbioconductor/html/limma.html" target="_blank">https://kasperdanielhansen.github.io/genbioconductor/html/limma.html</a>,&nbsp;<a href="https://bioconductor.org/packages/release/bioc/html/limma.html" target="_blank">https://bioconductor.org/packages/release/bioc/html/limma.html</a></p>
<p><span>MetaCycle</span>&nbsp;-&nbsp;<a href="https://cran.r-project.org/web/packages/MetaCycle/index.html" target="_blank">https://cran.r-project.org/web/packages/MetaCycle/index.html</a>,&nbsp;<a href="https://github.com/gangwug/MetaCycle" target="_blank">https://github.com/gangwug/MetaCycle</a></p>
<p><span>RUVSeq</span>&nbsp;-&nbsp;<a href="https://bioconductor.org/packages/release/bioc/html/RUVSeq.html" target="_blank">https://bioconductor.org/packages/release/bioc/html/RUVSeq.html</a></p>
<p><span>SARTools</span>&nbsp;-&nbsp;<a href="https://github.com/PF2-pasteur-fr/SARTools" target="_blank">https://github.com/PF2-pasteur-fr/SARTools</a></p>
<p><span>tximport</span>&nbsp;-&nbsp;<a href="https://github.com/mikelove/tximport" target="_blank">https://github.com/mikelove/tximport</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38625/croco-a-program-to-detect-potential-cross-contaminations-in-hts-assembled-transcriptomes-using-expression-level-quantification</guid>
	<pubDate>Mon, 07 Jan 2019 18:17:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38625/croco-a-program-to-detect-potential-cross-contaminations-in-hts-assembled-transcriptomes-using-expression-level-quantification</link>
	<title><![CDATA[CroCo: A program to detect potential cross contaminations in HTS assembled transcriptomes using expression level quantification]]></title>
	<description><![CDATA[<p>CroCo is a program that detects cross contamination events in assembled transcriptomes, using sequencing reads to determine the true origin of every transcript.<br>Such cross contaminations can be expected if several RNA-Seq experiments were prepared during the same period in the same lab, or by the same people, or if they were processed or sequenced by the same sequencing facility.<br>Our approach first determines a subset of transcripts that are suspiciously similar across samples using a pairwise BLAST procedure. CroCo then combines all transcriptomes into a metatranscriptome and quantifies the "expression level" of all transcripts successively using the read data of every sample (e.g. several species sequenced by the same lab for a particular study) while allowing read multi-mappings.<br>Several mapping tools implemented in CroCo can be used to estimate expression levels (the default is RapMap).<br>This information is then used to assign each transcript to one of the following five categories:</p>
<p><br>clean: the transcript origin is from the focal sample.</p>
<p>cross contamination: the transcript origin is from an alien sample of the same experiment.</p>
<p>dubious: expression levels are too close between focal and alien samples to determine the true origin of the transcript.</p>
<p>low coverage: expression levels are too low in all samples, thus hampering our procedure (which relies on differential expression) to confidently assign it to any category.</p>
<p>over expressed: expression levels are very high in at least 3 samples, and CroCo will not try to categorize the transcript. Indeed, such a pattern does not match what is expected for cross contamination; it more often reflects highly conserved genes, such as ribosomal genes, or an external contamination shared by several samples (e.g. Escherichia coli contamination).</p><p>Address of the bookmark: <a href="https://gitlab.mbb.univ-montp2.fr/mbb/CroCo" rel="nofollow">https://gitlab.mbb.univ-montp2.fr/mbb/CroCo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43550/basic-structure-of-snakemake-pipeline-run</guid>
	<pubDate>Thu, 14 Oct 2021 07:01:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43550/basic-structure-of-snakemake-pipeline-run</link>
	<title><![CDATA[Basic Structure of Snakemake Pipeline Run !]]></title>
	<description><![CDATA[<div>/user/snakemake-demo$ ls</div><div>config.json data envs scripts slurm-240702.out Snakefile</div><ul>
<li>data = mock data for the snakefile to use</li>
<li>Snakefile = name of the snakemake &ldquo;formula&rdquo; file
<ul>
<li>Note: By default, snakemake looks for a file named&nbsp;<code>Snakefile</code> in the current working directory. To use a different file, pass its path with the&nbsp;<code>-s</code> option:
<ul>
<li><code>snakemake -s snakefile.py</code></li>
</ul>
</li>
</ul>
</li>
<li>envs = directory for storing the conda environments that the workflow will use.</li>
<li>scripts = directory for storing python scripts called by the snakemake formula.</li>
<li>config.json = json format file with extra parameters for our snakemake file to use.</li>
<li>cluster.json = json format file with specification for running on the HPC</li>
<li>samples.txt = file we will use later relating to the config.json file.</li>
</ul><p><span>Run the snakemake file as a dry run (the example workflow shown above).</span></p><ul>
<li>This will build a DAG of the jobs to be run without actually executing them.</li>
<li><code>snakemake --dry-run</code></li>
</ul><p>You can also run only the rules of interest by naming them as targets.</p><ul>
<li><code>snakemake --dry-run all</code>&nbsp;VS.&nbsp;<code>snakemake --dry-run call</code>&nbsp;VS.&nbsp;<code>snakemake --dry-run bwa</code></li>
</ul><p><span>Run the snakemake file in order to produce an image of the DAG of jobs to be run.</span></p><ul>
<li><code>snakemake --dag | dot -Tsvg &gt; dag.svg</code></li>
</ul><p>Run the snakemake (this time not as a dry run)</p><ol>
<li><code>snakemake --use-conda --cores 1</code></li>
</ol>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</guid>
	<pubDate>Wed, 23 Mar 2016 05:53:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26752/rna-seq-de-novo-assembly-using-trinity</link>
	<title><![CDATA[RNA-Seq De novo Assembly Using Trinity]]></title>
	<description><![CDATA[<p>Trinity, developed at the <a href="http://www.broadinstitute.org">Broad Institute</a> and the <a href="http://www.cs.huji.ac.il">Hebrew University of Jerusalem</a>, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules, Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works like so:</p>
<ul>
<li>
<p><em>Inchworm</em> assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.</p>
</li>
<li>
<p><em>Chrysalis</em> groups the Inchworm contigs into clusters and constructs a complete de Bruijn graph for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or set of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.</p>
</li>
<li>
<p><em>Butterfly</em> then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.</p>
</li>
</ul>
<p>More at https://github.com/trinityrnaseq/trinityrnaseq/wiki</p>
<p>Download Trinity <a href="https://github.com/trinityrnaseq/trinityrnaseq/releases">here</a>.</p>
<p>Build Trinity by typing 'make' in the base installation directory.</p>
<p>Assemble RNA-Seq data like so:</p>
<pre><code> Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G 
</code></pre>
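<p>As a minimal sketch (not taken from the Trinity documentation), the command above could be wrapped in a small shell function that checks that both read files exist before an assembly is launched. The function name <code>build_trinity_cmd</code> and its default values are hypothetical; it uses only the Trinity options already shown above and simply prints the command so that it can be inspected first.</p>
<pre><code>#!/usr/bin/env bash
# Hypothetical helper: print a paired-end Trinity command after checking the inputs.
# Usage: build_trinity_cmd LEFT.fq RIGHT.fq [CPU] [MAX_MEMORY]
build_trinity_cmd() {
    # CPU and memory default to the values used in the example above
    local left="$1" right="$2" cpu="${3:-6}" mem="${4:-20G}"
    local f
    for f in "$left" "$right"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f"
            return 1
        fi
    done
    # Only print the command; the caller decides when to run it
    echo "Trinity --seqType fq --left $left --right $right --CPU $cpu --max_memory $mem"
}
</code></pre>
<p>When both files exist, <code>build_trinity_cmd reads_1.fq reads_2.fq</code> prints the same command as above; the printed line can then be run directly or passed to a job scheduler.</p>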
<p>Find assembled transcripts as: 'trinity_out_dir/Trinity.fasta'</p><p>Address of the bookmark: <a href="https://github.com/trinityrnaseq/trinityrnaseq/wiki" rel="nofollow">https://github.com/trinityrnaseq/trinityrnaseq/wiki</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37545/ncbi-magic-blast</guid>
	<pubDate>Tue, 14 Aug 2018 18:11:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37545/ncbi-magic-blast</link>
	<title><![CDATA[NCBI Magic-BLAST]]></title>
	<description><![CDATA[<p>Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score, taking into account simultaneously the two reads of a pair, and in case of RNA-seq, locating the candidate introns and adding up the score of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.</p>
<p>Magic-BLAST incorporates within the NCBI BLAST code framework ideas developed in the NCBI Magic pipeline, in particular hit extensions by local walk and jump&nbsp;<a href="http://www.ncbi.nlm.nih.gov/pubmed/26109056">(http://www.ncbi.nlm.nih.gov/pubmed/26109056)</a>, and recursive clipping of mismatches near the edges of the reads, which avoids accumulating artefactual mismatches near splice sites and is needed to distinguish short indels from substitutions near the edges.</p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://ncbi.github.io/magicblast/" rel="nofollow">https://ncbi.github.io/magicblast/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43384/lncpipea-nextflow-based-pipeline-for-comprehensive-analyses-of-long-non-coding-rnas-from-rna-seq-datasets</guid>
	<pubDate>Fri, 17 Sep 2021 01:57:02 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43384/lncpipea-nextflow-based-pipeline-for-comprehensive-analyses-of-long-non-coding-rnas-from-rna-seq-datasets</link>
	<title><![CDATA[LncPipe: A Nextflow-based pipeline for comprehensive analyses of long non-coding RNAs from RNA-seq datasets]]></title>
	<description><![CDATA[<p><span>The pipeline was developed on top of the popular workflow framework&nbsp;</span><a href="https://github.com/nextflow-io/nextflow">Nextflow</a><span> and is composed of four core procedures: read alignment, assembly, identification and quantification. It offers several distinctive features, such as a well-designed lncRNA annotation strategy, optimized computational efficiency, diversified classification and an interactive analysis report.&nbsp;</span><a href="https://github.com/likelet/LncPipe">LncPipe</a><span>&nbsp;also gives users additional control: they can interrupt the pipeline, reset parameters from the command line, modify the main script directly, and resume the analysis from a previous checkpoint.</span></p>
<p>Ref&nbsp;https://www.lncrnablog.com/lncpipe-a-nextflow-based-pipeline-for-identification-and-analysis-of-long-non-coding-rnas-from-rna-seq-data/</p>
<p><img src="https://ars.els-cdn.com/content/image/1-s2.0-S1673852718301176-gr1.jpg" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://github.com/likelet/LncPipe" rel="nofollow">https://github.com/likelet/LncPipe</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/17843/pathway-analysis</guid>
	<pubDate>Fri, 03 Oct 2014 08:51:13 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/17843/pathway-analysis</link>
	<title><![CDATA[Pathway Analysis]]></title>
	<description><![CDATA[<p>Pathway analysis is usually performed to annotate genes with functional information and to reveal the biological mechanisms in which they take part. It is not limited to identifying which biological pathways a particular set of expressed genes belongs to; it also reveals the relationships between these genes. As more genomics, transcriptomics and proteomics data become available, the interactions between genes involved in multiple pathways become clearer, as do the relationships between genes, their transcripts, and their gene products. However, existing tools and databases are mainly based on a knowledge-driven approach, in which pathways are identified by correlating the&nbsp;<span>information in one of the pathway knowledge databases (KEGG, Reactome, Panther, BioCarta, GO, NCI, WikiPathways, etc.) with gene expression results for a specific condition, for instance tumors, obesity, or cold-resistant crops/plants.</span></p>
<p><span><strong>Introductory Articles/ppt/sources</strong>:</span></p>
<p><a href="http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002375"><span>http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002375</span></a></p>
<p><a href="http://bioinformatics.mdanderson.org/MicroarrayCourse/Lectures09/Pathway%20Analysis.pdf"><span>http://bioinformatics.mdanderson.org/MicroarrayCourse/Lectures09/Pathway%20Analysis.pdf</span></a></p>
<p><a href="http://gettinggeneticsdone.blogspot.de/2012/03/pathway-analysis-for-high-throughput.html"><span>http://gettinggeneticsdone.blogspot.de/2012/03/pathway-analysis-for-high-throughput.html</span></a></p>
<p><a href="http://davetang.org/muse/tag/pathway/"><span>http://davetang.org/muse/tag/pathway/</span></a></p>
<p><a href="https://www.biostars.org/p/42219/"><span>https://www.biostars.org/p/42219/</span></a></p>
<p><a href="http://bioinformatics.ca//files/public/Pathways_2014_Module4_v2.pdf"><span>http://bioinformatics.ca//files/public/Pathways_2014_Module4_v2.pdf</span></a></p>
<p><a href="http://bioinformatics.ca//files/public/Pathways_2014_Module2.pdf"><span>http://bioinformatics.ca//files/public/Pathways_2014_Module2.pdf</span></a></p>
<p><span><strong>Important Databases and Tools</strong>:</span></p>
<p>GeneMANIA, Cytoscape,&nbsp;<a href="http://www.ingenuity.com/products/ipa">IPA</a>&nbsp;and <a href="http://thomsonreuters.com/metacore/">MetaCore</a> (commercial),&nbsp;<span>Pathway Commons, Reactome, Panther, BioCyc, WikiPathways, PathVisio, KEGG, NCI, Stringdb, AmiGO, WebGestalt, ConsensusPathDB, GSEA, Blast2GO</span></p>
<p><span><strong>Popular R based tools</strong>:</span></p>
<p><span>reactome.db, ReactomePA, clusterProfiler, gage, SPIA, topGO, pathview, DOSE, GOStat</span></p>
<p><span><strong>More</strong>:</span></p>
<p><a href="http://www.bioconductor.org/help/search/index.html?q=Enrichment+analysis+"><span>http://www.bioconductor.org/help/search/index.html?q=Enrichment+analysis+</span></a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>