<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44724?</link>
	<atom:link href="https://bioinformaticsonline.com/related/44724?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29004/r-chie</guid>
	<pubDate>Thu, 01 Sep 2016 11:47:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29004/r-chie</link>
	<title><![CDATA[R-chie]]></title>
	<description><![CDATA[<p><strong>R-chie</strong><span>&nbsp;allows you to make arc diagrams of RNA secondary structures, allowing for easy comparison and overlap of two structures, rank and display basepairs in colour and to also visualize corresponding multiple sequence alignments and co-variation information.</span><br><strong>R4RNA</strong><span>&nbsp;is the R package powering R-chie, available for&nbsp;</span><a href="http://www.e-rna.org/r-chie/download.cgi">download</a><span>&nbsp;and local use for more customized figures and scripting.</span></p>
<p>http://www.e-rna.org/r-chie/plot.cgi?eg=single</p><p>Address of the bookmark: <a href="http://www.e-rna.org/r-chie/plot.cgi?eg=single" rel="nofollow">http://www.e-rna.org/r-chie/plot.cgi?eg=single</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29487/shinyheatmap</guid>
	<pubDate>Fri, 21 Oct 2016 05:12:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29487/shinyheatmap</link>
	<title><![CDATA[Shinyheatmap]]></title>
	<description><![CDATA[<p><span>Background: Transcriptomics, metabolomics, metagenomics, and other various next-generation sequencing (-omics) fields are known for their production of large datasets. Visualizing such big data has posed technical challenges in biology, both in terms of available computational resources as well as programming acumen. Since heatmaps are used to depict high-dimensional numerical data as a colored grid of cells, efficiency and speed have often proven to be critical considerations in the process of successfully converting data into graphics. For example, rendering interactive heatmaps from large input datasets (e.g., 100k+ rows) has been computationally infeasible on both desktop computers and web browsers. In addition to memory requirements, programming skills and knowledge have frequently been barriers-to-entry for creating highly customizable heatmaps. Results: We propose shinyheatmap: an advanced user-friendly heatmap software suite capable of efficiently creating highly customizable static and interactive biological heatmaps in a web browser. shinyheatmap is a low memory footprint program, making it particularly well-suited for the interactive visualization of extremely large datasets that cannot typically be computed in-memory due to size restrictions. Conclusions: shinyheatmap is hosted online as a freely available web server with an intuitive graphical user interface: http://shinyheatmap.com. The methods are implemented in R, and are available as part of the shinyheatmap project at: https://github.com/Bohdan-Khomtchouk/shinyheatmap.</span></p>
<p><span>More at&nbsp;http://biorxiv.org/content/early/2016/09/21/076463&nbsp;</span></p><p>Address of the bookmark: <a href="http://shinyheatmap.com/" rel="nofollow">http://shinyheatmap.com/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40948/bio7-an-integrated-development-environment-for-ecological-modeling-scientific-image-analysis-and-statistical-analysis</guid>
	<pubDate>Fri, 07 Feb 2020 23:32:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40948/bio7-an-integrated-development-environment-for-ecological-modeling-scientific-image-analysis-and-statistical-analysis</link>
	<title><![CDATA[Bio7: an integrated development environment for ecological modeling, scientific image analysis and statistical analysis]]></title>
	<description><![CDATA[<p><span>The application Bio7 is an integrated development environment for ecological modeling, scientific image analysis and statistical analysis. The application itself is based on an RCP-Eclipse-Environment (Rich-Client-Platform) which offers a huge flexibility in configuration and extensibility because of its plug-in structure and the possibility of customization.</span></p>
<p><a href="https://bio7.org/about/">https://bio7.org/about/</a></p><p>Address of the bookmark: <a href="https://bio7.org/home-2/" rel="nofollow">https://bio7.org/home-2/</a></p>]]></description>
	<dc:creator>Nidhi Rajput</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a critical process in bioinformatics, enabling the reconstruction of an organism's genome from short DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide will walk you through the steps of genome assembly using state-of-the-art tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together short DNA sequence reads generated by sequencing platforms (e.g., Illumina, PacBio, Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, accurate reads ideal for scaffolding.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use tools like <strong>FastQC</strong> or <strong>MultiQC</strong> to assess the quality of your reads:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq multiqc . </code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq trimmed_R1.fastq trimmed_R2.fastq \ ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 </code></div>
</div>
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, genome size, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
</li>
<li>
<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by identifying conserved genes:</p>
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>Opera-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>Maker</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/17176/arvados</guid>
	<pubDate>Sat, 20 Sep 2014 16:54:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/17176/arvados</link>
	<title><![CDATA[Arvados]]></title>
	<description><![CDATA[<p>Arvados is a free and open&nbsp;source bioinformatics&nbsp;platform for genomic and&nbsp;biomedical data. User can&nbsp;Store | Organize | Compute | Share the data for free.&nbsp;</p>
<p><img src="https://arvados.org/images/dax.png" width="400" height="535" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://arvados.org/" rel="nofollow">https://arvados.org/</a></p>]]></description>
	<dc:creator>Martin Jones</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36730/bprna-large-scale-automated-annotation-and-analysis-of-rna-secondary-structure</guid>
	<pubDate>Wed, 23 May 2018 03:24:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36730/bprna-large-scale-automated-annotation-and-analysis-of-rna-secondary-structure</link>
	<title><![CDATA[bpRNA: large-scale automated annotation and analysis of RNA secondary structure]]></title>
	<description><![CDATA[<p>bpRNA, a novel annotation tool capable of parsing RNA structures, including complex pseudoknot-containing RNAs, to yield an objective, precise, compact, unambiguous, easily-interpretable description of all loops, stems, and pseudoknots, along with the positions, sequence, and flanking base pairs of each such structural feature.</p>
<p>The bpRNA code is written in perl and requires the Graph perl module. Several additional scripts for analysis are included. The source code is available at http://github.com/hendrixlab/bpRNA.</p><p>Address of the bookmark: <a href="http://github.com/hendrixlab/bpRNA" rel="nofollow">http://github.com/hendrixlab/bpRNA</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44705/pirna-and-bioinformatics-decoding-the-guardians-of-the-genome</guid>
	<pubDate>Sat, 07 Dec 2024 02:15:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44705/pirna-and-bioinformatics-decoding-the-guardians-of-the-genome</link>
	<title><![CDATA[piRNA and Bioinformatics: Decoding the Guardians of the Genome]]></title>
	<description><![CDATA[<p>In the symphony of small RNAs, PIWI-interacting RNAs (piRNAs) stand out as the protectors of genomic integrity. These small, non-coding RNAs play critical roles in silencing transposable elements, regulating gene expression, and maintaining germline stability. The rise of bioinformatics has revolutionized our understanding of piRNAs, enabling researchers to decipher their biogenesis, functions, and evolutionary significance.</p><h3>What Are piRNAs?</h3><p>piRNAs are the largest class of small non-coding RNAs, typically 24&ndash;32 nucleotides in length. Unlike microRNAs (miRNAs) and small interfering RNAs (siRNAs), piRNAs do not rely on Dicer enzymes for maturation. Instead, they are processed from long single-stranded precursors and associate with PIWI proteins, a subclass of the Argonaute protein family.</p><p>The primary functions of piRNAs include:</p><ol>
<li><strong>Silencing Transposable Elements</strong>: By targeting transposons, piRNAs prevent genomic instability, particularly in germline cells.</li>
<li><strong>Regulating Gene Expression</strong>: piRNAs modulate gene expression at transcriptional and post-transcriptional levels.</li>
<li><strong>Epigenetic Modulation</strong>: They guide epigenetic modifications, such as DNA methylation, to specific genomic loci.</li>
</ol><h3>Challenges in piRNA Research</h3><p>Studying piRNAs is fraught with challenges, including:</p><ul>
<li><strong>Short Length</strong>: Their small size complicates sequencing and alignment.</li>
<li><strong>Lack of Sequence Conservation</strong>: Unlike miRNAs, piRNAs exhibit limited sequence conservation across species.</li>
<li><strong>Complex Biogenesis</strong>: The intricate pathways of piRNA generation require sophisticated computational tools to unravel.</li>
</ul><h3>Bioinformatics: Illuminating the World of piRNAs</h3><p>Bioinformatics has emerged as an indispensable tool for studying piRNAs, facilitating their discovery, annotation, and functional analysis. Here's how bioinformatics is transforming piRNA research:</p><h4>1. <strong>Identification and Annotation</strong></h4><p>The discovery of piRNAs relies on next-generation sequencing (NGS) data. Bioinformatics tools such as <em>piRNApredictor</em> and <em>Piano</em> identify piRNA clusters and predict potential targets. Databases like piRBase and piRNAdb curate information about known piRNAs, their sequences, and associated proteins.</p><h4>2. <strong>Mapping and Alignment</strong></h4><p>piRNAs often originate from repetitive regions, making their alignment challenging. Tools like Bowtie and STAR handle the unique mapping requirements of piRNAs, enabling accurate identification of piRNA clusters in genomes.</p><h4>3. <strong>Functional Analysis</strong></h4><p>Bioinformatics approaches predict piRNA functions by analyzing their interactions with transposons, genes, and epigenetic marks. Algorithms such as TargetFinder and RIblast explore piRNA-mRNA interactions, shedding light on regulatory networks.</p><h4>4. <strong>Evolutionary Studies</strong></h4><p>piRNAs are evolutionarily diverse, reflecting their roles in species-specific genomic defense. Comparative genomics tools help trace the evolution of piRNA clusters and their associated PIWI proteins across species.</p><h4>5. <strong>Epigenomic Insights</strong></h4><p>piRNAs are key players in epigenetic regulation. Bioinformatics pipelines integrate piRNA data with chromatin immunoprecipitation sequencing (ChIP-seq) and DNA methylation data to uncover their role in shaping the epigenome.</p><h3>Case Study: piRNAs in Germline Integrity</h3><p>One of the hallmark functions of piRNAs is the suppression of transposable elements in the germline. For example, in <em>Drosophila melanogaster</em>, piRNAs target retrotransposons like <em>gypsy</em> and <em>copia</em>. Bioinformatics analyses revealed that these piRNAs guide PIWI proteins to transposon-derived RNA, ensuring genome stability during gametogenesis.</p><h3>Clinical Relevance of piRNAs</h3><p>Recent studies suggest that piRNAs may serve as biomarkers for diseases such as cancer, infertility, and neurodegenerative disorders. For instance:</p><ul>
<li><strong>Cancer</strong>: Dysregulated piRNA expression has been linked to tumorigenesis, making them potential targets for cancer therapies.</li>
<li><strong>Infertility</strong>: Aberrant piRNA pathways are implicated in male infertility due to their role in spermatogenesis.</li>
<li><strong>Neurodegeneration</strong>: piRNAs may regulate neuronal gene expression, highlighting their potential in neurological research.</li>
</ul><h3>Future Directions</h3><p>The integration of bioinformatics with emerging technologies offers exciting opportunities for piRNA research:</p><ul>
<li><strong>Single-Cell Sequencing</strong>: Unveiling cell-specific piRNA expression and function.</li>
<li><strong>Machine Learning</strong>: Predicting piRNA functions and targets with greater accuracy.</li>
<li><strong>CRISPR-Based Tools</strong>: Editing piRNA clusters to explore their roles in vivo.</li>
</ul><h3>Conclusion</h3><p>piRNAs are the unsung guardians of the genome, safeguarding genetic material from transposable elements and contributing to gene regulation and epigenetic programming. Bioinformatics has opened the floodgates of discovery, unraveling the complexities of piRNAs and their myriad roles in biology and disease.</p><p>As we continue to decode the piRNA landscape, these small RNAs promise to unveil big secrets about genome stability, evolution, and human health, cementing their place as a fascinating frontier in molecular biology.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44626/meta-transcriptomics-dynamic-world-of-rna-in-diverse-environments</guid>
	<pubDate>Wed, 31 Jul 2024 02:40:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44626/meta-transcriptomics-dynamic-world-of-rna-in-diverse-environments</link>
	<title><![CDATA[Meta-Transcriptomics: Dynamic World of RNA in Diverse Environments]]></title>
	<description><![CDATA[<p>Meta-transcriptomics combines high-throughput sequencing technologies with computational biology to profile the RNA content of a sample. This technique allows researchers to capture a snapshot of gene expression and metabolic activities across diverse microbial communities, such as those found in soil, water, and the human gut.</p><p><strong>Key Components</strong></p><ol>
<li>
<p><strong>Sample Collection</strong>: Meta-transcriptomics begins with the collection of environmental samples. These samples are often complex, containing a wide range of microorganisms.</p>
</li>
<li>
<p><strong>RNA Extraction</strong>: RNA is extracted from the sample, which includes mRNA, rRNA, tRNA, and other non-coding RNAs. This step is crucial as it determines the quality and representativeness of the data.</p>
</li>
<li>
<p><strong>Sequencing</strong>: High-throughput RNA sequencing (RNA-seq) technologies are used to obtain sequences of the RNA transcripts. This step provides a vast amount of data on the RNA molecules present in the sample.</p>
</li>
<li>
<p><strong>Data Analysis</strong>: Computational tools and bioinformatics methods are employed to process and analyze the sequencing data. This involves mapping RNA sequences to reference genomes or transcriptomes, identifying expressed genes, and quantifying their abundance.</p>
</li>
<li>
<p><strong>Functional Annotation</strong>: The functional roles of identified transcripts are inferred based on known gene functions, allowing researchers to understand the metabolic and ecological functions of the microbial community.</p>
</li>
</ol><p><strong>Applications</strong></p><ol>
<li>
<p><strong>Environmental Monitoring</strong>: Meta-transcriptomics can be used to monitor the health and functional status of ecosystems. For example, it can help assess the impact of pollution on microbial communities by revealing changes in gene expression related to stress response and degradation processes.</p>
</li>
<li>
<p><strong>Microbiome Research</strong>: In human health, meta-transcriptomics offers insights into the gut microbiome&rsquo;s functional state. It helps in understanding how microbial communities interact with their host, how they respond to dietary changes, and their role in health and disease.</p>
</li>
<li>
<p><strong>Biotechnology</strong>: The technique can aid in the discovery of novel enzymes and bioactive compounds by profiling microbial communities in extreme environments or industrial processes.</p>
</li>
<li>
<p><strong>Disease Pathogenesis</strong>: By analyzing RNA profiles from disease-associated environments, researchers can uncover pathogen-host interactions and identify potential targets for therapeutic interventions.</p>
</li>
</ol><p><strong>Challenges</strong></p><ol>
<li>
<p><strong>Complexity of Data</strong>: The sheer volume and complexity of data generated by meta-transcriptomics can be overwhelming. Effective data management and advanced computational tools are required to extract meaningful insights.</p>
</li>
<li>
<p><strong>Sampling Bias</strong>: Environmental samples can be heterogeneous, and RNA extraction methods may introduce biases, potentially affecting the accuracy of the results.</p>
</li>
<li>
<p><strong>Reference Databases</strong>: Incomplete or biased reference databases can hinder the accurate functional annotation of transcripts, especially when studying novel or poorly characterized organisms.</p>
</li>
</ol><p><strong>Future Directions</strong></p><p>Meta-transcriptomics is a rapidly evolving field, with ongoing advancements in sequencing technologies and bioinformatics. Future research may focus on improving data integration, developing more comprehensive reference databases, and enhancing our understanding of microbial community dynamics in various environments. As these challenges are addressed, meta-transcriptomics will continue to provide valuable insights into the functional roles of microorganisms and their interactions within ecosystems.</p><p><strong>Conclusion</strong></p><p>Meta-transcriptomics represents a powerful tool for exploring the functional aspects of microbial communities in their natural environments. By capturing a snapshot of gene expression and metabolic activities, this approach offers a deeper understanding of ecological interactions, health implications, and biotechnological potentials. As technology and methodologies advance, meta-transcriptomics is poised to make significant contributions to our knowledge of the microbial world.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/41230/curated-set-of-ribosomal-rna-rrna-reference-sequences-targeted-loci-with-verifiable-organism</guid>
	<pubDate>Sun, 23 Feb 2020 02:17:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/41230/curated-set-of-ribosomal-rna-rrna-reference-sequences-targeted-loci-with-verifiable-organism</link>
	<title><![CDATA[Curated set of ribosomal RNA (rRNA) reference sequences (targeted loci) with verifiable organism]]></title>
	<description><![CDATA[<p>MCBI have a curated set of ribosomal RNA (rRNA) reference sequences (targeted loci) with verifiable organism sources and current names. This set is critical for correctly identifying and classifying prokaryotic (bacteria and archaea) and fungal samples. To provide easy access to these sequences, we recently added a separate rRNA/ITS databases section on the nucleotide BLAST page for these targeted sequences that makes it convenient to quickly identify source organisms. The new databases are: </p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *16S ribosomal RNA (Bacteria and Archaea)</p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *18S ribosomal RNA sequences (SSU) from Fungi type and reference material&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *28S ribosomal RNA sequences (LSU) from Fungi type and reference material</p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *Internal transcribed spacer region (ITS) from Fungi type and reference material</p><p>You can also download these from the BLAST db FTP area.&nbsp; See the <a href="https://go.usa.gov/xdEBX" target="_blank">NCBI Insights post</a> for more detail. </p><p>Useful links</p><p>-----------------</p><p><a href="https://go.usa.gov/xdEj5" target="_blank">BLAST form with rRNA/ITS databases</a></p><p><a href="https://ftp.ncbi.nlm.nih.gov/blast/db/" target="_blank">BLAST db download</a></p><p><a href="https://www.ncbi.nlm.nih.gov/refseq/targetedloci/" target="_blank">Targeted loci</a></p><p><span style="color: black;">If you have any questions or concerns, please contact <a href="mailto:blast-help@ncbi.nlm.nih.gov" target="_blank" title="Follow link">blast-help@ncbi.nlm.nih.gov<sup><span style="color: black; text-decoration: none;"><img src="https://mail.google.com/mail/u/0?ui=2&amp;ik=024a8aa0b9&amp;attid=0.1&amp;permmsgid=msg-f:1659255165855446848&amp;th=1706dbc8408bb740&amp;view=fimg&amp;sz=s0-l75-ft&amp;attbid=ANGjdJ_drW2ArYDNLoHrQh36gm6rp2Std8ZUSplCzP6bYQSQYBsQfZ_85vOujXOdTRdaLxrR7QeEBVUbyACPBJHhFUeIglX8G7Ew7TcclzhvO7fJhiz7sIdkkDgZ7QA&amp;disp=emb" alt="https://jira.ncbi.nlm.nih.gov/images/icons/mail_small.gif" width="13" height="12" style="border: 0px;"></span></sup></a></span></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44713/understanding-rna-seq-normalization-methods-tpm-vs-fpkm-vs-cpm</guid>
	<pubDate>Wed, 11 Dec 2024 00:59:15 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44713/understanding-rna-seq-normalization-methods-tpm-vs-fpkm-vs-cpm</link>
	<title><![CDATA[Understanding RNA-Seq Normalization Methods: TPM vs. FPKM vs. CPM]]></title>
	<description><![CDATA[<p>RNA sequencing (RNA-Seq) is a powerful technology used to study transcriptomes, providing insights into gene expression levels. However, raw RNA-Seq data requires normalization to account for sequencing depth and gene length, enabling accurate comparisons between genes and samples. Among the most widely used normalization methods are TPM (Transcripts Per Million), FPKM (Fragments Per Kilobase Million), and CPM (Counts Per Million). Each method has its unique principles and applications, which we&rsquo;ll explore in this blog.</p><h2>Why Normalize RNA-Seq Data?</h2><p>Normalization is a crucial step in RNA-Seq analysis for the following reasons:</p><ul>
<li>
<p><strong>Sequencing depth:</strong> Different RNA-Seq experiments produce varying numbers of reads, making direct comparisons between samples misleading.</p>
</li>
<li>
<p><strong>Gene length:</strong> Longer genes inherently generate more reads, irrespective of their actual expression level.</p>
</li>
<li>
<p><strong>Bias reduction:</strong> Normalization mitigates technical biases, enabling meaningful biological interpretation.</p>
</li>
</ul><h2>TPM (Transcripts Per Million)</h2><p>TPM measures the proportion of reads mapped to a transcript, normalized by transcript length and sequencing depth. It is calculated as:</p><h3>Key Features:</h3><ol>
<li>
<p><strong>Proportionality:</strong> TPM values sum to 1,000,000 across all transcripts in a sample, making it easier to compare between samples.</p>
</li>
<li>
<p><strong>Intuitive interpretation:</strong> TPM values directly represent the abundance of transcripts in a sample.</p>
</li>
<li>
<p><strong>Preferred for comparisons:</strong> TPM facilitates between-sample comparisons better than FPKM.</p>
</li>
</ol><h2>FPKM (Fragments Per Kilobase Million)</h2><p>FPKM normalizes read counts by transcript length and sequencing depth, but without enforcing proportionality like TPM. It is defined as:</p><h3>Key Features:</h3><ol>
<li>
<p><strong>Historical significance:</strong> FPKM was one of the first normalization methods used for RNA-Seq.</p>
</li>
<li>
<p><strong>Single-end vs. paired-end:</strong> In paired-end sequencing, FPKM becomes RPKM (Reads Per Kilobase Million).</p>
</li>
<li>
<p><strong>Limited utility:</strong> FPKM values are not as robust as TPM for cross-sample comparisons due to lack of proportionality.</p>
</li>
</ol><h2>CPM (Counts Per Million)</h2><p>CPM normalizes raw read counts by sequencing depth, without considering gene length. It is expressed as:</p><h3>Key Features:</h3><ol>
<li>
<p><strong>Simplicity:</strong> CPM is straightforward and computationally less intensive.</p>
</li>
<li>
<p><strong>Application:</strong> Suitable for non-length-dependent analyses, such as comparing total expression levels or differential expression analysis.</p>
</li>
<li>
<p><strong>Gene length agnostic:</strong> CPM does not correct for gene length, making it less ideal for measuring expression levels.</p>
</li>
</ol><h2>When to Use Each Method</h2><ul>
<li>
<p><strong>TPM:</strong> Best for comparing expression levels between samples, especially when transcript length and sequencing depth vary.</p>
</li>
<li>
<p><strong>FPKM:</strong> Useful for historical consistency but generally replaced by TPM.</p>
</li>
<li>
<p><strong>CPM:</strong> Ideal for differential expression analysis when gene length normalization is unnecessary.</p>
</li>
</ul><h2>Conclusion</h2><p>Choosing the right normalization method depends on the specific objectives of your RNA-Seq analysis. TPM&rsquo;s proportionality and robustness make it the preferred choice for most applications, while CPM serves well for differential expression studies. Although FPKM paved the way for RNA-Seq normalization, it has largely been supplanted by TPM in modern workflows. Understanding these methods and their nuances ensures accurate and meaningful interpretations of RNA-Seq data.</p><h3>References:</h3><ol>
<li>
<p>Li, B., &amp; Dewey, C. N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. <em>BMC Bioinformatics.</em></p>
</li>
<li>
<p>Trapnell, C., et al. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. <em>Nature Biotechnology.</em></p>
</li>
<li>
<p>Law, C. W., et al. (2014). voom: precision weights unlock linear model analysis tools for RNA-seq read counts. <em>Genome Biology.</em></p>
</li>
</ol>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>