<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43088?offset=420</link>
	<atom:link href="https://bioinformaticsonline.com/related/43088?offset=420" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44371/steps-to-find-all-the-repeats-in-the-genome</guid>
	<pubDate>Thu, 31 Aug 2023 02:43:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44371/steps-to-find-all-the-repeats-in-the-genome</link>
	<title><![CDATA[Steps to find all the repeats in the genome !]]></title>
	<description><![CDATA[<div><p>To find repeats in a genome, for example repeats 2 to 9 bases long, you can run RepeatMasker and then parse its output with a Perl script<a href="https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13" target="_blank">[0]</a>. RepeatMasker's documented options do not include a "-length" filter, so the length restriction has to be applied to the output after the run. Here's a step-by-step guide:</p></div><div><ol>
<li>Install RepeatMasker: First, you need to install RepeatMasker on your system. You can download it from the <a href="https://www.repeatmasker.org/" target="_blank">RepeatMasker website</a>.</li>
</ol></div><div><ol>
<li>Prepare the genome sequence: Make sure you have the genome sequence in a FASTA file format. Let's assume the file is named "genome.fasta".</li>
</ol><blockquote><p>./RepeatMasker -pa &lt;number_of_processors&gt; -norna -no_is -div &lt;divergence_value&gt; -gff -xsmall -poly -species &lt;species_name&gt; -dir &lt;output_directory&gt; genome.fasta</p></blockquote><div><p>Do not add -nolow if you are interested in short repeats, because it tells RepeatMasker to skip low-complexity DNA and simple repeats. To use a custom repeat library instead of a built-in species library, replace -species with -lib &lt;library_file&gt;. To keep only repeats 2 to 9 bases long, filter the rows of the resulting .out file by the distance between their begin and end coordinates. Replace the following placeholders with appropriate values:</p><ul>
<li><code>&lt;number_of_processors&gt;</code>: The number of processors/threads you want to use for parallel processing.</li>
<li><code>&lt;divergence_value&gt;</code>: The maximum divergence (in percent) from the consensus sequence; only repeats less diverged than this value are masked.</li>
<li><code>&lt;species_name&gt;</code>: The name of the species or clade you are analyzing.</li>
<li><code>&lt;output_directory&gt;</code>: The directory where you want the output files to be saved.</li>
</ul></div><div><ol>
<li>Analyze the output: RepeatMasker will generate several output files, including a .out file. You can parse this file to extract the information you need. There is a Perl tool called "one_code_to_find_them_all.pl" that can help you parse RepeatMasker output files<a href="https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13" target="_blank">[0]</a>. You can download it from the source provided.</li>
</ol></div><div><ol>
<li>Use the provided Perl script: Once you have the "one_code_to_find_them_all.pl" script, you can run it to conveniently parse the RepeatMasker output files. Here's an example of how to use it:</li>
</ol><blockquote><p>perl one_code_to_find_them_all.pl --rm &lt;RepeatMasker_out_file&gt; --ltr &lt;dictionary_file&gt; --length &lt;length_file&gt;</p></blockquote></div></div><div><div><p>Replace&nbsp;<code>&lt;RepeatMasker_out_file&gt;</code>&nbsp;with the path to your RepeatMasker .out file,&nbsp;<code>&lt;dictionary_file&gt;</code>&nbsp;with the LTR dictionary file (the tool's published usage lists this --ltr argument next to --rm, and the file is built with the companion build_dictionary.pl script), and&nbsp;<code>&lt;length_file&gt;</code>&nbsp;with the path to a file containing the lengths of the reference elements.</p></div><div><p>This script will generate several output files, including .log.txt and .copynumber.csv, which contain quantitative information about the identified repeat elements.</p></div><div><p>Remember to adjust the parameters and options according to your specific needs and the characteristics of your genome.</p></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44766/genome-simulation-with-slim-and-msprime</guid>
	<pubDate>Fri, 31 Jan 2025 12:47:43 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44766/genome-simulation-with-slim-and-msprime</link>
	<title><![CDATA[Genome Simulation with SLiM and msprime]]></title>
	<description><![CDATA[<p>Genome simulation is an essential tool in population genetics, enabling researchers to model evolutionary processes and study genetic variation. Two widely used simulation tools in this field are <strong>SLiM</strong> and <strong>msprime</strong>. Although they serve different purposes, they can be used together through the <strong>slendr</strong> framework, which makes their outputs directly comparable.</p><h2>Overview of SLiM and msprime</h2><h3>SLiM: Forward Genetic Simulator</h3><p>SLiM is a <strong>free, open-source</strong> tool designed for forward genetic simulations. It allows researchers to model complex evolutionary scenarios, including selection, recombination, and demographic events, making it particularly useful for studying adaptation and selection in populations.</p><p><strong>Key Features of SLiM:</strong></p><ul>
<li>
<p>Simulates population evolution forward in time</p>
</li>
<li>
<p>Supports custom evolutionary models using an embedded scripting language</p>
</li>
<li>
<p>Allows modeling of spatial and ecological dynamics</p>
</li>
<li>
<p>Provides high flexibility and extensibility for user-defined scenarios</p>
</li>
<li>
<p>Available on GitHub as an open-source project</p>
</li>
</ul><h3>msprime: Ancestry and Mutation Simulator</h3><p>msprime is an efficient, <strong>open-source</strong> tool that simulates ancestry and mutations using a coalescent framework. It is known for its high-speed performance and low memory requirements, making it a popular choice for large-scale genomic simulations.</p><p><strong>Key Features of msprime:</strong></p><ul>
<li>
<p>Implements coalescent simulations for ancestry modeling</p>
</li>
<li>
<p>Efficiently simulates large population histories</p>
</li>
<li>
<p>Supports the addition of mutations to genealogies</p>
</li>
<li>
<p>Developed using an open-source community model</p>
</li>
<li>
<p>Often faster and more memory-efficient than alternative simulators</p>
</li>
</ul><h2>Using SLiM and msprime with slendr</h2><p>Both SLiM and msprime can be integrated with <strong>slendr</strong>, a framework that facilitates structured population genetic simulations. This integration allows for seamless comparison of simulation outputs.</p><h3>How They Work Together:</h3><ul>
<li>
<p>SLiM and msprime simulations can be analyzed within slendr.</p>
</li>
<li>
<p>The <strong>ts_read()</strong> function in slendr enables loading and comparing tree sequence outputs from both simulators.</p>
</li>
<li>
<p>This integration allows researchers to validate simulation results and gain deeper insights into evolutionary processes.</p>
</li>
</ul><h2>Performance Considerations</h2><p>While SLiM offers powerful forward simulations with extensive customization, msprime is often preferred for its <strong>speed and memory efficiency</strong> when simulating ancestry and mutations. The choice between the two depends on the research goals:</p><ul>
<li>
<p><strong>For detailed evolutionary modeling with selection and recombination:</strong> Use SLiM.</p>
</li>
<li>
<p><strong>For large-scale coalescent simulations with mutations:</strong> Use msprime.</p>
</li>
<li>
<p><strong>For comparing different simulation models and their outputs:</strong> Use slendr to integrate SLiM and msprime results.</p>
</li>
</ul><h2>Conclusion</h2><p>SLiM and msprime are valuable tools for genome simulation, each serving distinct but complementary purposes in population genetics research. By leveraging the strengths of both simulators with slendr, researchers can conduct robust and efficient evolutionary simulations, enhancing our understanding of genetic diversity and adaptation.</p><p>For more information, check out the official GitHub repositories for <strong>SLiM</strong> and <strong>msprime</strong>, and explore the <strong>slendr</strong> framework for streamlined simulation workflows.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/1124/rolf-backofen-lab</guid>
  <pubDate>Thu, 18 Jul 2013 13:51:23 -0500</pubDate>
  <link>https://bioinformaticsonline.com/researchlabs/view/1124/rolf-backofen-lab</link>
  <title><![CDATA[Rolf Backofen Lab]]></title>
  <description><![CDATA[
<p>The research interests of this group include constraint programming, structure prediction in simplified protein models, investigation of protein energy landscapes, detection of RNA sequence/structure motifs, prediction and evaluation of alternative splice forms, and description and detection of regulatory sequences.</p>

<p>Link @ http://www.bioinf.uni-freiburg.de/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/6720/rna-sequencing-helps-identify-functional-variants-from-gwas</guid>
	<pubDate>Fri, 22 Nov 2013 21:33:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/6720/rna-sequencing-helps-identify-functional-variants-from-gwas</link>
	<title><![CDATA[RNA Sequencing Helps Identify Functional Variants from GWAS]]></title>
	<description><![CDATA[<p><span>For Alzheimer&rsquo;s and other complex disorders, mining the genome for disease-associated variants is no longer the obstacle. The challenge nowadays is figuring out how the identified loci relate to disease. As reported last month in Nature and its associated journals, advances in high-throughput RNA sequencing are providing new tools for understanding how disease loci influence gene expression&mdash;a starting point for understanding their connection to pathogenesis.</span></p><p>Address of the bookmark: <a href="http://schizophreniaforum.org/new/detail.asp?id=1953" rel="nofollow">http://schizophreniaforum.org/new/detail.asp?id=1953</a></p>]]></description>
	<dc:creator>Andaleeb</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/17843/pathway-analysis</guid>
	<pubDate>Fri, 03 Oct 2014 08:51:13 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/17843/pathway-analysis</link>
	<title><![CDATA[Pathway Analysis]]></title>
	<description><![CDATA[<p>Pathway analysis is usually performed to annotate genes with functional information and to reveal the underlying biological mechanisms in which they take part. It is not limited to identifying which biological pathways a particular set of expressed genes belongs to; it also discloses the relationships between these genes. As more genomics, transcriptomics and proteomics data become available, the interactions between genes involved in multiple pathways become clearer, as do the relationships between genes, their transcripts, and their gene products. However, existing tools and databases are mainly based on a knowledge-driven approach, in which pathways are identified by correlating the&nbsp;<span>information in one of the pathway knowledge databases (KEGG, Reactome, Panther, BioCarta, GO, NCI, WikiPathways, etc.) with the gene expression results for a specific condition, for instance tumor, obesity, or cold-resistant crops/plants.</span></p><p><span><strong>Introductory Articles/ppt/sources</strong>:</span></p><p><a href="http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002375"><span>http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002375</span></a></p><p><a href="http://bioinformatics.mdanderson.org/MicroarrayCourse/Lectures09/Pathway%20Analysis.pdf"><span>http://bioinformatics.mdanderson.org/MicroarrayCourse/Lectures09/Pathway%20Analysis.pdf</span></a></p><p><a href="http://gettinggeneticsdone.blogspot.de/2012/03/pathway-analysis-for-high-throughput.html"><span>http://gettinggeneticsdone.blogspot.de/2012/03/pathway-analysis-for-high-throughput.html</span></a></p><p><a href="http://davetang.org/muse/tag/pathway/"><span>http://davetang.org/muse/tag/pathway/</span></a></p><p><a href="https://www.biostars.org/p/42219/"><span>https://www.biostars.org/p/42219/</span></a></p><p><a href="http://bioinformatics.ca//files/public/Pathways_2014_Module4_v2.pdf"><span>http://bioinformatics.ca//files/public/Pathways_2014_Module4_v2.pdf</span></a></p><p><a href="http://bioinformatics.ca//files/public/Pathways_2014_Module2.pdf"><span>http://bioinformatics.ca//files/public/Pathways_2014_Module2.pdf</span></a></p><p><span><strong>Important Databases and Tools</strong>:</span></p><p>GeneMANIA, Cytoscape,&nbsp;<a href="http://www.ingenuity.com/products/ipa">IPA</a>&nbsp;and <a href="http://thomsonreuters.com/metacore/">Metacore</a> (commercial),&nbsp;<span>Pathway Commons, Reactome, Panther, BioCyc, WikiPathways, Pathvisio, KEGG, NCI, Stringdb, Amigo,&nbsp;<span>WebGestalt, <span>ConsensusPathDB, GSEA, Blast2go</span></span></span></p><p><span><strong>Popular R based tools</strong>:</span></p><p><span>Reactome.db, ReactomePA, ClusterProfiler, Gage, SPIA, topGO, Pathview, DOSE, GOStat</span></p><p><span><strong>More</strong>:</span></p><p><a href="http://www.bioconductor.org/help/search/index.html?q=Enrichment+analysis+"><span>http://www.bioconductor.org/help/search/index.html?q=Enrichment+analysis+</span></a></p><p>&nbsp;</p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32631/barrnap-bacterial-ribosomal-rna-predictor</guid>
	<pubDate>Fri, 12 May 2017 09:24:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32631/barrnap-bacterial-ribosomal-rna-predictor</link>
	<title><![CDATA[Barrnap: Bacterial ribosomal RNA predictor]]></title>
	<description><![CDATA[<p>Barrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).</p>
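<p>As a minimal sketch of how the supported groups map onto a command line, the Python helper below assembles (but does not run) a barrnap call. The kingdom codes (bac, arc, euk, mito) and the --kingdom and --threads flags are taken from barrnap's usage text as an assumption; the helper name and the genome.fasta file name are hypothetical.</p>

```python
import shlex

# Kingdom codes as listed in barrnap's usage text (assumption: bac, arc, euk, mito).
KINGDOMS = {
    "bacteria": "bac",
    "archaea": "arc",
    "eukaryotes": "euk",
    "mitochondria": "mito",
}


def barrnap_command(fasta_path, kingdom="bacteria", threads=1):
    """Build (but do not run) a barrnap command line for one FASTA file."""
    if kingdom not in KINGDOMS:
        raise ValueError(f"unknown kingdom: {kingdom!r}")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "barrnap",
        "--kingdom", KINGDOMS[kingdom],
        "--threads", str(threads),
        fasta_path,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(barrnap_command("genome.fasta", kingdom="archaea", threads=4)))
```

<p>The returned list can be passed to subprocess.run once barrnap is installed; check the flags against the installed version first.</p>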
<p>It takes FASTA DNA sequence as input and writes GFF3 as output. It uses the new NHMMER tool that comes with HMMER 3.1 for HMM searching in RNA:DNA style. NHMMER binaries for 64-bit Linux and Mac OS X are included and will be auto-detected. Multithreading is supported and one can expect roughly linear speed-ups with more CPUs.&nbsp;</p><p>Address of the bookmark: <a href="https://github.com/tseemann/barrnap" rel="nofollow">https://github.com/tseemann/barrnap</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41046/iseqqc-a-tool-for-expression-based-quality-control-in-rna-sequencing</guid>
	<pubDate>Sun, 16 Feb 2020 08:47:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41046/iseqqc-a-tool-for-expression-based-quality-control-in-rna-sequencing</link>
	<title><![CDATA[iSeqQC: a tool for expression-based quality control in RNA sequencing]]></title>
	<description><![CDATA[<p><span>iSeqQC is an expression-based QC tool that detects outliers produced either by variable laboratory conditions or by dissimilarity within a phenotypic group. iSeqQC implements various statistical approaches, including unsupervised clustering, agglomerative hierarchical clustering and correlation coefficients, to provide insight into outliers.</span></p>
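<p>To illustrate the correlation-coefficient idea in general terms, the sketch below flags a sample whose median Pearson correlation with the other samples is low. It is not iSeqQC's implementation; the function names, the 0.8 threshold and the toy expression values are all hypothetical.</p>

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def flag_outliers(expression, min_median_corr=0.8):
    """Return samples whose median correlation with all other samples is low.

    The median is used so that a single bad sample does not drag down
    the score of the good samples it is compared against.
    """
    names = list(expression)
    flagged = []
    for name in names:
        corrs = [pearson(expression[name], expression[other])
                 for other in names if other != name]
        if median(corrs) < min_median_corr:
            flagged.append(name)
    return flagged


# Toy expression values (one list of per-gene values per sample); made up for illustration.
samples = {
    "ctrl_1": [10.0, 200.0, 35.0, 80.0, 5.0],
    "ctrl_2": [12.0, 190.0, 30.0, 85.0, 6.0],
    "ctrl_3": [11.0, 210.0, 33.0, 78.0, 4.0],
    "ctrl_4": [150.0, 20.0, 90.0, 10.0, 60.0],
}

if __name__ == "__main__":
    print(flag_outliers(samples))
```

<p>Here ctrl_4 has an expression profile that runs opposite to the other three samples, so it is the one flagged.</p>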
<p><a href="http://cancerwebpa.jefferson.edu/iSeqQC/">http://cancerwebpa.jefferson.edu/iSeqQC/</a></p>
<p><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-3399-8">https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-3399-8</a></p><p>Address of the bookmark: <a href="https://github.com/gkumar09/iSeqQC" rel="nofollow">https://github.com/gkumar09/iSeqQC</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42033/seastar-systematic-evaluation-of-alternative-start-site-in-rna</guid>
	<pubDate>Thu, 13 Aug 2020 09:54:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42033/seastar-systematic-evaluation-of-alternative-start-site-in-rna</link>
	<title><![CDATA[SEASTAR: Systematic Evaluation of Alternative STArt site in RNA]]></title>
	<description><![CDATA[<p>SEASTAR (Systematic Evaluation of Alternative STArt site in RNA) is a software package for Transcription Start Site (TSS) identification and quantification using only RNA-seq data. It assembles novel TSSs based only on RNA-Seq data and merges them with known TSSs from a public database. This package enables high-quality TSS identification that is comparable to the highly sophisticated CAGE technology. This package is particularly useful for finding novel TSSs that contribute to transcriptome complexity along with identifying differential promoter utilization.</p>
<p>Version 1.0.0 updates several descriptions and tests. To obtain v0.9.4, download it from&nbsp;<a href="https://github.com/zhyqin/SEASTAR-0.9.4">https://github.com/zhyqin/SEASTAR-0.9.4</a>.</p><p>Address of the bookmark: <a href="https://github.com/Xinglab/SEASTAR" rel="nofollow">https://github.com/Xinglab/SEASTAR</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44713/understanding-rna-seq-normalization-methods-tpm-vs-fpkm-vs-cpm</guid>
	<pubDate>Wed, 11 Dec 2024 00:59:15 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44713/understanding-rna-seq-normalization-methods-tpm-vs-fpkm-vs-cpm</link>
	<title><![CDATA[Understanding RNA-Seq Normalization Methods: TPM vs. FPKM vs. CPM]]></title>
	<description><![CDATA[<p>RNA sequencing (RNA-Seq) is a powerful technology used to study transcriptomes, providing insights into gene expression levels. However, raw RNA-Seq data requires normalization to account for sequencing depth and gene length, enabling accurate comparisons between genes and samples. Among the most widely used normalization methods are TPM (Transcripts Per Million), FPKM (Fragments Per Kilobase of transcript per Million mapped fragments), and CPM (Counts Per Million). Each method has its own principles and applications, which we&rsquo;ll explore in this blog.</p><h2>Why Normalize RNA-Seq Data?</h2><p>Normalization is a crucial step in RNA-Seq analysis for the following reasons:</p><ul>
<li>
<p><strong>Sequencing depth:</strong> Different RNA-Seq experiments produce varying numbers of reads, making direct comparisons between samples misleading.</p>
</li>
<li>
<p><strong>Gene length:</strong> Longer genes inherently generate more reads, irrespective of their actual expression level.</p>
</li>
<li>
<p><strong>Bias reduction:</strong> Normalization mitigates technical biases, enabling meaningful biological interpretation.</p>
</li>
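</ul><h2>TPM (Transcripts Per Million)</h2><p>TPM measures the proportion of reads mapped to a transcript, normalized by transcript length and sequencing depth. It is calculated as:</p><p>TPM<sub>i</sub> = 10<sup>6</sup> &times; (X<sub>i</sub> / L<sub>i</sub>) / &sum;<sub>j</sub> (X<sub>j</sub> / L<sub>j</sub>)</p><p>where X<sub>i</sub> is the number of reads mapped to transcript i, L<sub>i</sub> is the length of that transcript in kilobases, and the sum runs over all transcripts in the sample.</p><h3>Key Features:</h3><ol>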
<li>
<p><strong>Proportionality:</strong> TPM values sum to 1,000,000 across all transcripts in a sample, making it easier to compare between samples.</p>
</li>
<li>
<p><strong>Intuitive interpretation:</strong> TPM values directly represent the abundance of transcripts in a sample.</p>
</li>
<li>
<p><strong>Preferred for comparisons:</strong> TPM facilitates between-sample comparisons better than FPKM.</p>
</li>
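</ol><h2>FPKM (Fragments Per Kilobase Million)</h2><p>FPKM normalizes read counts by transcript length and sequencing depth but, unlike TPM, its values do not sum to a fixed total within each sample. It is defined as:</p><p>FPKM<sub>i</sub> = 10<sup>6</sup> &times; X<sub>i</sub> / (L<sub>i</sub> &times; N)</p><p>where X<sub>i</sub> is the number of fragments mapped to transcript i, L<sub>i</sub> is the length of that transcript in kilobases, and N is the total number of mapped fragments in the sample.</p><h3>Key Features:</h3><ol>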
<li>
<p><strong>Historical significance:</strong> FPKM was one of the first normalization methods used for RNA-Seq.</p>
</li>
<li>
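<p><strong>Single-end vs. paired-end:</strong> In single-end sequencing each fragment yields one read, so FPKM is equivalent to RPKM (Reads Per Kilobase Million); in paired-end sequencing the two reads of a pair are counted as a single fragment.</p>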
</li>
<li>
<p><strong>Limited utility:</strong> FPKM values are not as robust as TPM for cross-sample comparisons, because the sum of FPKM values differs from sample to sample.</p>
</li>
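</ol><h2>CPM (Counts Per Million)</h2><p>CPM normalizes raw read counts by sequencing depth, without considering gene length. It is expressed as:</p><p>CPM<sub>i</sub> = 10<sup>6</sup> &times; X<sub>i</sub> / N</p><p>where X<sub>i</sub> is the number of reads mapped to gene i and N is the total number of mapped reads in the sample.</p><h3>Key Features:</h3><ol>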
<li>
<p><strong>Simplicity:</strong> CPM is straightforward and computationally less intensive.</p>
</li>
<li>
<p><strong>Application:</strong> Suitable for analyses in which gene length cancels out, such as comparing the same gene across samples in differential expression analysis.</p>
</li>
<li>
<p><strong>Gene length agnostic:</strong> CPM does not correct for gene length, so it is not suited to comparing expression levels between different genes within a sample.</p>
</li>
</ol><h2>When to Use Each Method</h2><ul>
<li>
<p><strong>TPM:</strong> Best for comparing expression levels between samples, especially when transcript length and sequencing depth vary.</p>
</li>
<li>
<p><strong>FPKM:</strong> Useful for historical consistency but generally replaced by TPM.</p>
</li>
<li>
<p><strong>CPM:</strong> Ideal for differential expression analysis when gene length normalization is unnecessary.</p>
</li>
</ul><h2>Conclusion</h2><p>Choosing the right normalization method depends on the specific objectives of your RNA-Seq analysis. TPM&rsquo;s proportionality and robustness make it the preferred choice for most applications, while CPM serves well for differential expression studies. Although FPKM paved the way for RNA-Seq normalization, it has largely been supplanted by TPM in modern workflows. Understanding these methods and their nuances ensures accurate and meaningful interpretations of RNA-Seq data.</p><h3>References:</h3><ol>
<li>
<p>Li, B., &amp; Dewey, C. N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. <em>BMC Bioinformatics.</em></p>
</li>
<li>
<p>Trapnell, C., et al. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. <em>Nature Biotechnology.</em></p>
</li>
<li>
<p>Law, C. W., et al. (2014). voom: precision weights unlock linear model analysis tools for RNA-seq read counts. <em>Genome Biology.</em></p>
</li>
</ol>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/1471/24-mb-genome-size-for-worlds-biggest-virus</guid>
	<pubDate>Thu, 08 Aug 2013 10:05:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/1471/24-mb-genome-size-for-worlds-biggest-virus</link>
	<title><![CDATA[2.4 Mb Genome Size for World's Biggest Virus]]></title>
	<description><![CDATA[<p>The genomes of the newly discovered Pandoraviruses are roughly twice the size of the genome of the previous record holder, Megavirus. Interestingly, only 6 percent of their genes resemble genes of other organisms, so it is assumed that they may have a different origin.</p><p>For details: http://www.sciencemag.org/content/341/6143/281</p><p>http://www.npr.org/blogs/health/2013/07/18/203298244/worlds-biggest-virus-may-have-ancient-roots</p><p>&nbsp;</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>

</channel>
</rss>