<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43022?offset=30</link>
	<atom:link href="https://bioinformaticsonline.com/related/43022?offset=30" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30336/finding-patterns-in-biological-sequences</guid>
	<pubDate>Thu, 22 Dec 2016 10:30:49 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30336/finding-patterns-in-biological-sequences</link>
	<title><![CDATA[Finding Patterns in Biological Sequences]]></title>
	<description><![CDATA[<p>In this report we provide an overview of known techniques for discovery of patterns in biological sequences (DNA and proteins). We also provide biological motivation, and methods of biological verification of such patterns. Finally we list publicly available tools and databases for pattern discovery. On-line supplement is available through http://genetics.uwaterloo.ca/~tvinar/cs798g/motif.</p><p>Address of the bookmark: <a href="http://engr.case.edu/li_jing/papers/00798gpattern.pdf" rel="nofollow">http://engr.case.edu/li_jing/papers/00798gpattern.pdf</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33398/tiny-python36-notebook</guid>
	<pubDate>Sat, 03 Jun 2017 03:16:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33398/tiny-python36-notebook</link>
	<title><![CDATA[Tiny Python3.6 Notebook]]></title>
	<description><![CDATA[<p><span>This is not so much an instructional manual, but rather notes, tables, and examples for Python syntax. It was created by the author as an additional resource during training, meant to be distributed as a physical notebook. Participants (who favor the physical characteristics of dead tree material) could add their own notes, thoughts, and have a valuable reference of curated examples.</span></p><p>Address of the bookmark: <a href="https://github.com/mattharrison/Tiny-Python-3.6-Notebook/blob/master/python.rst" rel="nofollow">https://github.com/mattharrison/Tiny-Python-3.6-Notebook/blob/master/python.rst</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44865/snp-analysis-unlocking-the-secrets-in-our-dna</guid>
	<pubDate>Wed, 16 Jul 2025 01:31:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44865/snp-analysis-unlocking-the-secrets-in-our-dna</link>
	<title><![CDATA[SNP Analysis: Unlocking the Secrets in Our DNA]]></title>
	<description><![CDATA[<p>Single Nucleotide Polymorphisms (SNPs) are the most common type of genetic variation in humans and in many other organisms. A single base change in the DNA sequence (for example, an A instead of a G) can influence everything from our eye color to our risk of developing diseases. Analyzing these tiny changes has become central to modern genetics, medicine, agriculture, and evolutionary biology.</p>
<p><strong>What are SNPs?</strong><br />SNPs (pronounced "snips") are positions in the genome where individuals differ by a single nucleotide. For example:</p>
<pre>Reference: ...A T G C A T G A...
Variant:   ...A T G T A T G A...</pre>
<p>Here, the C in the reference genome has been replaced by a T in the variant.</p>
<p>SNPs occur roughly every 300&ndash;1,000 bases in the human genome, meaning there are millions of them scattered throughout our DNA. Most SNPs have no effect on health, but some are linked to disease susceptibility, drug response, and other traits.</p>
<p><strong>Why Do We Analyze SNPs?</strong></p>
<ol>
<li>Medical Genetics
<ul>
<li>Identify disease-associated variants (e.g., BRCA1/2 in breast cancer).</li>
<li>Predict drug response (pharmacogenomics).</li>
<li>Enable precision medicine by tailoring treatments.</li>
</ul></li>
<li>Population Genetics &amp; Ancestry
<ul>
<li>Trace human migration and ancestry.</li>
<li>Study genetic diversity within and between populations.</li>
</ul></li>
<li>Agriculture &amp; Animal Breeding
<ul>
<li>Select for desirable traits (drought resistance, yield, disease resistance).</li>
<li>Improve breeding efficiency in livestock.</li>
</ul></li>
<li>Evolutionary Biology
<ul>
<li>Track natural selection.</li>
<li>Study adaptation in wild populations.</li>
</ul></li>
</ol>
<p><strong>How is SNP Analysis Performed?</strong><br />SNP analysis can be broadly divided into three steps:</p>
<ol>
<li>SNP Detection
<ul>
<li>Genotyping arrays: Chips that test hundreds of thousands of known SNP positions simultaneously. Fast and affordable, widely used in consumer ancestry testing.</li>
<li>Whole-genome or whole-exome sequencing: Can detect known and novel SNPs across the genome.</li>
<li>Targeted sequencing or PCR: For focused analysis of specific regions.</li>
</ul></li>
<li>Variant Calling
<ul>
<li>Sequencing data is aligned to a reference genome.</li>
<li>Bioinformatics tools (e.g., GATK, bcftools) identify positions where the sequenced sample differs from the reference.</li>
</ul></li>
<li>Annotation and Interpretation
<ul>
<li>Tools (e.g., SnpEff, VEP) predict the functional impact of SNPs.</li>
<li>Are the SNPs in coding regions? Do they cause amino acid changes? Are they known to be pathogenic?</li>
<li>Databases like dbSNP, ClinVar, and GWAS Catalog provide information on known associations.</li>
</ul></li>
</ol>
<p><strong>Common Tools for SNP Analysis</strong></p>
<ul>
<li>Alignment: BWA, Bowtie2</li>
<li>Variant Calling: GATK, FreeBayes</li>
<li>Visualization: IGV, UCSC Genome Browser</li>
<li>Annotation: SnpEff, VEP</li>
<li>Statistical Analysis: PLINK, SNPTEST</li>
</ul>
<p><strong>Challenges in SNP Analysis</strong></p>
<ul>
<li>False positives/negatives: Sequencing errors, alignment issues.</li>
<li>Population stratification: Confounding in association studies.</li>
<li>Interpretation: Many SNPs have unknown or complex effects.</li>
</ul>
<p>Researchers address these with rigorous quality control, large datasets, and increasingly sophisticated statistical models.</p>
<p><strong>The Future of SNP Analysis</strong><br />With advances in sequencing technology and AI-driven analysis, SNP studies are expanding:</p>
<ul>
<li>Polygenic risk scores predict disease risk based on thousands of SNPs.</li>
<li>Large-scale biobanks (e.g., UK Biobank, All of Us) enable powerful genome-wide association studies (GWAS).</li>
<li>CRISPR and functional assays help validate SNP effects in the lab.</li>
</ul>
<p>SNP analysis is at the heart of the genomic revolution, promising insights into biology, health, and evolution at unprecedented scale.</p>
<p><strong>Conclusion</strong><br />From diagnosing rare diseases to designing better crops, SNP analysis is a foundational tool in modern science. As our ability to sequence and interpret genomes improves, so will our understanding of these tiny but mighty variations in DNA.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34699/biological-file-format-tutorial</guid>
	<pubDate>Sun, 17 Dec 2017 18:13:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34699/biological-file-format-tutorial</link>
	<title><![CDATA[Biological file format tutorial]]></title>
	<description><![CDATA[<p>This section explains some of the commonly used file formats in bioinformatics. The information provided here is basic and designed to help users distinguish between the different formats. Please refer to the user manuals or other resources on the web for more details.</p>
<ol>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_fasta">FASTA</a></li>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_fastq">FASTQ</a></li>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_sam">SAM</a></li>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_bam">BAM</a></li>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_vcf">VCF</a></li>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_gff">GFF</a></li>
<li><a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/#fileformats_gtf">GTF</a></li>
</ol><p>Address of the bookmark: <a href="https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/" rel="nofollow">https://bioinformatics.uconn.edu/resources-and-events/tutorials/file-formats-tutorial/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40489/machine-learning-training-and-courses-in-bioinformatics</guid>
	<pubDate>Tue, 31 Dec 2019 19:33:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40489/machine-learning-training-and-courses-in-bioinformatics</link>
	<title><![CDATA[Machine learning training and courses in bioinformatics !]]></title>
	<description><![CDATA[<p>Machine learning techniques have been successful in analyzing biological data because of their ability to handle randomness and noise in the data and to generalize. In this class, we will learn the basics of probabilistic models and machine learning techniques. We will focus on probabilistic models (Markov models, hidden Markov models, and Bayesian networks) for biological sequence analysis and systems biology. Other machine learning techniques, such as naive Bayes, neural networks, and SVMs, will be covered only briefly.</p>
<p>More at&nbsp;http://homes.sice.indiana.edu/yye/lab/teaching/spring2017-I529/</p>
<p>More tutorials at&nbsp;</p>
<p><a href="http://calla.rnet.missouri.edu/cheng_courses/mlbioinfo/mlbioinfo.htm">http://calla.rnet.missouri.edu/cheng_courses/mlbioinfo/mlbioinfo.htm</a></p>
<p><a href="http://www.raetschlab.org/lectures/MLBioinformatics">http://www.raetschlab.org/lectures/MLBioinformatics</a></p>
<p><a href="http://www.raetschlab.org/lectures/bertinoro08">http://www.raetschlab.org/lectures/bertinoro08</a></p>
<p>Book at&nbsp;</p>
<p><a href="https://personal.utdallas.edu/~pradiptaray/teaching/7_deep_learning_bioinfo.pdf">https://personal.utdallas.edu/~pradiptaray/teaching/7_deep_learning_bioinfo.pdf</a></p><p>Address of the bookmark: <a href="http://homes.sice.indiana.edu/yye/lab/teaching/spring2017-I529/" rel="nofollow">http://homes.sice.indiana.edu/yye/lab/teaching/spring2017-I529/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43631/pangolin-tutorial</guid>
	<pubDate>Fri, 10 Dec 2021 05:58:59 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43631/pangolin-tutorial</link>
	<title><![CDATA[Pangolin tutorial !]]></title>
	<description><![CDATA[<p><span>This is a tutorial for using the Pangolin Web Application. For information on using the command line tool, please visit the&nbsp;</span><a href="https://cov-lineages.org/resources/pangolin/usage.html">command line tool usage page</a><span>.</span></p>
<p>Address of the bookmark: <a href="https://cov-lineages.org/resources/pangolin/tutorial.html" rel="nofollow">https://cov-lineages.org/resources/pangolin/tutorial.html</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40583/trelliscope-flexibly-visualize-large-complex-data-in-great-detail-from-within-the-r-statistical-programming-environment</guid>
	<pubDate>Tue, 21 Jan 2020 04:22:49 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40583/trelliscope-flexibly-visualize-large-complex-data-in-great-detail-from-within-the-r-statistical-programming-environment</link>
	<title><![CDATA[Trelliscope: flexibly visualize large, complex data in great detail from within the R statistical programming environment.]]></title>
	<description><![CDATA[<p>Trelliscope provides a way to flexibly visualize large, complex data in great detail from within the R statistical programming environment. Trelliscope is a component in the <a href="http://deltarho.org/docs-trelliscope/deltarho.org">DeltaRho</a> environment.</p>
<p>For those familiar with <a href="http://cm.bell-labs.com/cm/ms/departments/sia/project/trellis/">Trellis Display</a>, <a href="http://docs.ggplot2.org/0.9.3.1/facet_wrap.html">faceting in ggplot</a>, or the notion of <a href="http://en.wikipedia.org/wiki/Small_multiple">small multiples</a>, Trelliscope provides a scalable way to break a set of data into pieces, apply a plot method to each piece, and then arrange those plots in a grid and interactively sort, filter, and query panels of the display based on metrics of interest. With Trelliscope, we are able to create multipanel displays on data with a very large number of subsets and view them in an interactive and meaningful way.</p><p>Address of the bookmark: <a href="http://deltarho.org/docs-trelliscope/#introduction" rel="nofollow">http://deltarho.org/docs-trelliscope/#introduction</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41957/majiq-2-is-released</guid>
	<pubDate>Thu, 09 Jul 2020 03:06:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41957/majiq-2-is-released</link>
	<title><![CDATA[MAJIQ 2 is released !]]></title>
	<description><![CDATA[<ul>
<li>Ability to detect, quantify, and visualize complex and de novo splicing variations from RNA-Seq data.</li>
<li>MAJIQ&rsquo;s accuracy compares favorably to other algorithms.</li>
<li>MAJIQ 2 is <em>way</em> faster and more memory- and I/O-efficient.</li>
<li>New visualization (VOILA 2.0).</li>
<li>Ability to analyze hundreds and thousands of samples.</li>
<li>Why so negative? Support for a confident negative set.</li>
</ul>
<p>Finally, a major reason we are excited about MAJIQ 2.0 is that it sets the code base for many new exciting algorithmic and visualization improvements, with application to new research questions, so stay tuned!</p>
<p><span>More at <a href="https://biociphers.wordpress.com/2019/04/01/majiq-2-is-out/">https://biociphers.wordpress.com/2019/04/01/majiq-2-is-out/</a></span></p><p>Address of the bookmark: <a href="https://majiq.biociphers.org/" rel="nofollow">https://majiq.biociphers.org/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44803/basics-of-deseq2-differential-expression-made-simple</guid>
	<pubDate>Wed, 28 May 2025 06:47:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44803/basics-of-deseq2-differential-expression-made-simple</link>
	<title><![CDATA[Basics of DESeq2: Differential Expression Made Simple]]></title>
	<description><![CDATA[<p>DESeq2 is a powerful and widely used R package that identifies differentially expressed genes (DEGs) from RNA-seq data. Whether you're comparing treated vs untreated samples, disease vs healthy conditions, or wild-type vs mutant strains, DESeq2 helps you statistically determine which genes are significantly up- or down-regulated.</p>
<p><strong>What Does DESeq2 Do?</strong><br />DESeq2 analyzes count data, that is, the number of sequencing reads that map to each gene. It:</p>
<ul>
<li>Normalizes the data to account for sequencing depth and RNA composition.</li>
<li>Estimates variance (dispersion) for each gene.</li>
<li>Fits a model to compare groups (e.g., control vs treated).</li>
<li>Calculates fold-changes and p-values to determine significance.</li>
</ul>
<p><strong>Installing DESeq2</strong><br />You can install DESeq2 via Bioconductor in R:</p>
<pre><code>if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("DESeq2")</code></pre>
<p><strong>Inputs Needed</strong></p>
<ul>
<li>A count matrix: genes as rows, samples as columns (raw counts, not normalized).</li>
<li>A sample metadata table (also called colData): defines the condition/group for each sample.</li>
</ul>
<p>Example:</p>
<pre><code># Count matrix (rows = genes, columns = samples)
counts &lt;- read.csv("counts.csv", row.names = 1)

# Sample metadata: one row per sample, in the same order as the count columns
colData &lt;- data.frame(
  row.names = colnames(counts),
  condition = factor(c("control", "control", "treated", "treated"))
)</code></pre>
<p><strong>DESeq2 Workflow</strong></p>
<ol>
<li>Load the package
<pre><code>library(DESeq2)</code></pre></li>
<li>Create a DESeqDataSet object
<pre><code>dds &lt;- DESeqDataSetFromMatrix(countData = counts,
                              colData = colData,
                              design = ~ condition)</code></pre></li>
<li>Run the differential expression analysis
<pre><code>dds &lt;- DESeq(dds)</code></pre></li>
<li>Get the results
<pre><code>res &lt;- results(dds)
head(res)</code></pre></li>
</ol>
<p>This gives a table with:</p>
<ul>
<li>log2FoldChange: how much expression changed</li>
<li>pvalue: statistical significance</li>
<li>padj: adjusted p-value (FDR corrected)</li>
</ul>
<p><strong>Visualization (Optional but Powerful)</strong></p>
<p>MA Plot</p>
<pre><code>plotMA(res, ylim = c(-2, 2))</code></pre>
<p>Volcano Plot (custom)</p>
<pre><code>library(ggplot2)
# ggplot2 needs a plain data.frame; drop genes without an adjusted p-value
res_df &lt;- as.data.frame(res)
res_df &lt;- res_df[!is.na(res_df$padj), ]
res_df$significant &lt;- res_df$padj &lt; 0.05
ggplot(res_df, aes(x = log2FoldChange, y = -log10(padj), color = significant)) +
  geom_point() +
  theme_minimal()</code></pre>
<p>Heatmap of Top Genes</p>
<pre><code>library(pheatmap)
# Row indices of the 20 genes with the smallest adjusted p-values
topgenes &lt;- head(order(res$padj), 20)
vsd &lt;- vst(dds, blind = FALSE)
pheatmap(assay(vsd)[topgenes, ])</code></pre>
<p><strong>Tips for Best Results</strong></p>
<ul>
<li>Use raw counts (not normalized or TPM/RPKM values).</li>
<li>Have replicates: DESeq2 relies on variance estimates, so at least 3 per group is ideal.</li>
<li>Watch out for batch effects: include them in your design if needed (e.g., ~ batch + condition).</li>
</ul>
<p><strong>Summary</strong></p>
<table>
<tr><th>Step</th><th>Purpose</th></tr>
<tr><td>DESeqDataSetFromMatrix()</td><td>Load your data into DESeq2</td></tr>
<tr><td>DESeq()</td><td>Run the differential expression analysis</td></tr>
<tr><td>results()</td><td>Extract the output (log fold change, p-values, etc.)</td></tr>
<tr><td>plotMA() / ggplot2 / pheatmap</td><td>Visualize the results</td></tr>
</table>
<p><strong>Final Thoughts</strong><br />DESeq2 is an essential tool for RNA-seq data analysis. It abstracts away much of the complexity of statistical modeling, while still giving you control when needed. Whether you're a bioinformatician or a wet-lab biologist, DESeq2 offers both ease of use and analytical power.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/1737/perl-in-a-day</guid>
	<pubDate>Sat, 10 Aug 2013 21:14:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/1737/perl-in-a-day</link>
	<title><![CDATA[Perl in a day !!]]></title>
	<description><![CDATA[<p>This PDF-based tutorial is a good resource for learning the basics of Perl in a day.</p><p><a href="http://ritg.med.harvard.edu/training/perl/RC_Perl_Intro.pdf">http://ritg.med.harvard.edu/training/perl/RC_Perl_Intro.pdf</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>

</channel>
</rss>