<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: R and Bioconductor's pages]]></title>
	<link>https://bioinformaticsonline.com/pages/group/93/all?</link>
	<atom:link href="https://bioinformaticsonline.com/pages/group/93/all?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/33869/import-r-data</guid>
	<pubDate>Wed, 12 Jul 2017 08:30:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/33869/import-r-data</link>
	<title><![CDATA[Import R Data]]></title>
	<description><![CDATA[<p>It is often necessary to import sample textbook data into R before you start working on your homework.</p><div id="node-69"><div><p><strong>Excel File</strong></p><p>Quite frequently, the sample data is in&nbsp;<span>Excel&nbsp;</span>format, and needs to be imported into R prior to use. For this, we can use the function&nbsp;<span>read.xls&nbsp;</span>from the&nbsp;<span>gdata&nbsp;</span>package. It reads from an Excel spreadsheet and returns a&nbsp;<a href="http://www.r-tutor.com/r-introduction/data-frame">data frame</a>. The following shows how to load an Excel spreadsheet named&nbsp;<span>"mydata.xls"</span>. This method requires Perl runtime to be present in the system.</p><blockquote><div id="listing-68"><span><a></a></span>&gt;&nbsp;library(gdata)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;load&nbsp;gdata&nbsp;package&nbsp;<br /><span><a></a></span>&gt;&nbsp;help(read.xls)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;documentation&nbsp;<br /><span><a></a></span>&gt;&nbsp;mydata&nbsp;=&nbsp;read.xls("mydata.xls")&nbsp;&nbsp;#&nbsp;read&nbsp;from&nbsp;first&nbsp;sheet</div></blockquote><p>Alternatively, we can use the function&nbsp;<span>loadWorkbook&nbsp;</span>from the&nbsp;<span>XLConnect&nbsp;</span>package to read the entire workbook, and then load the worksheets with&nbsp;<span>readWorksheet</span>. The&nbsp;<span>XLConnect&nbsp;</span>package requires Java to be pre-installed.</p><blockquote><div id="listing-69"><span><a></a></span>&gt;&nbsp;library(XLConnect)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;load&nbsp;XLConnect&nbsp;package&nbsp;<br /><span><a></a></span>&gt;&nbsp;wk&nbsp;=&nbsp;loadWorkbook("mydata.xls")&nbsp;<br /><span><a></a></span>&gt;&nbsp;df&nbsp;=&nbsp;readWorksheet(wk,&nbsp;sheet="Sheet1")</div></blockquote><p>&nbsp;</p><h4><a></a>Minitab File</h4><p>If the data file is in&nbsp;<span>Minitab Portable Worksheet&nbsp;</span>format, it can be opened with the function&nbsp;<span>read.mtp&nbsp;</span>from the&nbsp;<span>foreign&nbsp;</span>package. It returns a&nbsp;<a href="http://www.r-tutor.com/r-introduction/list">list</a>&nbsp;of components in the Minitab worksheet.</p><blockquote><div id="listing-70"><span><a></a></span>&gt;&nbsp;library(foreign)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;load&nbsp;the&nbsp;foreign&nbsp;package&nbsp;<br /><span><a></a></span>&gt;&nbsp;help(read.mtp)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;documentation&nbsp;<br /><span><a></a></span>&gt;&nbsp;mydata&nbsp;=&nbsp;read.mtp("mydata.mtp")&nbsp;&nbsp;#&nbsp;read&nbsp;from&nbsp;.mtp&nbsp;file</div></blockquote><p>&nbsp;</p><h4><a></a>SPSS File</h4><p>For the data files in&nbsp;<span>SPSS&nbsp;</span>format, it can be opened with the function&nbsp;<span>read.spss&nbsp;</span>also from the&nbsp;<span>foreign&nbsp;</span>package. There is a&nbsp;<span>"to.data.frame"&nbsp;</span>option for choosing whether a data frame is to be returned. By default, it returns a list of components instead.</p><blockquote><div id="listing-71"><span><a></a></span>&gt;&nbsp;library(foreign)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;load&nbsp;the&nbsp;foreign&nbsp;package&nbsp;<br /><span><a></a></span>&gt;&nbsp;help(read.spss)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;documentation&nbsp;<br /><span><a></a></span>&gt;&nbsp;mydata&nbsp;=&nbsp;read.spss("myfile",&nbsp;to.data.frame=TRUE)</div></blockquote><p>&nbsp;</p><h4><a></a>Table File</h4><p>A data table can resides in a text file. The cells inside the table are separated by blank characters. Here is an example of a table with 4 rows and 3 columns.</p><blockquote><div id="listing-72"><span><a></a></span>100&nbsp;&nbsp;&nbsp;a1&nbsp;&nbsp;&nbsp;b1&nbsp;<br /><span><a></a></span>200&nbsp;&nbsp;&nbsp;a2&nbsp;&nbsp;&nbsp;b2&nbsp;<br /><span><a></a></span>300&nbsp;&nbsp;&nbsp;a3&nbsp;&nbsp;&nbsp;b3&nbsp;<br /><span><a></a></span>400&nbsp;&nbsp;&nbsp;a4&nbsp;&nbsp;&nbsp;b4</div></blockquote><p>Now copy and paste the table above in a file named&nbsp;<span>"mydata.txt"&nbsp;</span>with a text editor. Then load the data into the workspace with the function&nbsp;<span>read.table</span>.</p><blockquote><div id="listing-73"><span><a></a></span>&gt;&nbsp;mydata&nbsp;=&nbsp;read.table("mydata.txt")&nbsp;&nbsp;#&nbsp;read&nbsp;text&nbsp;file&nbsp;<br /><span><a></a></span>&gt;&nbsp;mydata&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;print&nbsp;data&nbsp;frame&nbsp;<br /><span><a></a></span>&nbsp;&nbsp;&nbsp;V1&nbsp;V2&nbsp;V3&nbsp;<br /><span><a></a></span>1&nbsp;100&nbsp;a1&nbsp;b1&nbsp;<br /><span><a></a></span>2&nbsp;200&nbsp;a2&nbsp;b2&nbsp;<br /><span><a></a></span>3&nbsp;300&nbsp;a3&nbsp;b3&nbsp;<br /><span><a></a></span>4&nbsp;400&nbsp;a4&nbsp;b4</div></blockquote><p>For further detail of the function&nbsp;<span>read.table</span>, please consult the R documentation.</p><blockquote><div id="listing-74"><span><a></a></span>&gt;&nbsp;help(read.table)</div></blockquote><p>&nbsp;</p><h4><a></a>CSV File</h4><p>The sample data can also be in&nbsp;<span>comma separated values&nbsp;</span>(CSV) format. Each cell inside such data file is separated by a special character, which usually is a comma, although other characters can be used as well.</p><p>The first row of the data file should contain the column names instead of the actual data. Here is a sample of the expected format.</p><blockquote><div id="listing-75"><span><a></a></span>Col1,Col2,Col3&nbsp;<br /><span><a></a></span>100,a1,b1&nbsp;<br /><span><a></a></span>200,a2,b2&nbsp;<br /><span><a></a></span>300,a3,b3</div></blockquote><p>After we copy and paste the data above in a file named&nbsp;<span>"mydata.csv"&nbsp;</span>with a text editor, we can read the data with the function&nbsp;<span>read.csv</span>.</p><blockquote><div id="listing-76"><span><a></a></span>&gt;&nbsp;mydata&nbsp;=&nbsp;read.csv("mydata.csv")&nbsp;&nbsp;#&nbsp;read&nbsp;csv&nbsp;file&nbsp;<br /><span><a></a></span>&gt;&nbsp;mydata&nbsp;<br /><span><a></a></span>&nbsp;&nbsp;Col1&nbsp;Col2&nbsp;Col3&nbsp;<br /><span><a></a></span>1&nbsp;&nbsp;100&nbsp;&nbsp;&nbsp;a1&nbsp;&nbsp;&nbsp;b1&nbsp;<br /><span><a></a></span>2&nbsp;&nbsp;200&nbsp;&nbsp;&nbsp;a2&nbsp;&nbsp;&nbsp;b2&nbsp;<br /><span><a></a></span>3&nbsp;&nbsp;300&nbsp;&nbsp;&nbsp;a3&nbsp;&nbsp;&nbsp;b3</div></blockquote><p>In various European locales, as the comma character serves as the decimal point, the function&nbsp;<span>read.csv2&nbsp;</span>should be used instead. For further detail of the&nbsp;<span>read.csv&nbsp;</span>and&nbsp;<span>read.csv2&nbsp;</span>functions, please consult the R documentation.</p><blockquote><div id="listing-77"><span><a></a></span>&gt;&nbsp;help(read.csv)</div></blockquote><p>&nbsp;</p><h4><a></a>Working Directory</h4><p>Finally, the code samples above assume the data files are located in the R&nbsp;<span>working</span>&nbsp;<span>directory</span>, which can be found with the function&nbsp;<span>getwd</span>.</p><blockquote><div id="listing-78"><span><a></a></span>&gt;&nbsp;getwd()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#&nbsp;get&nbsp;current&nbsp;working&nbsp;directory</div></blockquote><p>You can select a different working directory with the function&nbsp;<span>setwd()</span>, and thus avoid entering the full path of the data files.</p><blockquote><div id="listing-79"><span><a></a></span>&gt;&nbsp;setwd("")&nbsp;&nbsp;&nbsp;#&nbsp;set&nbsp;working&nbsp;directory</div></blockquote><p>Note that the forward slash should be used as the path separator even on Windows platform.</p><blockquote><div id="listing-80"><span><a></a></span>&gt;&nbsp;setwd("C:/MyDoc")</div></blockquote></div></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21365/a-guide-for-complete-r-beginners</guid>
	<pubDate>Fri, 20 Feb 2015 23:36:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21365/a-guide-for-complete-r-beginners</link>
	<title><![CDATA[A guide for complete R beginners !]]></title>
	<description><![CDATA[<p>This tutorial is intended to introduce users quickly to the basics of R, focusing on a few common tasks that &nbsp;biologists need to perform &nbsp;some basic analysis: &nbsp;load a table, plot some graphs, and perform some basic statistics. More extensive tutorials can be found on the project website and via bioconductor (not covered here).</p><p><em><span style="text-decoration: underline;">R-language: </span></em><a href="http://www.r-project.org/"><span style="color: #000080;"><span style="text-decoration: underline;"><em>http://www.</em></span></span><span style="color: #000080;"><span style="text-decoration: underline;"><em><strong>r</strong></em></span></span><span style="color: #000080;"><span style="text-decoration: underline;"><em>-project.org</em></span></span></a></p><p><em>BioConductor</em>:&nbsp;<a href="http://www.bioconductor.org/">http://www.bioconductor.org</a></p><p><strong>Advantages of R</strong></p><ul>
<li>Free!</li>
<li>Powerful, many libraries have been created to perform application specific tasks. e.g. analysis of microarray experiments and Next-Gen sequencing (bioconductor: including Bioseq group).</li>
<li>Presentation quality graphics
<ul>
<li>Save as a png, pdf or svg</li>
</ul>
</li>
<li>History
<ul>
<li>What you do can be saved for the next time you use R.</li>
<li>Ability to turn it into an automated script to perform again and again on different data</li>
</ul>
</li>
</ul><p><strong>Disadvantages</strong></p><ul>
<li>Lack of a comprehensive graphical user interface, but two do exist: However some do exist:&nbsp;R commander: <a href="http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/">http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/</a> and&nbsp;Limma-gui (microarrays) : <a href="http://bioinf.wehi.edu.au/limmaGUI/">http://bioinf.wehi.edu.au/limmaGUI/</a></li>
</ul><p><strong>Preparation</strong></p><ul>
<li>(Optional) Download and save the tutorial data set from
<ul>
<li>http://bioinformatics.knowledgeblog.org/wp-content/uploads/bioinf/kerr/data.tsv</li>
<li>Start R (type R on a Linux or Mac terminal, or find the starting link from PC)</li>
</ul>
</li>
</ul><p><strong>Getting More Help</strong></p><ul>
<li>Project Home page
<ul>
<li><span style="color: #000080;"><span style="text-decoration: underline;"><a href="http://www.r-project.org/">http://www.r-project.org/</a></span></span></li>
<li>Check out the &lsquo;introduction to R&rsquo;, which is a much more in depth guide .</li>
<li>Also R has a built-in help system (see later)</li>
</ul>
</li>
</ul><p><strong>Working directory</strong></p><p>This is the directory used to store your data and results. It is useful if it is also the directory where your input data is stored.</p><ul>
<li>Mac/Linux: this is the directory where you typed in R</li>
<li>PC: Change using the change working directory option</li>
</ul>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</guid>
	<pubDate>Mon, 02 Jun 2014 18:03:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</link>
	<title><![CDATA[Next generation sequencing in R or bioconductor environment]]></title>
	<description><![CDATA[<p>There are many R software and bioconductor packages for NGS data analysis, some of them are as follows</p><h3><a name="TOC-Biostrings" id="TOC-Biostrings"></a>Biostrings</h3><p>The Biostrings package from Bioconductor provides an advanced environment for efficient sequence management and analysis in R. It contains many speed and memory effective string containers, string matching algorithms, and other utilities, for fast manipulation of large sets of biological sequences. The objects and functions provided by Biostrings form the basis for many other sequence analysis packages. <a href="http://bioconductor.org/packages/release/bioc/html/Biostrings.html">Documentation</a></p><div><div style="text-align: left;"><div style="color: #000000;"><h4><a name="TOC-IRanges-Overview" id="TOC-IRanges-Overview"></a>IRanges Overview</h4><p>IRanges provides the low-level infrastructure and containers for handling sets of integer ranges within Bioconductor's BioC-Seq domain. Its classes and methods provide support for many more high-level packages like GenomicRanges, ShortRead, Rsamtools, etc. <a href="http://bioconductor.org/packages/release/bioc/html/IRanges.html">Documentation</a></p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-GenomicRanges-Overview" id="TOC-GenomicRanges-Overview"></a>GenomicRanges Overview</h4><p>The <em>GenomicRanges</em> package serves as the foundation for representing genomic locations within the Bioconductor project. It is built upon the <em>IRanges</em> infrastructure and defines three major data containers - <em>GRanges, GRangesList</em> and <em>GappedAlignments</em> - which are supporting other important BioC-Seq packages including <em>ShortRead, Rsamtools, rtracklayer, GenomicFeatures</em> and <em>BSgenome</em>.&nbsp; Compared to the IRanges container, the GRanges/<em>GRangesList</em> classes are more flexible and extensible to store additional information about sequence ranges, such as chromosome identifiers (sequence space), strand information and annotation data. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p></div></div></div></div><h3><a name="TOC-Motif-Discovery" id="TOC-Motif-Discovery"></a>Motif Discovery</h3><h4><a name="TOC-cosmo" id="TOC-cosmo"></a>cosmo</h4><p>The cosmo package allows to search a set of unaligned DNA sequences for a shared motif that may function as transcription factor binding site. The algorithm extends the popular motif discovery tool MEME (Bailey and Elkan, 1995) in that it allows the search to be supervised by specifying a set of constraints that the motif to be discovered must satisfy. <a href="http://bioconductor.org/packages/release/bioc/html/cosmo.html">Documentation</a></p></div><div>
<p><span></span><span></span></p>
<div style="color: #0000ff;"><h4><a name="TOC-BCRANK" id="TOC-BCRANK"></a>BCRANK</h4><p>BCRANK is a method that takes a ranked list of genomic regions as input and outputs short DNA sequences that are overrepresented in some part of the list. The algorithm was developed for detecting transcription factor (TF) binding sites in a large number of enriched regions from high-throughput ChIP-chip or ChIP-seq experiments, but it can be applied to any ranked list of DNA sequences. Documentation</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/BCRANK.html"></a></p>
<p>rGADEM: <a href="http://bioconductor.org/packages/devel/bioc/html/rGADEM.html">Documentation</a></p><p>MotIV: <a href="http://bioconductor.org/packages/devel/bioc/html/MotIV.html">Documentation</a></p></div><h3><a name="TOC-ShortRead" id="TOC-ShortRead"></a>ShortRead</h3><p>The ShortRead package provides input, quality control, filtering, parsing, and manipulation functionality for short read sequences produced by high throughput sequencing technologies. While support is provided for many sequencing technologies, this package is primairly focused on Solexa/Illumina reads. <a href="http://bioconductor.org/packages/release/bioc/html/ShortRead.html">Documentation</a></p><h3><a name="TOC-Rsamtools" id="TOC-Rsamtools"></a>Rsamtools</h3><p>Rsamtools provides functions for parsing and inspecting samtools BAM formatted binary alignment data. SAM/BAM is quickly becoming a universal standard alignment format, and is now supported by a wide variety of alignment tools. <a href="http://bioconductor.org/help/bioc-views/2.7/bioc/html/Rsamtools.html">Documentation</a></p>
<p><a href="http://samtools.sourceforge.net/">Samtools Website</a><br /> <a href="http://bio-bwa.sourceforge.net/">BWA (Burrows-Wheeler Alignment) Website</a><br /><span style="color: #0000ff;"></span></p>
<div style="color: #000000;">&nbsp;</div></div><div>
<p><span style="color: #000000;">Additional tools for SNP analysis:&nbsp;</span></p>
<p><a href="http://bioconductor.org/help/bioc-views/release/bioc/html/snpMatrix.html">snpMatrix</a></p><h3><a name="TOC-BSgenome" id="TOC-BSgenome"></a>BSgenome</h3><p>BSgenome provides an object oriented infrastructure for interacting with a Biostring based genome sequence. BSgenome packages exist for many common genomes, and can be created to represent custom genomes. See the "How to forge a BSgenome data package" Vignette for instructions to create a new BSgenome package if a prebuilt package does not exist for your organism. <a href="http://bioconductor.org/packages/release/bioc/html/BSgenome.html">Documentation</a></p><h3><a name="TOC-rtracklayer" id="TOC-rtracklayer"></a>rtracklayer</h3><p>rtracklayer provides an interface for exporting annotation feature data to various genome browsers and file formats (such as GFF). See the Small RNA Profiling exercise for an example of using rtracklayer to visualize alignment coverage. <a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">Documentation</a></p><h3><a name="TOC-biomaRt" id="TOC-biomaRt"></a>biomaRt</h3><p>The biomaRt package, provides an interface to a growing collection of databases implementing the BioMart software suite (http:// www.biomart.org). The package enables online retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas. This data is retrieved automatically via the Internet, so it's recommended that you cache the data locally, or check versions if your code will be adversely affected by updates to these data. <a href="http://bioconductor.org/packages/release/bioc/html/biomaRt.html">Documentation</a></p><h3><a name="TOC-ChIP-Seq-Analysis-Packages" id="TOC-ChIP-Seq-Analysis-Packages"></a>ChIP-Seq Analysis Packages</h3><p>Bioconductor provides various packages for analyzing and visualizing ChIP-Seq data. Only a small selection of these packages is introduced here. Additional useful introductions to this topic are: <a href="http://www.bioconductor.org/workshops/2009/SeattleJan09/ChIP-seq/">BioC ChIP-seq Case Study</a> and BioC <a href="http://www.bioconductor.org/help/course-materials/2009/SeattleNov09/ChIP-seq/">ChIP-Seq</a>.</p><h4><a name="TOC-chipseq" id="TOC-chipseq"></a>chipseq</h4><p>The chipseq package combines a variety of HT-Seq packages to a pipeline for ChIP-Seq data analysis. <a href="http://bioconductor.org/packages/release/bioc/html/chipseq.html">Documentation</a></p><h4><a name="TOC-BayesPeak" id="TOC-BayesPeak"></a>BayesPeak</h4><p>BayesPeak is a peak calling package for identifying DNA binding sites of proteins in ChIP-Seq experiments. Its algorithm uses hidden Markov models (HMM) and Bayesian statistical methods. The following sample code introduces the identification of peaks with the BayesPeak package as well as the incorporation of read coverage information obtained by the chipseq package. <a href="http://bioconductor.org/packages/release/bioc/html/BayesPeak.html">Documentation</a> [ <a href="http://www.biomedcentral.com/1471-2105/10/299">Publication</a> ]</p><h4><a name="TOC-PICS" id="TOC-PICS"></a>PICS</h4><p>The PICS package applies probabilistic inference to aligned-read ChIP-Seq data in order to identify regions bound by transcription factors. PICS identifies enriched regions by modeling local concentrations of directional reads, and uses DNA fragment length prior information to discriminate closely adjacent binding events via a Bayesian hierarchical t-mixture model. The following sample code uses the test data set from the above BayesPeak package in order to compare the results from both methods by identifying their consensus peak set. <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">Documentation</a> [ <a href="http://www.hubmed.org/display.cgi?uids=20528864">Publication</a> ]</p><h4><a name="TOC-ChIPpeakAnno" id="TOC-ChIPpeakAnno"></a>ChIPpeakAnno</h4><p>The ChIPpeakAnno package provides. batch annotation of the peaks identified from either ChIP-seq or ChIP-chip experiments. It includes functions to retrieve the sequences around peaks, obtain enriched Gene Ontology (GO) terms, find the nearest gene, exon, miRNA or custom features such as most conserved elements and other transcription factor binding sites supplied by users. The package leverages the biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest and stat packages. <a href="http://bioconductor.org/packages/release/bioc/html/ChIPpeakAnno.html">Documentation</a></p><h4><a name="TOC-Additional-ChIP-Seq-Packages" id="TOC-Additional-ChIP-Seq-Packages"></a>Additional ChIP-Seq Packages</h4><p>DiffBind: <a href="http://www.bioconductor.org/packages/release/bioc/html/DiffBind.html">Documentation</a></p><p>MOSAICS: <a href="http://bioconductor.org/packages/devel/bioc/html/mosaics.html">Documentation</a></p><p>iSeq: <a href="http://bioconductor.org/packages/release/bioc/html/iSeq.html">Documentation</a></p><p>ChIPseqR: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPseqR.html">Documentation</a></p><p>ChiPsim: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPsim.html">Documentation</a></p><p>CSAR: <a href="http://www.bioconductor.org/packages/devel/bioc/html/CSAR.html">Documentation</a></p><p>ChIP-Seq Pipeline: <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">PICS</a>, rGADEM and MotIV (<a href="http://www.rglab.org/pics-and-bioconductor/">developer web site</a>)</p><p>SPP: <a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/">ChIP-seq processing pipeline</a></p><p><a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/tutorial.html">SPP Tutorial</a></p><p><a href="http://liulab.dfci.harvard.edu/MACS/index.html">MACS</a></p><p><a href="http://gmdd.shgmo.org/Computational-Biology/ChIP-Seq/download/SIPeS">SIPeS</a></p><h3><a name="TOC-RNA-Seq-Analysis" id="TOC-RNA-Seq-Analysis"></a>RNA-Seq Analysis</h3><h4><a name="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-" id="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-"></a>Counting Reads that Overlap with Annotation Ranges&nbsp;</h4><p>The GenomicRanges package provides support for importing into R short read alignment data in BAM format (via Rsamtools) and associating them with genomic feature ranges, such as exons or genes. This way one can quantify the number of reads aligning to annotated genomic regions. The package defines general purpose containers for storing genomic intervals as well as more specialized containers for storing alignments against a reference genome. The two main functions for read counting provided by this infrastructure are <span>countOverlaps <span style="color: #000000;"><span>and</span></span> summarizeOverlaps</span>. For their proper usage, it is important to read the corresponding <a href="http://www.bioconductor.org/packages/devel/bioc/vignettes/GenomicRanges/inst/doc/summarizeOverlaps.pdf">PDF manual</a>. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-DESeq" id="TOC-Differential-Gene-Expression-Analysis-with-DESeq"></a>Differential Gene Expression Analysis with DESeq</h4><p>The DESeq package contains functions to call differentially expressed genes (DEGs) in count tables based on a model using the negative binomial distribution. It expects as input a data frame with the raw read counts per region/gene of interest (rows) for each test sample (columns).&nbsp; Such a count table can be imported into R or generated from BAM alignment files using the <span>countOverlaps</span> function as introduced above. <a href="http://www.bioconductor.org/packages/release/bioc/html/DESeq.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-edgeR" id="TOC-Differential-Gene-Expression-Analysis-with-edgeR"></a>Differential Gene Expression Analysis with edgeR</h4><p>The edgeR package uses empirical Bayes estimation and exact tests based on the negative binomial distribution to call differentially expressed genes (DEGs) in count data.&nbsp;</p>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/edgeR.html">Documentation</a></p>
<p><span style="color: #000000;">A variety of additional R packages are available for normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG): <br /> </span></p><p><a href="http://bioconductor.org/packages/devel/bioc/html/easyRNASeq.html">easyRNASeq</a> (simplifies read counting per genome feature)</p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/DEXSeq.html">DEXSeq</a> (Inference of differential exon usage);&nbsp;<a href="http://www.bioconductor.org/packages/release/data/experiment/html/parathyroidSE.html">parathyroidSE</a> explains how to generate exon read counts in R</p><p><a href="http://bioconductor.org/packages/release/bioc/html/DEGseq.html">DEGseq</a></p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/baySeq.html">baySeq</a> (also see: <a href="http://www.bioconductor.org/packages/release/bioc/html/segmentSeq.html">segmentSeq</a>)</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a> (<a href="http://www.hubmed.org/display.cgi?uids=20167110">Bullard et al. 2010</a>)</p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-Detection-of-Alternative-Splice-Junctions" id="TOC-Detection-of-Alternative-Splice-Junctions"></a>Detection of Alternative Splice Junctions</h4>
<p><span style="color: #000000;">Another utility of RNA-Seq experiments is the analysis of splice junctions. The following software suggestions provide this utility:</span></p>
<p><a href="http://woldlab.caltech.edu/rnaseq/">ERANGE<br /> </a><a href="http://tophat.cbcb.umd.edu/">TopHat</a></p><p><a href="http://biogibbs.stanford.edu/%7Ekinfai/SpliceMap/">SpliceMap</a></p><p><a href="http://solidsoftwaretools.com/gf/project/splitseek/">SplitSeek</a></p><h3><a name="TOC-DNA-Methylation-Data-Analysis" id="TOC-DNA-Methylation-Data-Analysis"></a>DNA-Methylation Data Analysis</h3><div><ul>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/help/course-materials/2012/BiocEurope2012/mattia_pelizzola_methylPipe.pdf">methylPipe</a></span></li>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/packages/devel/bioc/html/bsseq.html">bsseq</a></span></li>
<li><a href="http://www.bioconductor.org/packages/devel/bioc/html/BiSeq.html">BiSeq</a></li>
<li>Much more under <a href="http://www.bioconductor.org/packages/devel/BiocViews.html#___DNAMethylation">BiocViews</a></li>
</ul></div></div></div><h3><a name="TOC-HT-Seq-Data-Visualization" id="TOC-HT-Seq-Data-Visualization"></a>HT-Seq Data Visualization</h3>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/ggbio.html">ggbio</a>: ggplot2 extension for genomics data (<a href="http://tengfei.github.com/ggbio/">online manual</a>) <a href="http://www.bioconductor.org/packages/devel/bioc/html/Gviz.html">Gviz</a>:&nbsp;Plotting data and annotation information along genomic coordinates <a href="http://bioconductor.org/packages/release/bioc/html/HilbertVis.html">HilbertVis</a>: Hilbert genome plots</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/GenomeGraphs.html">GenomeGraphs</a>: Plotting genomic information from Ensembl</p><p><a href="http://www.hubmed.org/display.cgi?uids=18507856">TileQC</a>: Flow Cell Quality Visualization</p><p><a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">rtracklayer</a>: R interface to genome browsers</p><p><a href="http://genoplotr.r-forge.r-project.org/">genoPlotR</a>: Plotting maps of genes and genomes</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a>: Tools for storing, accessing, analyzing and visualizing genomic data.</p><p>&nbsp;</p><p>To install all packages</p><blockquote><p>source("http://bioconductor.org/biocLite.R")<br />biocLite()<br />biocLite(c("ShortRead", "Biostrings", "IRanges", "BSgenome", "rtracklayer", "biomaRt", "chipseq", "ChIPpeakAnno", "Rsamtools", "BayesPeak", "PICS", "GenomicRanges", "DESeq", "edgeR", "leeBamViews", "GenomicFeatures", "BSgenome.Celegans.UCSC.ce2"))</p></blockquote></div>]]></description>
	<dc:creator>John Parker</dc:creator>
</item>

</channel>
</rss>