<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/3046?offset=0</link>
	<atom:link href="https://bioinformaticsonline.com/related/3046?offset=0" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21365/a-guide-for-complete-r-beginners</guid>
	<pubDate>Fri, 20 Feb 2015 23:36:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21365/a-guide-for-complete-r-beginners</link>
	<title><![CDATA[A guide for complete R beginners !]]></title>
	<description><![CDATA[<p>This tutorial is intended to introduce users quickly to the basics of R, focusing on a few common tasks that &nbsp;biologists need to perform &nbsp;some basic analysis: &nbsp;load a table, plot some graphs, and perform some basic statistics. More extensive tutorials can be found on the project website and via bioconductor (not covered here).</p><p><em><span style="text-decoration: underline;">R-language: </span></em><a href="http://www.r-project.org/"><span style="color: #000080;"><span style="text-decoration: underline;"><em>http://www.</em></span></span><span style="color: #000080;"><span style="text-decoration: underline;"><em><strong>r</strong></em></span></span><span style="color: #000080;"><span style="text-decoration: underline;"><em>-project.org</em></span></span></a></p><p><em>BioConductor</em>:&nbsp;<a href="http://www.bioconductor.org/">http://www.bioconductor.org</a></p><p><strong>Advantages of R</strong></p><ul>
<li>Free!</li>
<li>Powerful, many libraries have been created to perform application specific tasks. e.g. analysis of microarray experiments and Next-Gen sequencing (bioconductor: including Bioseq group).</li>
<li>Presentation quality graphics
<ul>
<li>Save as a png, pdf or svg</li>
</ul>
</li>
<li>History
<ul>
<li>What you do can be saved for the next time you use R.</li>
<li>Ability to turn it into an automated script to perform again and again on different data</li>
</ul>
</li>
</ul><p><strong>Disadvantages</strong></p><ul>
<li>Lack of a comprehensive graphical user interface, but two do exist: However some do exist:&nbsp;R commander: <a href="http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/">http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/</a> and&nbsp;Limma-gui (microarrays) : <a href="http://bioinf.wehi.edu.au/limmaGUI/">http://bioinf.wehi.edu.au/limmaGUI/</a></li>
</ul><p><strong>Preparation</strong></p><ul>
<li>(Optional) Download and save the tutorial data set from
<ul>
<li>http://bioinformatics.knowledgeblog.org/wp-content/uploads/bioinf/kerr/data.tsv</li>
<li>Start R (type R on a Linux or Mac terminal, or find the starting link from PC)</li>
</ul>
</li>
</ul><p><strong>Getting More Help</strong></p><ul>
<li>Project Home page
<ul>
<li><span style="color: #000080;"><span style="text-decoration: underline;"><a href="http://www.r-project.org/">http://www.r-project.org/</a></span></span></li>
<li>Check out the &lsquo;introduction to R&rsquo;, which is a much more in depth guide .</li>
<li>Also R has a built-in help system (see later)</li>
</ul>
</li>
</ul><p><strong>Working directory</strong></p><p>This is the directory used to store your data and results. It is useful if it is also the directory where your input data is stored.</p><ul>
<li>Mac/Linux: this is the directory where you typed in R</li>
<li>PC: Change using the change working directory option</li>
</ul>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27225/painless-package-development-for-r</guid>
	<pubDate>Tue, 03 May 2016 05:31:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27225/painless-package-development-for-r</link>
	<title><![CDATA[Painless package development for R]]></title>
	<description><![CDATA[<p>Devtools makes package development a breeze: it works with R&rsquo;s existing conventions for code structure, adding efficient tools to support the cycle of package development. With devtools, developing a package becomes so easy that it will be your default layout whenever you&rsquo;re writing a significant amount of code.</p>
<p>Before you get started be sure to check out:</p>
<ul>
<li><a href="https://groups.google.com/forum/#%21forum/rdevtools" title="Google devtools Group">devtools Google Group &ndash;&nbsp;https://groups.google.com/forum/#!forum/rdevtools</a></li>
<li><a href="http://adv-r.had.co.nz/" title="Hadley W Online Book">book on &ldquo;Advanced R programming&rdquo; &ndash;&nbsp;http://adv-r.had.co.nz/</a></li>
<li><a href="https://github.com/hadley/devtools" title="devtools GitHub">GitHub repository &ndash;&nbsp;https://github.com/hadley/devtools</a></li>
</ul>
<h3 id="getting_started">&nbsp;</h3><p>Address of the bookmark: <a href="https://www.rstudio.com/products/rpackages/devtools/" rel="nofollow">https://www.rstudio.com/products/rpackages/devtools/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/29601/statistics-using-r-with-biological-examples</guid>
	<pubDate>Thu, 03 Nov 2016 04:55:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/29601/statistics-using-r-with-biological-examples</link>
	<title><![CDATA[Statistics Using R   with Biological Examples]]></title>
	<description><![CDATA[<p>This book is a manifestation of my desire to teach researchers in biology a bit more about statistics than an ordinary introductory course covers and to introduce the utilization of R as a tool for analyzing their data. My goal is to reach those with little or no training in higher level statistics so that they can do more of their own data analysis, communicate more with statisticians, and appreciate the great potential statistics has to offer as a tool to answer biological questions. </p><p>This is necessary in light of the increasing use of higher level statistics in biomedical research. I hope it accomplishes this mission and encourage its free distribution and use as a course text or supplement.</p><p>K Seefeld, May 2007</p>]]></description>
	<dc:creator>Neel</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/29601" length="4581031" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/21312/r-for-microsoft-excel</guid>
	<pubDate>Wed, 18 Feb 2015 00:43:27 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/21312/r-for-microsoft-excel</link>
	<title><![CDATA[R for Microsoft Excel]]></title>
	<description><![CDATA[<div><p>If you currently use a spreadsheet like Microsoft Excel for data analysis, you might be interested in taking a look at this <a href="https://districtdatalabs.silvrback.com/intro-to-r-for-microsoft-excel-users" target="_blank">tutorial on how to transition from Excel to R</a>&nbsp;by Tony Ojeda. The tutorial explains how to use R functions in place of Excel formulas, including tools like =AVERAGE and =VLOOKUP. For the most part, it uses modern R packages to keep the R code clear and concise.</p><p>You'll likely still be using Excel as a data source, though, so you'll also want to check out this <a href="http://www.milanor.net/blog/?p=779" target="_blank">guide to importing data from Excel to R</a> from MilanoR.</p></div><p>Reference http://www.r-bloggers.com/an-r-tutorial-for-microsoft-excel-users/</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21367/a-guide-for-complete-r-beginners-r-syntax</guid>
	<pubDate>Fri, 20 Feb 2015 23:41:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21367/a-guide-for-complete-r-beginners-r-syntax</link>
	<title><![CDATA[A guide for complete R beginners :- R Syntax]]></title>
	<description><![CDATA[<p>R is a functional based language, the inputs to a function, including options, are in brackets. Note that all dat and options are separated by a comma</p><ul>
<li>Function(data, options)</li>
</ul><p>Even quit is a function</p><ul>
<li>q()</li>
</ul><p>So is help</p><blockquote><p><strong>help(read.table)</strong></p></blockquote><p>Provides the help page for the FUNCTION &lsquo;read.table&rsquo;</p><blockquote><p><strong>help.search(&ldquo;t test&rdquo;)</strong></p></blockquote><p>Searches for help pages that might relate to the phrase &lsquo;t test&rsquo;</p><p><strong>NOTE</strong>: quotes are needed for search strings, they are not needed when referring to data objects or function names.</p><p>There is a short cut for help,</p><p>? shows the help page on a function name, same as <em>help(function)</em></p><blockquote><p><strong>?read.table</strong></p></blockquote><p>?? searches for help pages on functions, same as <em>help.search(&lsquo;phrase&rsquo;)</em></p><blockquote><p><strong>??&ldquo;t test&rdquo;</strong></p></blockquote><p>Information is usually returned from a function, by default this is printed to screen</p><blockquote><p><strong>read.table(&lsquo;data.tsv&rsquo;)</strong></p></blockquote><p>This can always be stored, we call what it is stored in an &lsquo;object&rsquo;</p><p><strong>mydata </strong></p><p>here <strong>mydata</strong> is an object of type <span style="text-decoration: underline;">dataframe</span></p><p><strong>Reminder:</strong></p><ul>
<li>Vector: a list of numbers, equivalent to a column in a table</li>
<li>Data Frame = a collection of vectors. Equivalent to a table</li>
</ul><p><strong>Hint</strong>:</p><ul>
<li>Up/Down arrow keys can be use to cycle through previous commands</li>
</ul>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21443/a-guide-for-complete-r-beginners-getting-data-into-r</guid>
	<pubDate>Tue, 24 Feb 2015 20:15:08 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21443/a-guide-for-complete-r-beginners-getting-data-into-r</link>
	<title><![CDATA[A guide for complete R beginners :- Getting data into R]]></title>
	<description><![CDATA[<p>For a beginner this can be is the hardest part, it is also the most important to get right.</p><p>It is possible to create a vector by typing data directly into R using the combine function &lsquo;c&rsquo;</p><blockquote><p><strong>x </strong></p></blockquote><p>same as</p><blockquote><p><strong>x </strong></p></blockquote><p>creates the vector x with the numbers between 1 and 5.</p><p>You can see what is in an object at any time by typing its name;</p><blockquote><p><strong>x</strong></p></blockquote><p>will produce the output<strong> &lsquo;[1] 1 2 3 4 5&prime;</strong></p><p>Note that names need to be quoted</p><blockquote><p><strong>daysofweek </strong><strong>&larr; c(&lsquo;Monday&rsquo;, &lsquo;Tuesday&rsquo;, &lsquo;Wednesday&rsquo;, &lsquo;Thursday&rsquo;, &lsquo;Friday&rsquo;);</strong></p></blockquote><p>Usually however you want to input from a file. We have touched on the &lsquo;read.table&rsquo; function already.</p><blockquote><p><strong>mydata </strong></p></blockquote><p>Now <strong>mydata</strong> is a data frame with multiple vectors</p><p>each vector can be identified by the default syntax</p><p>#if any of these are typed it will print to screen</p><blockquote><p><strong>mydata$V1 mydata$V2 mydata$V3 </strong></p></blockquote><p>By default the function assumes certain things from the file</p><ul>
<li>The file is a plain text file (there are function to read excel files: <em>not covered here</em>)</li>
<li>columns are separated by any number of tabs or spaces</li>
<li>there is the same number of data points in each column</li>
<li>there is no header row (labels for the columns)</li>
<li>there is no column with names for the rows** [I&rsquo;ll explain].</li>
</ul><p><span style="text-decoration: underline;">If any of these are false, we need to tell that to the function</span></p><p>If it has a header column</p><blockquote><p><strong>mydata <em>header=T also works</em></strong></p></blockquote><p>Note that there is a comma between different parts of the functions arguments</p><p>If there is one less column in the header row, then R assumes that the 1<sup>st</sup> column of data after the header are the row names</p><p>Now the vectors (columns) are identified by their name</p><p>#if any of these are typed it will print to screen</p><blockquote><p><strong>mydata$A mydata$B mydata$C </strong></p></blockquote><p># Summary about the whole data frame</p><blockquote><p><strong>summary(mydata)</strong></p></blockquote><p># Summary information of column A</p><blockquote><p><strong>summary(mydata$A) </strong></p></blockquote><p>We can shortcut having to type the data frame each time by attaching it</p><blockquote><p><strong>attach(mydata)</strong></p></blockquote><p># summary of column B as &lsquo;mydata&rsquo; is attached</p><blockquote><p><strong>summary(B)</strong></p></blockquote><p><span style="text-decoration: underline;">Two other important options for </span><em><span style="text-decoration: underline;">read.table</span></em></p><p>If is is separated only by tabs and has a header</p><blockquote><p><strong>mydata </strong></p></blockquote><p>Really useful if you have spaces in the contents of some columns, so R does not mess up reading the columns . However if the columns or of an uneven length it will tell you.</p><p>If you know that the file has uneven columns</p><blockquote><p><strong>mydata </strong></p></blockquote><p>This causes R to fill empty spaces in a columns with &lsquo;NA&rsquo; .</p><p>The last two examples will still work with our file and give the same result as with only headers=T</p><p><span style="text-decoration: underline;">Graphs</span></p><p>to get an idea of what R is capable of type</p><blockquote><p><strong>demo(graphics)</strong></p></blockquote><p>steps through the examples, and the code is printed to the screen</p><p>We will work with simpler examples that have immediate use to biologists.</p><p>Remember to get more information about the options to a function type &lsquo;?function&rsquo;</p><p><span style="text-decoration: underline;">Histogram of A</span><span style="text-decoration: underline;"></span></p><blockquote><p><strong>hist(mydata$A)</strong></p></blockquote><p>If there was more data we could increase the number of vertical columns with the option, breaks=50 (or another relevant number).</p><blockquote><p><strong>boxplot(mydata)</strong></p></blockquote><p>We can get rid of the need to type the data frame each time by using the <strong>attach</strong> function</p><p># if not already done so</p><blockquote><p><strong>attach(mydata) </strong></p><p><strong>boxplot(mydata$A, mydata$B, name=c(&ldquo;Value A&rdquo;, &ldquo;Value B&rdquo;) , ylab=&ldquo;Count of Something&rdquo;)</strong></p></blockquote><p>same as</p><blockquote><p><strong>boxplot(A, B, name=c(&ldquo;Value A&rdquo;, &ldquo;Value B&rdquo;) , ylab=&ldquo;Count of Something&rdquo;)</strong></p></blockquote><p><span style="text-decoration: underline;">Scatter plot</span></p><p># if not already done so</p><blockquote><p><strong>attach(mydata) </strong></p><p><strong>plot(A,B) # or plot(mydata$A, mydata$B)</strong></p></blockquote><p><strong><span style="text-decoration: underline;">SAVING an image</span></strong></p><p>Windows users (Rgui) RIGHT click on image and select which you want.</p><p><span style="text-decoration: underline;">These instructions work for everyone.</span></p><p>You need to create a new device of the type of file you need, then send the data to that device</p><p>to save as a png file (easy to load into the likes of powerpoint, also great for web applications.</p><blockquote><p><strong>png(&lsquo;filename&rsquo;) </strong></p><p><strong>boxplot(A, B, name=c(&ldquo;Value A&rdquo;, &ldquo;Value B&rdquo;) , ylab=&ldquo;Count of Something&rdquo;)</strong></p></blockquote><p>or to save as a pdf</p><blockquote><p><strong>pdf(&lsquo;filename&rsquo;) </strong></p><p><strong>boxplot(A, B, name=c(&ldquo;Value A&rdquo;, &ldquo;Value B&rdquo;) , ylab=&ldquo;Count of Something&rdquo;)</strong></p></blockquote><p><span style="text-decoration: underline;">Note</span></p><ul>
<li>Nothing will appear on screen, the output is going to the file</li>
<li>Also it may not be saved immediately but will once the device (or R) is turned quit.</li>
</ul><p>To quit R type</p><p><strong>q() # </strong>If you save your session, next time you start R, you will have your data preloaded.</p><p>Or if you want to remain in R</p><blockquote><pre><strong>dev.off() #</strong>turns of the png (or pdf etc) device, thus forces the data to save</pre></blockquote>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/23160/opencpu</guid>
	<pubDate>Sun, 05 Jul 2015 18:34:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/23160/opencpu</link>
	<title><![CDATA[OpenCPU]]></title>
	<description><![CDATA[<p>OpenCPU is a system for embedded scientific computing and reproducible research. The OpenCPU server provides a reliable and interoperable <a href="https://www.opencpu.org/api.html">HTTP API</a> for data analysis based on R.</p><p>The OpenCPU <a href="https://www.opencpu.org/jslib.html">JavaScript client library</a> provides the most seamless integration of R and JavaScript available today.</p><p>OpenCPU uses standard R packaging to develop, ship and deploy web applications. Several open source <a href="https://www.opencpu.org/apps.html">example apps</a> are available from Github.</p><p>Installing your own OpenCPU server is <a href="https://www.opencpu.org/download.html">super easy</a> and only takes a few minutes.</p><p>More at https://www.opencpu.org/</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27070/venn-diagrams-on-r-studio</guid>
	<pubDate>Mon, 25 Apr 2016 16:22:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27070/venn-diagrams-on-r-studio</link>
	<title><![CDATA[Venn Diagrams on R Studio]]></title>
	<description><![CDATA[<h3>First step: Install &amp; load &ldquo;VennDiagram&rdquo; package.</h3>
<pre><code><span># install.packages('VennDiagram')</span>
<span>library</span><span>(</span><span>VennDiagram</span><span>)</span>
</code></pre>
<h3>Second step: Load data</h3>
<p>Add filepath if &ldquo;catdoge.csv&rdquo; is not in working-directory.</p>
<pre><code><span>d</span> <span>&lt;-</span> <span>read.csv</span><span>(</span><span>"catdoge.csv"</span><span>)</span></code><br><br></pre><p>Address of the bookmark: <a href="http://rstudio-pubs-static.s3.amazonaws.com/13301_6641d73cfac741a59c0a851feb99e98b.html" rel="nofollow">http://rstudio-pubs-static.s3.amazonaws.com/13301_6641d73cfac741a59c0a851feb99e98b.html</a></p>]]></description>
	<dc:creator>Jitendra Prajapati</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</guid>
	<pubDate>Mon, 02 Jun 2014 18:03:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</link>
	<title><![CDATA[Next generation sequencing in R or bioconductor environment]]></title>
	<description><![CDATA[<p>There are many R software and bioconductor packages for NGS data analysis, some of them are as follows</p><h3><a name="TOC-Biostrings" id="TOC-Biostrings"></a>Biostrings</h3><p>The Biostrings package from Bioconductor provides an advanced environment for efficient sequence management and analysis in R. It contains many speed and memory effective string containers, string matching algorithms, and other utilities, for fast manipulation of large sets of biological sequences. The objects and functions provided by Biostrings form the basis for many other sequence analysis packages. <a href="http://bioconductor.org/packages/release/bioc/html/Biostrings.html">Documentation</a></p><div><div style="text-align: left;"><div style="color: #000000;"><h4><a name="TOC-IRanges-Overview" id="TOC-IRanges-Overview"></a>IRanges Overview</h4><p>IRanges provides the low-level infrastructure and containers for handling sets of integer ranges within Bioconductor's BioC-Seq domain. Its classes and methods provide support for many more high-level packages like GenomicRanges, ShortRead, Rsamtools, etc. <a href="http://bioconductor.org/packages/release/bioc/html/IRanges.html">Documentation</a></p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-GenomicRanges-Overview" id="TOC-GenomicRanges-Overview"></a>GenomicRanges Overview</h4><p>The <em>GenomicRanges</em> package serves as the foundation for representing genomic locations within the Bioconductor project. It is built upon the <em>IRanges</em> infrastructure and defines three major data containers - <em>GRanges, GRangesList</em> and <em>GappedAlignments</em> - which are supporting other important BioC-Seq packages including <em>ShortRead, Rsamtools, rtracklayer, GenomicFeatures</em> and <em>BSgenome</em>.&nbsp; Compared to the IRanges container, the GRanges/<em>GRangesList</em> classes are more flexible and extensible to store additional information about sequence ranges, such as chromosome identifiers (sequence space), strand information and annotation data. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p></div></div></div></div><h3><a name="TOC-Motif-Discovery" id="TOC-Motif-Discovery"></a>Motif Discovery</h3><h4><a name="TOC-cosmo" id="TOC-cosmo"></a>cosmo</h4><p>The cosmo package allows to search a set of unaligned DNA sequences for a shared motif that may function as transcription factor binding site. The algorithm extends the popular motif discovery tool MEME (Bailey and Elkan, 1995) in that it allows the search to be supervised by specifying a set of constraints that the motif to be discovered must satisfy. <a href="http://bioconductor.org/packages/release/bioc/html/cosmo.html">Documentation</a></p></div><div>
<p><span></span><span></span></p>
<div style="color: #0000ff;"><h4><a name="TOC-BCRANK" id="TOC-BCRANK"></a>BCRANK</h4><p>BCRANK is a method that takes a ranked list of genomic regions as input and outputs short DNA sequences that are overrepresented in some part of the list. The algorithm was developed for detecting transcription factor (TF) binding sites in a large number of enriched regions from high-throughput ChIP-chip or ChIP-seq experiments, but it can be applied to any ranked list of DNA sequences. Documentation</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/BCRANK.html"></a></p>
<p>rGADEM: <a href="http://bioconductor.org/packages/devel/bioc/html/rGADEM.html">Documentation</a></p><p>MotIV: <a href="http://bioconductor.org/packages/devel/bioc/html/MotIV.html">Documentation</a></p></div><h3><a name="TOC-ShortRead" id="TOC-ShortRead"></a>ShortRead</h3><p>The ShortRead package provides input, quality control, filtering, parsing, and manipulation functionality for short read sequences produced by high throughput sequencing technologies. While support is provided for many sequencing technologies, this package is primairly focused on Solexa/Illumina reads. <a href="http://bioconductor.org/packages/release/bioc/html/ShortRead.html">Documentation</a></p><h3><a name="TOC-Rsamtools" id="TOC-Rsamtools"></a>Rsamtools</h3><p>Rsamtools provides functions for parsing and inspecting samtools BAM formatted binary alignment data. SAM/BAM is quickly becoming a universal standard alignment format, and is now supported by a wide variety of alignment tools. <a href="http://bioconductor.org/help/bioc-views/2.7/bioc/html/Rsamtools.html">Documentation</a></p>
<p><a href="http://samtools.sourceforge.net/">Samtools Website</a><br /> <a href="http://bio-bwa.sourceforge.net/">BWA (Burrows-Wheeler Alignment) Website</a><br /><span style="color: #0000ff;"></span></p>
<div style="color: #000000;">&nbsp;</div></div><div>
<p><span style="color: #000000;">Additional tools for SNP analysis:&nbsp;</span></p>
<p><a href="http://bioconductor.org/help/bioc-views/release/bioc/html/snpMatrix.html">snpMatrix</a></p><h3><a name="TOC-BSgenome" id="TOC-BSgenome"></a>BSgenome</h3><p>BSgenome provides an object oriented infrastructure for interacting with a Biostring based genome sequence. BSgenome packages exist for many common genomes, and can be created to represent custom genomes. See the "How to forge a BSgenome data package" Vignette for instructions to create a new BSgenome package if a prebuilt package does not exist for your organism. <a href="http://bioconductor.org/packages/release/bioc/html/BSgenome.html">Documentation</a></p><h3><a name="TOC-rtracklayer" id="TOC-rtracklayer"></a>rtracklayer</h3><p>rtracklayer provides an interface for exporting annotation feature data to various genome browsers and file formats (such as GFF). See the Small RNA Profiling exercise for an example of using rtracklayer to visualize alignment coverage. <a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">Documentation</a></p><h3><a name="TOC-biomaRt" id="TOC-biomaRt"></a>biomaRt</h3><p>The biomaRt package, provides an interface to a growing collection of databases implementing the BioMart software suite (http:// www.biomart.org). The package enables online retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas. This data is retrieved automatically via the Internet, so it's recommended that you cache the data locally, or check versions if your code will be adversely affected by updates to these data. <a href="http://bioconductor.org/packages/release/bioc/html/biomaRt.html">Documentation</a></p><h3><a name="TOC-ChIP-Seq-Analysis-Packages" id="TOC-ChIP-Seq-Analysis-Packages"></a>ChIP-Seq Analysis Packages</h3><p>Bioconductor provides various packages for analyzing and visualizing ChIP-Seq data. Only a small selection of these packages is introduced here. Additional useful introductions to this topic are: <a href="http://www.bioconductor.org/workshops/2009/SeattleJan09/ChIP-seq/">BioC ChIP-seq Case Study</a> and BioC <a href="http://www.bioconductor.org/help/course-materials/2009/SeattleNov09/ChIP-seq/">ChIP-Seq</a>.</p><h4><a name="TOC-chipseq" id="TOC-chipseq"></a>chipseq</h4><p>The chipseq package combines a variety of HT-Seq packages to a pipeline for ChIP-Seq data analysis. <a href="http://bioconductor.org/packages/release/bioc/html/chipseq.html">Documentation</a></p><h4><a name="TOC-BayesPeak" id="TOC-BayesPeak"></a>BayesPeak</h4><p>BayesPeak is a peak calling package for identifying DNA binding sites of proteins in ChIP-Seq experiments. Its algorithm uses hidden Markov models (HMM) and Bayesian statistical methods. The following sample code introduces the identification of peaks with the BayesPeak package as well as the incorporation of read coverage information obtained by the chipseq package. <a href="http://bioconductor.org/packages/release/bioc/html/BayesPeak.html">Documentation</a> [ <a href="http://www.biomedcentral.com/1471-2105/10/299">Publication</a> ]</p><h4><a name="TOC-PICS" id="TOC-PICS"></a>PICS</h4><p>The PICS package applies probabilistic inference to aligned-read ChIP-Seq data in order to identify regions bound by transcription factors. PICS identifies enriched regions by modeling local concentrations of directional reads, and uses DNA fragment length prior information to discriminate closely adjacent binding events via a Bayesian hierarchical t-mixture model. The following sample code uses the test data set from the above BayesPeak package in order to compare the results from both methods by identifying their consensus peak set. <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">Documentation</a> [ <a href="http://www.hubmed.org/display.cgi?uids=20528864">Publication</a> ]</p><h4><a name="TOC-ChIPpeakAnno" id="TOC-ChIPpeakAnno"></a>ChIPpeakAnno</h4><p>The ChIPpeakAnno package provides. batch annotation of the peaks identified from either ChIP-seq or ChIP-chip experiments. It includes functions to retrieve the sequences around peaks, obtain enriched Gene Ontology (GO) terms, find the nearest gene, exon, miRNA or custom features such as most conserved elements and other transcription factor binding sites supplied by users. The package leverages the biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest and stat packages. <a href="http://bioconductor.org/packages/release/bioc/html/ChIPpeakAnno.html">Documentation</a></p><h4><a name="TOC-Additional-ChIP-Seq-Packages" id="TOC-Additional-ChIP-Seq-Packages"></a>Additional ChIP-Seq Packages</h4><p>DiffBind: <a href="http://www.bioconductor.org/packages/release/bioc/html/DiffBind.html">Documentation</a></p><p>MOSAICS: <a href="http://bioconductor.org/packages/devel/bioc/html/mosaics.html">Documentation</a></p><p>iSeq: <a href="http://bioconductor.org/packages/release/bioc/html/iSeq.html">Documentation</a></p><p>ChIPseqR: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPseqR.html">Documentation</a></p><p>ChiPsim: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPsim.html">Documentation</a></p><p>CSAR: <a href="http://www.bioconductor.org/packages/devel/bioc/html/CSAR.html">Documentation</a></p><p>ChIP-Seq Pipeline: <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">PICS</a>, rGADEM and MotIV (<a href="http://www.rglab.org/pics-and-bioconductor/">developer web site</a>)</p><p>SPP: <a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/">ChIP-seq processing pipeline</a></p><p><a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/tutorial.html">SPP Tutorial</a></p><p><a href="http://liulab.dfci.harvard.edu/MACS/index.html">MACS</a></p><p><a href="http://gmdd.shgmo.org/Computational-Biology/ChIP-Seq/download/SIPeS">SIPeS</a></p><h3><a name="TOC-RNA-Seq-Analysis" id="TOC-RNA-Seq-Analysis"></a>RNA-Seq Analysis</h3><h4><a name="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-" id="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-"></a>Counting Reads that Overlap with Annotation Ranges&nbsp;</h4><p>The GenomicRanges package provides support for importing into R short read alignment data in BAM format (via Rsamtools) and associating them with genomic feature ranges, such as exons or genes. This way one can quantify the number of reads aligning to annotated genomic regions. The package defines general purpose containers for storing genomic intervals as well as more specialized containers for storing alignments against a reference genome. The two main functions for read counting provided by this infrastructure are <span>countOverlaps <span style="color: #000000;"><span>and</span></span> summarizeOverlaps</span>. For their proper usage, it is important to read the corresponding <a href="http://www.bioconductor.org/packages/devel/bioc/vignettes/GenomicRanges/inst/doc/summarizeOverlaps.pdf">PDF manual</a>. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-DESeq" id="TOC-Differential-Gene-Expression-Analysis-with-DESeq"></a>Differential Gene Expression Analysis with DESeq</h4><p>The DESeq package contains functions to call differentially expressed genes (DEGs) in count tables based on a model using the negative binomial distribution. It expects as input a data frame with the raw read counts per region/gene of interest (rows) for each test sample (columns).&nbsp; Such a count table can be imported into R or generated from BAM alignment files using the <span>countOverlaps</span> function as introduced above. <a href="http://www.bioconductor.org/packages/release/bioc/html/DESeq.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-edgeR" id="TOC-Differential-Gene-Expression-Analysis-with-edgeR"></a>Differential Gene Expression Analysis with edgeR</h4><p>The edgeR package uses empirical Bayes estimation and exact tests based on the negative binomial distribution to call differentially expressed genes (DEGs) in count data.&nbsp;</p>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/edgeR.html">Documentation</a></p>
<p><span style="color: #000000;">A variety of additional R packages are available for normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG): <br /> </span></p><p><a href="http://bioconductor.org/packages/devel/bioc/html/easyRNASeq.html">easyRNASeq</a> (simplifies read counting per genome feature)</p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/DEXSeq.html">DEXSeq</a> (Inference of differential exon usage);&nbsp;<a href="http://www.bioconductor.org/packages/release/data/experiment/html/parathyroidSE.html">parathyroidSE</a> explains how to generate exon read counts in R</p><p><a href="http://bioconductor.org/packages/release/bioc/html/DEGseq.html">DEGseq</a></p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/baySeq.html">baySeq</a> (also see: <a href="http://www.bioconductor.org/packages/release/bioc/html/segmentSeq.html">segmentSeq</a>)</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a> (<a href="http://www.hubmed.org/display.cgi?uids=20167110">Bullard et al. 2010</a>)</p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-Detection-of-Alternative-Splice-Junctions" id="TOC-Detection-of-Alternative-Splice-Junctions"></a>Detection of Alternative Splice Junctions</h4>
<p><span style="color: #000000;">Another utility of RNA-Seq experiments is the analysis of splice junctions. The following software suggestions provide this utility:</span></p>
<p><a href="http://woldlab.caltech.edu/rnaseq/">ERANGE<br /> </a><a href="http://tophat.cbcb.umd.edu/">TopHat</a></p><p><a href="http://biogibbs.stanford.edu/%7Ekinfai/SpliceMap/">SpliceMap</a></p><p><a href="http://solidsoftwaretools.com/gf/project/splitseek/">SplitSeek</a></p><h3><a name="TOC-DNA-Methylation-Data-Analysis" id="TOC-DNA-Methylation-Data-Analysis"></a>DNA-Methylation Data Analysis</h3><div><ul>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/help/course-materials/2012/BiocEurope2012/mattia_pelizzola_methylPipe.pdf">methylPipe</a></span></li>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/packages/devel/bioc/html/bsseq.html">bsseq</a></span></li>
<li><a href="http://www.bioconductor.org/packages/devel/bioc/html/BiSeq.html">BiSeq</a></li>
<li>Much more under <a href="http://www.bioconductor.org/packages/devel/BiocViews.html#___DNAMethylation">BiocViews</a></li>
</ul></div></div></div><h3><a name="TOC-HT-Seq-Data-Visualization" id="TOC-HT-Seq-Data-Visualization"></a>HT-Seq Data Visualization</h3>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/ggbio.html">ggbio</a>: ggplot2 extension for genomics data (<a href="http://tengfei.github.com/ggbio/">online manual</a>) <a href="http://www.bioconductor.org/packages/devel/bioc/html/Gviz.html">Gviz</a>:&nbsp;Plotting data and annotation information along genomic coordinates <a href="http://bioconductor.org/packages/release/bioc/html/HilbertVis.html">HilbertVis</a>: Hilbert genome plots</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/GenomeGraphs.html">GenomeGraphs</a>: Plotting genomic information from Ensembl</p><p><a href="http://www.hubmed.org/display.cgi?uids=18507856">TileQC</a>: Flow Cell Quality Visualization</p><p><a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">rtracklayer</a>: R interface to genome browsers</p><p><a href="http://genoplotr.r-forge.r-project.org/">genoPlotR</a>: Plotting maps of genes and genomes</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a>: Tools for storing, accessing, analyzing and visualizing genomic data.</p><p>&nbsp;</p><p>To install all packages</p><blockquote><p>source("http://bioconductor.org/biocLite.R")<br />biocLite()<br />biocLite(c("ShortRead", "Biostrings", "IRanges", "BSgenome", "rtracklayer", "biomaRt", "chipseq", "ChIPpeakAnno", "Rsamtools", "BayesPeak", "PICS", "GenomicRanges", "DESeq", "edgeR", "leeBamViews", "GenomicFeatures", "BSgenome.Celegans.UCSC.ce2"))</p></blockquote></div>]]></description>
	<dc:creator>John Parker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21444/a-guide-for-complete-r-beginners-installing-r-packages</guid>
	<pubDate>Tue, 24 Feb 2015 20:23:34 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21444/a-guide-for-complete-r-beginners-installing-r-packages</link>
	<title><![CDATA[A guide for complete R beginners :- Installing R packages]]></title>
	<description><![CDATA[<p>Part of the reason R has become so popular is the vast array of packages available at the <a href="http://cran.r-project.org/" target="_blank">cran</a> and <a href="http://www.bioconductor.org/" target="_blank">bioconductor</a> repositories. In the last few years, the number of packages has grown <a href="http://blog.revolutionanalytics.com/2010/09/what-can-other-languages-learn-from-r.html" target="_blank">exponentially</a>!</p><p>This is a short post giving steps on how to actually install R packages. Let&rsquo;s suppose you want to install the <a href="http://had.co.nz/ggplot2/" target="_blank">ggplot2</a> package. Well nothing could be easier. We just fire up an R shell and type:<br /><code><br />&gt; install.packages("ggplot2")</code></p><p>In theory the package should just install, however:</p><ul>
<li>if you are using Linux and don&rsquo;t have root access, this command won&rsquo;t work.</li>
<li>you will be asked to select your local mirror, i.e. which server should you use to download the package.</li>
</ul><h4>Installing packages without root access</h4><p>First, you need to designate a directory where you will store the downloaded packages. On my machine, I use the directory <code>/data/Rpackages/</code> After creating a package directory, to install a package we use the command:<br /><code><br />&gt; install.packages("ggplot2"</code><code>, lib="/data/Rpackages/")<br />&gt; library(ggplot2, lib.loc="/data/Rpackages/")<br /></code></p><p>It&rsquo;s a bit of a pain having to type <code>/data/Rpackages/</code> all the time. To avoid this burden,&nbsp; we create a file <code>.Renviron</code> in our home area, and add the line <code>R_LIBS=/data/Rpackages/</code> to it. This means that whenever you start R, the directory <code>/data/Rpackages/</code> is added to the list of places to look for R packages and so:</p><p><code>&gt; install.packages("ggplot2"</code><code>)<br />&gt; library(ggplot2)</code></p><p>just works!</p><h4>Setting the repository</h4><p>Every time you install a R package, you are asked which repository R should use. To set the repository and avoid having to specify this at every package install, simply:</p><ul>
<li>create a file <code>.Rprofile</code> in your home area.</li>
<li>Add the following piece of code to it:</li>
</ul><p><code><br />cat(".Rprofile: Setting UK repositoryn")<br />r = getOption("repos") # hard code the UK repo for CRAN<br />r["CRAN"] = "http://cran.uk.r-project.org"<br />options(repos = r)<br />rm(r)<br /></code></p><p>I found this tip in a stackoverflow <a href="http://stackoverflow.com/questions/1189759/expert-r-users-whats-in-your-rprofile/1189826#1189826" target="_blank">answer </a>.</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>

</channel>
</rss>