R and Bioconductor community

- Which are the best statistical programming languages to study for a bioinformatician?

Started by Jitendra Narayan 1561 days ago Replies (8)

In bioinformatics research jobs involving genome sequencing and metabolic pathway prediction, I have used MATLAB, SAS, SPSS, R and several Bioconductor packages. MATLAB had a lot of powerful tools and was easy to use, whereas SPSS is aimed at non-programmers and R needs programming skills. I am wondering what other people think is best. Or perhaps there is no single best language, but a few that lend themselves well to bioinformatics work that is math-heavy and deals with large amounts of data?

- Neelam Jha 1516 days ago
Phylogenetics in R

R in Ecology and Evolution: http://r-eco-evo.blogspot.com.au/

R-bloggers, Recology: http://www.r-bloggers.com/author/recology-r/

Talk introducing phylogenetics in R: http://www.r-bloggers.com/my-talk-on-doing-phylogenetics-in-r-2/

Finding meaningful clusters in trees

phytools blog, “Phylogenetic Tools for Comparative Biology”

- Jit 1493 days ago
R is by far the best-known open source statistical programming language for bioinformaticians. However, you cannot ignore the MATLAB Bioinformatics Toolbox.

I am a big fan of Perl and love to do all sorts of analysis with it, so I prefer PDL ("Perl Data Language"), which gives standard Perl the ability to compactly store and speedily manipulate the large N-dimensional data arrays that are the bread and butter of scientific computing.
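To make "compactly store" concrete, here is a rough sketch of the general idea in Python, using only the standard library. It is not PDL and does not reflect how PDL is implemented; the `NDArray` class and its methods are hypothetical names made up for illustration. The point is simply that an N-dimensional array can live in one flat, typed buffer, with row-major strides mapping an index tuple to a position.

```python
from array import array


class NDArray:
    """Tiny illustrative N-dimensional array backed by one flat typed buffer.

    Hypothetical helper for illustration only; not PDL's API.
    """

    def __init__(self, shape, fill=0.0):
        self.shape = tuple(shape)
        size = 1
        for dim in self.shape:
            size *= dim
        # One contiguous block of C doubles instead of nested lists of objects.
        self.data = array("d", [fill]) * size
        # Row-major strides: elements to skip per step along each axis.
        strides = []
        step = 1
        for dim in reversed(self.shape):
            strides.append(step)
            step *= dim
        self.strides = tuple(reversed(strides))

    def _offset(self, index):
        # Accept a bare integer for one-dimensional arrays.
        if isinstance(index, int):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, dim, stride in zip(index, self.shape, self.strides):
            if not 0 <= i < dim:
                raise IndexError("index out of range")
            offset += i * stride
        return offset

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def __setitem__(self, index, value):
        self.data[self._offset(index)] = value

    def scale(self, factor):
        """Multiply every element in place with a single pass over the buffer."""
        for k in range(len(self.data)):
            self.data[k] *= factor


# Usage: a 2 x 3 x 4 array stored as 24 consecutive doubles.
grid = NDArray((2, 3, 4))
grid[1, 2, 3] = 5.0
grid.scale(2.0)
print(grid[1, 2, 3], len(grid.data), grid.strides)
```

Real array libraries add vectorised operations implemented in compiled code on top of this layout, which is where most of the speed comes from; the pure-Python loop in `scale` only shows the access pattern.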

- Rahul Nayak 1493 days ago
Who cares which language is more popular? Programming languages are tools; if one does what I need it to do, it's fine by me.

- Abhimanyu Singh 1329 days ago
Lisp-Stat is an extensible environment for statistical computing and dynamic graphics based on the Lisp language. XLISP-STAT is a version of Lisp-Stat based on a dialect of Lisp called XLISP.

http://homepage.stat.uiowa.edu/~luke/xls/xlsinfo/xlsinfo.html

- John Parker 1161 days ago
I like the R language. The following table compares the statistical capabilities of software packages (source: http://stanfordphd.com/Statistical_Software.html). In the statistical language war, according to this metric, R wins.

Columns of the source table, in order: R, MATLAB, SAS, STATA, SPSS. Entries are listed in the order they appear; rows with fewer than five entries had empty cells in the source table, and which packages those were is not recoverable here.

| Type of statistical analysis | Entries |
|---|---|
| Nonparametric Tests | Yes, Yes, Yes, Yes, Yes |
| T-test | Yes, Yes, Yes, Yes, Yes |
| ANOVA & MANOVA | Yes, Yes, Yes, Yes, Yes |
| ANCOVA & MANCOVA | Yes, Yes, Yes, Yes, Yes |
| Linear Regression | Yes, Yes, Yes, Yes, Yes |
| Generalized Least Squares | Yes, Yes, Yes, Yes, Yes |
| Ridge Regression | Yes, Yes, Yes |
| Lasso | Yes, Yes, Yes |
| Generalized Linear Models | Yes, Yes, Yes, Yes, Yes |
| Mixed Effects Models | Yes, Yes, Yes, Yes, Yes |
| Logistic Regression | Yes, Yes, Yes, Yes, Yes |
| Nonlinear Regression | Yes, Yes, Yes |
| Discriminant Analysis | Yes, Yes, Yes, Yes, Yes |
| Nearest Neighbor | Yes, Yes, Yes, Yes |
| Factor & Principal Components Analysis | Yes, Yes, Yes, Yes, Yes |
| Copula Models | Yes, Yes, Experimental |
| Cross-Validation | Yes, Yes, Yes |
| Bayesian Statistics | Yes, Yes, Limited |
| Monte Carlo, Classic Methods | Yes, Yes, Yes, Yes, Limited |
| Markov Chain Monte Carlo | Yes, Yes, Yes |
| Bootstrap & Jackknife | Yes, Yes, Yes, Yes |
| EM Algorithm | Yes, Yes, Yes |
| Missing Data Imputation | Yes, Yes, Yes, Yes, Yes |
| Outlier Diagnostics | Yes, Yes, Yes, Yes, Yes |
| Robust Estimation | Yes, Yes, Yes, Yes |
| Longitudinal (Panel) Data | Yes, Yes, Yes, Yes, Limited |
| Survival Analysis | Yes, Yes, Yes, Yes, Yes |
| Path Analysis | Yes, Yes, Yes |
| Propensity Score Matching | Yes, Yes, Limited, Limited |
| Stratified Samples (Survey Data) | Yes, Yes, Yes, Yes, Yes |
| Experimental Design | Yes, Yes |
| Quality Control | Yes, Yes, Yes, Yes |
| Reliability Theory | Yes, Yes, Yes, Yes, Yes |
| Univariate Time Series | Yes, Yes, Yes, Yes, Limited |
| Multivariate Time Series | Yes, Yes, Yes, Yes |
| Markov Chains | Yes, Yes |
| Hidden Markov Models | Yes, Yes |
| Stochastic Volatility Models | Yes, Yes, Limited, Limited, Limited |
| Diffusions | Yes, Yes |
| Counting Processes | Yes, Yes, Yes |
| Filtering | Yes, Yes, Limited, Limited |
| Instrumental Variables | Yes, Yes, Yes, Yes |
| Simultaneous Equations | Yes, Yes, Yes, Yes |
| Splines | Yes, Yes, Yes, Yes |
| Nonparametric Smoothing Methods | Yes, Yes, Yes, Yes |
| Extreme Value Theory | Yes, Yes |
| Variance Stabilization | Yes, Yes |
| Cluster Analysis | Yes, Yes, Yes, Yes, Yes |
| Neural Networks | Yes, Yes, Yes, Limited |
| Classification & Regression Trees | Yes, Yes, Yes, Limited |
| Boosting Classification & Regression Trees | Yes, Yes |
| Random Forests | Yes, Yes |
| Support Vector Machines | Yes, Yes, Yes |
| Signal Processing | Yes, Yes |
| Wavelet Analysis | Yes, Yes, Yes |
| ROC Curves | Yes, Yes, Yes, Yes, Yes |
| Optimization | Yes, Yes, Yes, Limited |

- Neelam Jha 1155 days ago
R Passes SPSS in Scholarly Use, Stata Growing Rapidly http://www.r-bloggers.com/r-passes-spss-in-scholarly-use-stata-growing-rapidly/

- Jitendra Narayan 1017 days ago
A recent article in Nature explains it better: R is becoming the most popular language among biological researchers. http://www.nature.com/news/programming-tools-adventures-with-r-1.16609?

- Rahul Nayak 874 days ago
A nice comparison of R, SAS and Python: http://www.datasciencecentral.com/forum/topics/which-one-is-best-r-sas-or-python-for-data-science