R and Bioconductor community

- Which are the best statistical programming languages to study for a bioinformatician?

Started by Jitendra Narayan 1499 days ago Replies (8)

In bioinformatics research jobs involving genome sequencing and metabolic pathway prediction, I have used Matlab, SAS, SPSS, R and several Bioconductor packages. Matlab had a lot of powerful tools and was easy to use, whereas SPSS is aimed at non-programmers and R needs programming skills. I am wondering what other people think is best. Or perhaps there is no single best language, but a few that lend themselves to bioinformatics work that is math-heavy and deals with large amounts of data?

- Neelam Jha 1455 days ago
Phylogenetics in R

R in Ecology and Evolution: http://r-eco-evo.blogspot.com.au/

R bloggers, Recology: http://www.r-bloggers.com/author/recology-r/

Talk introducing phylogenetics in R: http://www.r-bloggers.com/my-talk-on-doing-phylogenetics-in-r-2/

Finding meaningful clusters in trees phytools blog, “Phylogenetic Tools for Comparative Biology”

- Jit 1432 days ago
R is by far the best-known open source statistical programming language for bioinformaticians. However, you cannot ignore the MATLAB Bioinformatics Toolbox.

I am a big fan of Perl and love to do all sorts of analysis with it, so I prefer PDL ("Perl Data Language"), which gives standard Perl the ability to compactly store and speedily manipulate the large N-dimensional data arrays that are the bread and butter of scientific computing.
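To give a rough idea of what "compactly store" means here, a minimal sketch follows. It is written in Python with the standard-library `array` module rather than in PDL, so it only illustrates the general idea of one contiguous typed buffer with row-major indexing; it is not PDL's API. The shape and the names `ROWS`, `COLS`, `offset`, `set_value`, `get_value` and `column_sum` are hypothetical, made up for the example.

```python
from array import array

ROWS, COLS = 200, 300  # hypothetical matrix shape, chosen for illustration

# One contiguous buffer of C doubles instead of ROWS * COLS separate float objects.
data = array("d", [0.0]) * (ROWS * COLS)


def offset(row, col):
    """Row-major position of element (row, col) in the flat buffer."""
    return row * COLS + col


def set_value(row, col, value):
    data[offset(row, col)] = value


def get_value(row, col):
    return data[offset(row, col)]


def column_sum(col):
    """Sum one column by striding through the flat buffer."""
    return sum(data[col::COLS])


set_value(2, 5, 1.5)
set_value(7, 5, 2.5)

# Bytes used by the packed values themselves (excluding the small object header).
packed_bytes = data.itemsize * len(data)
```

Array libraries such as PDL apply the same layout idea to N dimensions and add vectorised operations on top, which is where the speed comes from.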

- Rahul Nayak 1432 days ago
Who cares which language is more popular? Programming languages are tools; if one does what I need it to do, it's fine by me.

- Abhimanyu Singh 1268 days ago
Lisp-Stat is an extensible environment for statistical computing and dynamic graphics based on the Lisp language. XLISP-STAT is a version of Lisp-Stat based on a dialect of Lisp called XLISP.

http://homepage.stat.uiowa.edu/~luke/xls/xlsinfo/xlsinfo.html

- John Parker 1099 days ago
I like the R language. The following table compares the statistical capabilities of software packages: http://stanfordphd.com/Statistical_Software.html In the statistical language war, according to this metric, R wins.

| Type of statistical analysis | Entries, in column order R, MATLAB, SAS, STATA, SPSS (blank cells omitted) |
|---|---|
| Nonparametric Tests | Yes, Yes, Yes, Yes, Yes |
| T-test | Yes, Yes, Yes, Yes, Yes |
| ANOVA & MANOVA | Yes, Yes, Yes, Yes, Yes |
| ANCOVA & MANCOVA | Yes, Yes, Yes, Yes, Yes |
| Linear Regression | Yes, Yes, Yes, Yes, Yes |
| Generalized Least Squares | Yes, Yes, Yes, Yes, Yes |
| Ridge Regression | Yes, Yes, Yes |
| Lasso | Yes, Yes, Yes |
| Generalized Linear Models | Yes, Yes, Yes, Yes, Yes |
| Mixed Effects Models | Yes, Yes, Yes, Yes, Yes |
| Logistic Regression | Yes, Yes, Yes, Yes, Yes |
| Nonlinear Regression | Yes, Yes, Yes |
| Discriminant Analysis | Yes, Yes, Yes, Yes, Yes |
| Nearest Neighbor | Yes, Yes, Yes, Yes |
| Factor & Principal Components Analysis | Yes, Yes, Yes, Yes, Yes |
| Copula Models | Yes, Yes, Experimental |
| Cross-Validation | Yes, Yes, Yes |
| Bayesian Statistics | Yes, Yes, Limited |
| Monte Carlo, Classic Methods | Yes, Yes, Yes, Yes, Limited |
| Markov Chain Monte Carlo | Yes, Yes, Yes |
| Bootstrap & Jackknife | Yes, Yes, Yes, Yes |
| EM Algorithm | Yes, Yes, Yes |
| Missing Data Imputation | Yes, Yes, Yes, Yes, Yes |
| Outlier Diagnostics | Yes, Yes, Yes, Yes, Yes |
| Robust Estimation | Yes, Yes, Yes, Yes |
| Longitudinal (Panel) Data | Yes, Yes, Yes, Yes, Limited |
| Survival Analysis | Yes, Yes, Yes, Yes, Yes |
| Path Analysis | Yes, Yes, Yes |
| Propensity Score Matching | Yes, Yes, Limited, Limited |
| Stratified Samples (Survey Data) | Yes, Yes, Yes, Yes, Yes |
| Experimental Design | Yes, Yes |
| Quality Control | Yes, Yes, Yes, Yes |
| Reliability Theory | Yes, Yes, Yes, Yes, Yes |
| Univariate Time Series | Yes, Yes, Yes, Yes, Limited |
| Multivariate Time Series | Yes, Yes, Yes, Yes |
| Markov Chains | Yes, Yes |
| Hidden Markov Models | Yes, Yes |
| Stochastic Volatility Models | Yes, Yes, Limited, Limited, Limited |
| Diffusions | Yes, Yes |
| Counting Processes | Yes, Yes, Yes |
| Filtering | Yes, Yes, Limited, Limited |
| Instrumental Variables | Yes, Yes, Yes, Yes |
| Simultaneous Equations | Yes, Yes, Yes, Yes |
| Splines | Yes, Yes, Yes, Yes |
| Nonparametric Smoothing Methods | Yes, Yes, Yes, Yes |
| Extreme Value Theory | Yes, Yes |
| Variance Stabilization | Yes, Yes |
| Cluster Analysis | Yes, Yes, Yes, Yes, Yes |
| Neural Networks | Yes, Yes, Yes, Limited |
| Classification & Regression Trees | Yes, Yes, Yes, Limited |
| Boosting Classification & Regression Trees | Yes, Yes |
| Random Forests | Yes, Yes |
| Support Vector Machines | Yes, Yes, Yes |
| Signal Processing | Yes, Yes |
| Wavelet Analysis | Yes, Yes, Yes |
| ROC Curves | Yes, Yes, Yes, Yes, Yes |
| Optimization | Yes, Yes, Yes, Limited |

- Neelam Jha 1093 days ago
R Passes SPSS in Scholarly Use, Stata Growing Rapidly http://www.r-bloggers.com/r-passes-spss-in-scholarly-use-stata-growing-rapidly/

- Jitendra Narayan 956 days ago
A recent article in Nature explains it better: R is becoming the most popular language among biological researchers. http://www.nature.com/news/programming-tools-adventures-with-r-1.16609?

- Rahul Nayak 812 days ago
A nice comparison of R, SAS and Python: http://www.datasciencecentral.com/forum/topics/which-one-is-best-r-sas-or-python-for-data-science