Free Stanford online course on Statistical Learning (with R): covers linear and polynomial regression, logistic regression and
linear discriminant analysis; cross-validation and the bootstrap;
model selection and regularization methods (ridge and lasso);
nonlinear models, splines and generalized additive models; tree-based
methods, random forests and boosting; support-vector machines.
Introduction to Data Science: the lectures in week 3 give an excellent introduction to MapReduce and Hadoop, and show with examples how to use MapReduce for a variety of tasks.
Statistics and R for the Life Sciences: an introduction to the basic statistical concepts and R programming skills needed to analyze data in the life sciences.