Free Stanford online course on Statistical Learning (with R)
Covers linear and polynomial regression, logistic regression, and
linear discriminant analysis; cross-validation and the bootstrap;
model selection and regularization methods (ridge and lasso);
nonlinear models, splines, and generalized additive models; tree-based
methods, random forests, and boosting; and support vector machines.
Introduction to Data Science
The week 3 lectures give an excellent introduction to MapReduce and Hadoop and demonstrate, with examples, how to use MapReduce for various tasks.
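
For readers who have not seen MapReduce before, the following is a minimal sketch of the idea in plain Python, using word counting as the task. It is not taken from the course and does not use Hadoop; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. It runs on a single machine and only mimics the three stages that a real framework would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three stages in sequence on a list of documents (single process)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(docs))
```

The point of the split is that map_phase and reduce_phase each look at only one document or one key at a time, which is what would let a framework such as Hadoop run them in parallel on separate machines.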