Online Documents, Books and Tutorials
Computing for Data Analysis (with R): a free online course
YouTube playlists for the videos of the course: week 1, week 2, week 3 and week 4.
Data Analysis (with R): a free online course
Advanced R, a book for R users who want to improve their programming skills and understanding of the language
R Tips: lots of tips for R programming
Handling and Processing Strings in R: an ebook in PDF format, 105 pages
The R Manuals, including an Introduction to R, R Language Definition, R Data Import/Export, and other R manuals
Using R for Data Analysis and Graphics - Introduction, Examples and Commentary
Lots of R Contributed Documents, including non-English documents
Slides on building R packages
R Cheat Sheets by RStudio, including:
Shiny Cheat Sheet
Data Visualization Cheat Sheet
Package Development Cheat Sheet
Data Wrangling Cheat Sheet
R Markdown Cheat Sheet
R Markdown Reference Guide
Data Mining
Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach and Vipin Kumar
Lecture slides (in both PPT and PDF formats) and three sample chapters on classification, association and clustering are available at the above link.
Data Mining - Concepts and Techniques (3rd edition) by Jiawei Han, Micheline Kamber & Jian Pei
Lecture slides in PPT format are provided for 13 chapters.
Tutorial on Data Mining Algorithms by Ian Witten
Mining of Massive Datasets by Anand Rajaraman and Jeff Ullman
The whole book and lecture slides are free and downloadable in PDF format.
Lecture notes of data mining course by Cosma Shalizi at CMU
R code examples are provided in some lecture notes, and also in the homework solutions.
It covers information retrieval, PageRank, image search, information theory, categorization, clustering, transformations, principal components, factor analysis, nonlinear dimensionality reduction, regression, classification and regression trees, support vector machines, density estimation, mixture models, causal inference, etc.
Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze at Stanford University
It covers text classification, clustering, web search, link analysis, etc. The book and lecture slides are free and downloadable in PDF format.
Statistical Data Mining Tutorials by Andrew Moore
Dozens of tutorial slides in PDF format
Deep Learning
Decision Trees and Random Forest
Text Mining
It introduces various techniques at different levels of text processing, including word level, sentence level, document level and document-collection level. It covers stemming, stop words, document summarization, visualization, segmentation, categorization and clustering.
An introduction to text mining by Ian Witten
Slides for a tutorial on topic modeling by David M. Blei
Social Network Analysis and Graph Mining
Introduction to Social Network Methods, a textbook
Tools for large graph mining: structure and diffusion, a tutorial at WWW 2008
Graph Mining: Laws, Generators and Tools
Graph-Based User Behavior Modeling: From Prediction to Fraud Detection, a tutorial at KDD 2015
Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach, a tutorial at KDD 2015
Association Rules
Lecture Notes (slides in PDF) on association rules
Outlier Detection
Sentiment Analysis
A Taste of Sentiment Analysis - a 105-page slide deck in PDF format
Sentiment Analysis and Subjectivity, a book chapter by Bing Liu
MapReduce
Data-Intensive Text Processing with MapReduce - a book of 175 pages in PDF format
Lecture slides of a MapReduce course, part of the 2008 Independent Activities Period at MIT
The paper that first introduced MapReduce in 2004, showing how MapReduce works
Data Mining with R
Classification/Prediction with R
An Introduction to Recursive Partitioning Using the RPART Routines
Random forests for categorical dependent variables: an informal quick start R guide
Time Series Analysis with R
Using R (with applications in Time Series Analysis)
Forecasting: Principles and Practice, a free online textbook on forecasting with R, by Prof. Rob J Hyndman
Association Rule Mining with R
Spatial Data Analysis with R
Spatial Regression Analysis in R - A Workbook, tutorials and worked examples using R and its package spdep for spatial regression analysis
Text Mining with R
Getting Started with Latent Dirichlet Allocation using RTextTools + topicmodels
Text Mining Infrastructure in R
Text Mining Handbook (with R code examples)
Social Network Analysis with R
Slides on Large-scale network analysis (with package igraph)
A detailed introduction to Social Network Analysis with package sna
Package visNetwork: nice interactive visualisation of graphs and networks
Data Cleansing and Transformation with R
Case Studies with R
Experiences with using R in credit risk at ANZ bank, a presentation by Hong Ooi at the Melbourne R user group
Big Data, MapReduce and Hadoop with R
Large Scale Distributed Data Science using Apache Spark, a tutorial at KDD 2015
Parallel Computing with R
State of the Art in Parallel Computing with R, an overview and comparison of R packages for parallel computing, including packages for computer clusters, grid computing and multi-core systems
Some free online documents on R and data mining are listed below.