Online Documents, Books and Tutorials
Advanced R, a book for R users who want to improve their programming skills and understanding of the language
R Tips: lots of tips for R programming
Handling and Processing Strings in R: an ebook in PDF format, 105 pages
The R Manuals, including an Introduction to R, R Language Definition, R Data Import/Export, and other R manuals
Lots of R Contributed Documents, including non-English documents
Slides on building R packages
R Cheat Sheets by RStudio, including
Shiny Cheat Sheet
Data Visualization Cheat Sheet
Package Development Cheat Sheet
Data Wrangling Cheat Sheet
R Markdown Cheat Sheet
R Markdown Reference Guide
Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach and Vipin Kumar
Lecture slides (in both PPT and PDF formats) and three sample chapters on classification, association and clustering are available at the above link.
Data Mining - Concepts and Techniques (3rd edition) by Jiawei Han, Micheline Kamber & Jian Pei
Lecture slides in PPT format are provided for 13 chapters.
Tutorial on Data Mining Algorithms by Ian Witten
Mining of Massive Datasets by Anand Rajaraman and Jeff Ullman
The whole book and lecture slides are free and downloadable in PDF format.
Lecture notes of data mining course by Cosma Shalizi at CMU
R code examples are provided in some lecture notes, and also in the homework solutions.
It covers information retrieval, page rank, image search, information theory, categorization, clustering, transformations, principal components, factor analysis, nonlinear dimensionality reduction, regression, classification and regression trees, support vector machines, density estimation, mixture models, causal inference, etc.
Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze at Stanford University
It covers text classification, clustering, web search, link analysis, etc. The book and lecture slides are free and downloadable in PDF format.
Statistical Data Mining Tutorials by Andrew Moore
Dozens of tutorial slides in PDF format
Decision Trees and Random Forest
It introduces various techniques at different levels of text processing, including word level, sentence level, document level and document-collection level. It covers stemming, stop words, document summarization, visualization, segmentation, categorization and clustering.
An introduction to text mining by Ian Witten
Slides for a tutorial on topic modeling by David M. Blei
Social Network Analysis and Graph Mining
Tools for large graph mining: structure and diffusion, a tutorial at WWW 2008
Graph-Based User Behavior Modeling: From Prediction to Fraud Detection, a tutorial at KDD 2015
Lecture Notes (slides in PDF) on association rules
Data Mining with R
Classification/Prediction with R
Time Series Analysis with R
Forecasting: Principles and Practice, a free online textbook on forecasting with R, by Prof. Rob J Hyndman
Association Rule Mining with R
Spatial Data Analysis with R
Spatial Regression Analysis in R - A Workbook, tutorials and worked examples using R and its package spdep for spatial regression analysis
Text Mining with R
Social Network Analysis with R
Slides on Large-scale network analysis (with package igraph)
Data Cleansing and Transformation with R
Case Studies with R
Experiences with using R in credit risk at ANZ bank, a presentation by Hong Ooi at Melbourne R user group
Big Data, MapReduce and Hadoop with R
Large Scale Distributed Data Science using Apache Spark, a tutorial at KDD 2015
Parallel Computing with R
State of the Art in Parallel Computing with R, an excellent overview and comparison of R packages for parallel computing, including packages for computer clusters, grid computing and multi-core systems
Some free online documents on R and data mining are listed above.