# Online Documents, Books and Tutorials

Computing for Data Analysis (with R): a free online course

YouTube playlists for the videos of the course: week 1, week 2, week 3 and week 4.

Data Analysis (with R): a free online course

Advanced R, a book for R users who want to improve their programming skills and understanding of the language

R Tips: lots of tips for R programming

Handling and Processing Strings in R: an ebook in PDF format, 105 pages

The R Manuals, including

*An Introduction to R*, *R Language Definition*, *R Data Import/Export*, and other R manuals

Using R for Data Analysis and Graphics - Introduction, Examples and Commentary

Lots of R Contributed Documents, including non-English documents

Slides on building R packages

R Cheat Sheets by RStudio, including

Shiny Cheat Sheet

Data Visualization Cheat Sheet

Package Development Cheat Sheet

Data Wrangling Cheat Sheet

R Markdown Cheat Sheet

R Markdown Reference Guide

### Data Mining

*Introduction to Data Mining* by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

Lecture slides (in both PPT and PDF formats) and three sample chapters on classification, association and clustering are available at the above link.

Data Mining - Concepts and Techniques (3rd edition) by Jiawei Han, Micheline Kamber & Jian Pei

Lecture slides in PPT format are provided for 13 chapters.

Tutorial on Data Mining Algorithms by Ian Witten

*Mining of Massive Datasets* by Anand Rajaraman and Jeff Ullman

The whole book and lecture slides are free and downloadable in PDF format.

Lecture notes of data mining course by Cosma Shalizi at CMU

R code examples are provided in some lecture notes, and also in the homework solutions.

It covers information retrieval, page rank, image search, information theory, categorization, clustering, transformations, principal components, factor analysis, nonlinear dimensionality reduction, regression, classification and regression trees, support vector machines, density estimation, mixture models, causal inference, etc.

Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze at Stanford University

It covers text classification, clustering, web search, link analysis, etc. The book and lecture slides are free and downloadable in PDF format.

Statistical Data Mining Tutorials by Andrew Moore

Dozens of tutorial slides in PDF format

Deep Learning

Decision Trees and Random Forest

Text Mining

It introduces various techniques at different levels of text processing, including word level, sentence level, document level and document-collection level. It covers stemming, stop words, document summarization, visualization, segmentation, categorization and clustering.

An introduction to text mining by Ian Witten

Slides for a tutorial on topic modeling by David M. Blei

Social Network Analysis and Graph Mining

Textbook on Introduction to social network methods

Tools for large graph mining: structure and diffusion, a tutorial at WWW 2008

Graph Mining: Laws, Generators and Tools

Graph-Based User Behavior Modeling: From Prediction to Fraud Detection, a tutorial at KDD 2015

Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach, a tutorial at KDD 2015

Association Rules

Lecture Notes (slides in PDF) on association rules

Outlier Detection

Sentiment Analysis

A Taste of Sentiment Analysis - 105 pages of slides in PDF format

Sentiment Analysis and Subjectivity, a book chapter by Bing Liu

MapReduce

Data-Intensive Text Processing with MapReduce - a book of 175 pages in PDF format

Lecture slides of a MapReduce course, part of the 2008 Independent Activities Period at MIT

The paper that first introduced MapReduce in 2004, showing how MapReduce works

### Data Mining with R

### Classification/Prediction with R

An Introduction to Recursive Partitioning Using the RPART Routines

Random forests for categorical dependent variables: an informal quick start R guide

### Time Series Analysis with R

Using R (with applications in Time Series Analysis)

Forecasting: Principles and Practice, a free online textbook on forecasting with R, by Prof. Rob J Hyndman

### Association Rule Mining with R

### Spatial Data Analysis with R

Spatial Regression Analysis in R - A Workbook, tutorials and worked examples using R and its package spdep for spatial regression analysis

### Text Mining with R

Getting Started with Latent Dirichlet Allocation using RTextTools + topicmodels

Text Mining Infrastructure in R

Text Mining Handbook (with R code examples)

### Social Network Analysis with R

Slides on Large-scale network analysis (with package igraph)

A detailed introduction to Social Network Analysis with package sna

Package visNetwork: nice interactive visualisation of graphs and networks

### Data Cleansing and Transformation with R

### Case Studies with R

Experiences with using R in credit risk at ANZ bank, a presentation by Hong Ooi at the Melbourne R user group

### Big Data, MapReduce and Hadoop with R

Large Scale Distributed Data Science using Apache Spark, a tutorial at KDD 2015

### Parallel Computing with R

State of the Art in Parallel Computing with R, which provides an excellent overview and comparison of R packages for parallel computing, including packages for computer clusters, grid computing and multi-core systems

Some free online documents on R and data mining are listed below.