This is a short course on data mining with R. It consists of the 9 sessions below. Each session lasts 1.5 hours, comprising a 1-hour tutorial and a 30-minute exercise.

You will need to bring your own laptop if computers are not provided in the classroom. Please install the required software and R packages, and download the datasets, slides and scripts listed below, before coming to the course.

### Course Outline

Part 1 – R Programming, Data Transformation, Data Visualisation, Classification and Clustering

- R Programming

  basics of R language and programming, parallel computing, and data import and export

- Data Exploration and Visualisation

  summary, stats and various charts with Base R

- Data Transformation and Visualisation with Tidyverse

  data transformation with dplyr and tidyr and data visualisation with ggplot2

- Regression and Classification

  linear regression and logistic regression, decision trees and random forest

- Data Clustering

  k-means clustering, k-medoids clustering, hierarchical clustering and density-based clustering

- Time Series Analysis

  time series decomposition, forecasting, classification and clustering

- Network Analysis and Graph Mining

  graph construction, graph query, centrality measures, and graph visualisation

- Association Rule Mining and Sequence Mining

  mining and selecting interesting association rules, redundancy removal, rule visualisation, and sequential pattern mining

- Text Mining

  text mining, word cloud, topic modelling, and sentiment analysis

- Big Data (optional)

  Hadoop, Spark and R

### Prerequisites

- Knowledge and experience of R, or similar programming languages
- Basic knowledge of data mining and machine learning, such as

### Software and Course Materials

*Note that the slides are subject to change, so please download the latest version shortly before the course days.*

- Software
  - R

    http://www.r-project.org/
  - RStudio (desktop edition)

    http://www.rstudio.com/products/rstudio/download/
- Course Materials
  - A ZIP archive [RDM-course.zip], containing all datasets, slides and scripts for this course.
- R Packages
  - Install the required R packages by running the R script provided in file “*Install-R-packages.R*” in folder “*code*” in the above ZIP archive.
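
The installation step can be run from an R session along the following lines. This is only a minimal sketch: it assumes the archive has been saved as `RDM-course.zip` in the current working directory, and the extraction folder name `RDM-course` is a hypothetical choice. Because the exact layout inside the archive is not assumed, the script is located by searching the extracted files.

```r
# Assumption: the archive was downloaded to the current working directory.
zip_file <- "RDM-course.zip"
# Hypothetical target folder for the extracted course materials.
out_dir <- "RDM-course"

if (file.exists(zip_file)) {
  # Extract datasets, slides and scripts.
  utils::unzip(zip_file, exdir = out_dir)

  # Search for the installer script anywhere under the extracted folder,
  # since the internal layout of the archive is not assumed here.
  script <- list.files(out_dir,
                       pattern = "^Install-R-packages\\.R$",
                       recursive = TRUE, full.names = TRUE)

  if (length(script) > 0) {
    # Run the script, which installs the required R packages.
    source(script[1])
  } else {
    message("Install-R-packages.R not found under ", out_dir)
  }
} else {
  message("Archive not found: ", zip_file)
}
```

Alternatively, open the script in RStudio and run it from there. Installing packages needs an internet connection and may take a while.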

If you have any questions or feedback, please do not hesitate to contact me at yanchang <at> RDataMining.com. Thanks.