
R and Data Mining Course

This is a short course on data mining with R. It consists of the 9 sessions listed below, plus an optional session on big data. Each session runs for 1.5 hours: a 1-hour tutorial followed by a 30-minute exercise.

Course Outline

Part 1 – R Programming, Data Transformation, Data Visualisation, Classification and Clustering
  • R Programming
    basics of R language and programming, parallel computing, and data import and export
  • Data Exploration and Visualisation
    summary statistics and various charts with base R
  • Data Transformation and Visualisation with Tidyverse
    data transformation with dplyr and tidyr, and data visualisation with ggplot2
  • Regression and Classification
    linear regression, logistic regression, decision trees and random forests
  • Data Clustering
    k-means clustering, k-medoids clustering, hierarchical clustering and density-based clustering
Part 2 – Time Series Analysis, Network Analysis, Association Rules and Text Mining
  • Time Series Analysis
    time series decomposition, forecasting, classification and clustering
  • Network Analysis and Graph Mining
    graph construction, graph query, centrality measures, and graph visualisation
  • Association Rule Mining and Sequence Mining
    mining and selecting interesting association rules, redundancy removal, rule visualisation, and sequential pattern mining
  • Text Mining
    text mining, word cloud, topic modelling, and sentiment analysis
  • Big Data (optional)
    Hadoop, Spark and R


Software and Course Materials

You will need to bring your own laptop if computers are not provided in the classroom. Please install the required software and R packages, and download the datasets, slides and scripts below, before coming to the course. Note that the slides are subject to change, so please download the latest version shortly before the course.


If you have any questions or feedback, please do not hesitate to contact me on yanchang <at> Thanks.

Yanchang Zhao,
Nov 4, 2019, 5:19 PM
Yanchang Zhao,
Dec 9, 2018, 4:21 AM
Yanchang Zhao,
Jul 2, 2019, 7:54 PM