R and Data Mining Course
This is a short course on data mining with R. It consists of the 9 sessions listed below. Each session runs for 1.5 hours, including a 1-hour tutorial and a 30-minute exercise.
Part 1 – R Programming, Data Transformation, Data Visualisation, Classification and Clustering
R Programming
basics of the R language and programming, parallel computing, and data import and export
Data Exploration and Visualisation
summary, stats and various charts with Base R
Data Transformation and Visualisation with Tidyverse
data transformation with dplyr and tidyr, and data visualisation with ggplot2
Regression and Classification
linear regression and logistic regression, decision trees and random forest
Clustering
k-means clustering, k-medoids clustering, hierarchical clustering and density-based clustering
Part 2 – Time Series Analysis, Network Analysis, Association Rules and Text Mining
Time Series Analysis
time series decomposition, forecasting, classification and clustering
Network Analysis and Graph Mining
graph construction, graph query, centrality measures, and graph visualisation
Association Rule Mining and Sequence Mining
mining and selecting interesting association rules, redundancy removal, rule visualisation, and sequential pattern mining
Text Mining
text mining, word cloud, topic modelling, and sentiment analysis
Big Data (optional)
Hadoop, Spark and R
Prerequisites
Knowledge and experience of R or a similar programming language
Basic knowledge of data mining and machine learning, such as
Software and Course Materials
You will need to bring your own laptop if computers are not provided in the classroom. Please install the required software and R packages, and download the datasets, slides and scripts below, before coming to the course. Note that the slides are subject to change, so please download the latest version shortly before the course.
RStudio (desktop edition)
A ZIP archive [RDM-course.zip], containing all datasets, slides and scripts for this course.
Install the required R packages by running the R script “Install-R-packages.R”, found in the folder “code” of the above ZIP archive.
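As a minimal sketch, the install script could be run from an R or RStudio console as shown here. It assumes the ZIP archive has already been extracted and that the working directory is the extracted folder, so that "code" is a direct subfolder. The actual layout of the archive may differ, in which case the path needs adjusting.

```r
# Assumption: RDM-course.zip has been extracted and the R working directory
# is the extracted folder, so that "code" is a direct subfolder of it.
script <- file.path("code", "Install-R-packages.R")

if (file.exists(script)) {
  # Run the course's install script in the current R session
  source(script)
} else {
  # Nothing is installed here; only report where R looked for the script
  message("Script not found: ", script,
          " (current working directory: ", getwd(), ")")
}
```

If the script is not found, setting the working directory with setwd() to the extracted folder, or opening the script directly in RStudio and running it from there, should have the same effect.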
If you have any questions or feedback, please do not hesitate to contact me on yanchang <at> RDataMining.com. Thanks.