Seminar and Conference News

Tutorials on Machine Learning with Rattle and R, Melbourne, 1 June 2017

posted May 21, 2017, 4:46 PM by Yanchang Zhao   [ updated May 21, 2017, 4:46 PM ]

Dr. Graham Williams and I will run tutorials on Machine Learning with Rattle and R – Decision Trees, Ensemble Models, Association Rules, Text Mining and Social Network Analysis, at the Melbourne Data Science Week, 1 June 2017. Only 3 spots left.

See tutorial details at



Register at

AusDM 2017: submission deadline extended to 22 May

posted May 7, 2017, 4:49 PM by Yanchang Zhao   [ updated May 7, 2017, 4:50 PM ]

AusDM 2017 will be a special event this year, held in conjunction with IJCAI in Melbourne. This is a tremendous opportunity to present data mining research from Australia to a wider audience, with collaborative arrangements with IJCAI to invite wider participation.

 Submissions are required by 5pm Monday 22 May 2017. Visit for details.

Melbourne Data Science Week

posted Apr 20, 2017, 5:22 AM by Yanchang Zhao   [ updated Apr 20, 2017, 5:27 AM ]

Melbourne Data Science Week
29 May - 2 June 2017

Two sold-out events from 2016 are combining in 2017 to create what will hopefully be a great Data Science-palooza for Melbourne. Learn about applications, data, ideas and the latest tools for data science. Participate in panel sessions and break-time discussions with your colleagues from industry, academia and government. Hear from the datathon winners about how they did it.
For those who want hands-on data science training, there will be 8 full-day tutorials from Monday to Thursday.

I will run a tutorial on Machine Learning with R on 1 June, covering association rules, text mining and social network analysis. See details of the tutorial at

The tutorials are 80% full and will shortly sell out, so reserve your place now at

Seminar: Scalable Machine Learning for R, by Joseph Blue, Director of Global Data Science, MapR

posted Dec 10, 2016, 11:36 AM by Yanchang Zhao   [ updated Dec 10, 2016, 11:38 AM ]

Title: Scalable Machine Learning for R with MapR
Speaker: Joseph Blue, Director of Global Data Science for MapR
Time and Date: 4:30-6:30pm, Thursday, 15 December 2016
Location: 6C35 (building 6, room C35), University of Canberra

In this discussion, we will review some of the popular libraries that allow R users to interact with large-scale and unstructured data sets through the use of a distributed environment. After a brief intro to deep learning concepts, a demo will be shown that leverages H2O from R to detect anomalies in recent Australian stock prices. The RMarkdown notebook and other resources will be provided to attendees. Additionally, any questions from the audience involving the use of R in distributed environments can be addressed.

Joe is the Director of Global Data Science for MapR, where for the past 3 years he has focused on collaborating with customers to explore large, unstructured data sources and derive real business value. He has deployed solutions in financial, healthcare, advertising, manufacturing, retail and media markets.

Prior to joining MapR, Joe developed predictive models in health care for Optum (a division of UnitedHealth) as Chief Scientist. He was the first Fellow for Optum's start-up, Optum Labs, and has several patents pending.

Before his time at Optum, Joe accumulated over 10 years of data science experience at Fair Isaac, LexisNexis, HNC Software and ID Analytics (now LifeLock), specializing in business problems such as fraud and anomaly detection, which yielded a patent for identity theft detection.

Seminar: Social Network 2.0 – from sharing experiences to sharing values, Prof Xue Li, Canberra, Tuesday 6 Dec

posted Nov 21, 2016, 4:11 AM by Yanchang Zhao   [ updated Nov 21, 2016, 4:12 AM ]

Topic: Social Network 2.0 – from sharing experiences to sharing values
Speaker: Prof. Xue Li, University of Queensland
Date and time: 4:30-5:30pm, Tuesday 6 Dec 2016
Location: 6C34 (Building 6, room C34), University of Canberra

In current social networks, people share their information, feelings, and experiences with or without their true online identities. Over the last 10 years, we have experienced many problems such as spamming, cyberbullying, and the misuse of social media to cause problems such as the London riots. One key question is whether we are comfortable associating our true identity with our cyber identity. In social-network-connected services such as Uber and Airbnb, people use their traceable identities in cyberspace, where all transactions are protected by insurance companies or by law. So the question is: if people are willing to use their true identity online in order to share services and values, how can we make social networks a trustworthy place? In this talk, we discuss the concept of Social Network 2.0 as an emerging type of social network where people are willing to use their true identities to come together and share services and values. We will present our understanding of the problems facing this new generation of social networks and some of our research initiatives.

About the speaker:
Dr Xue Li is a Professor in the DKE (Data and Knowledge Engineering) Division, School of Information Technology, the University of Queensland in Australia. He obtained a BSc in Computer Science from Chongqing University, China in 1982, an MSc from the University of Queensland in 1989, and a PhD in Information Systems from QUT in 1997. His research interests are in data mining, intelligent information systems, and social computing. He has over 160 publications, spanning monographs, edited books, book chapters, and journal and conference papers. He was recognized as one of the Top-50 “Most Powerful People in Australia” in 2015 by the Australian Financial Review. He has successfully supervised 17 PhD candidates to completion as their Principal Supervisor. He is currently a Chief Investigator for three ARC (Australian Research Council) funded projects. He is an Associate Editor of the Journal of Advanced Internet of Things.

Seminar: Exploring causal relationships in observational data, Prof. Jiuyong Li, Canberra, 4:15pm Wed 16 Nov 2016

posted Nov 1, 2016, 5:45 PM by Yanchang Zhao   [ updated Nov 1, 2016, 5:46 PM ]

Topic: Exploring causal relationships in observational data
Speaker: Prof. Jiuyong Li, University of South Australia
Date and time: 4:15-5:15pm, Wednesday 16 Nov 2016
Location: Visitor Centre theatre, AIS, Bruce, Canberra

Association analysis is an important technology in data mining, and has been widely used in many application areas. However, associations in data can be spurious, and they do not indicate the cause-effect relationships that are the ultimate goals of many scientific explorations and social studies. As techniques for association discovery have matured, the problem of identifying non-spurious associations has become prominent. In this talk, I will discuss some current methods for causal relationship discovery and our research work in this direction.

About the speaker:
Dr Jiuyong Li is a Professor and an Associate Head of School at the School of Information Technology and Mathematical Sciences of the University of South Australia. He leads the Data Analytics Group in the School. His main research interests are in data mining, bioinformatics, and data privacy. He has led five Australian Research Council Discovery projects and leads a Data to Decision CRC project. He has published more than 100 papers, mostly in leading journals and conferences in these areas. His software tools have been used in several real-world projects. He has been a chair (or a PC chair) of multiple Australasian data mining and artificial intelligence conferences and international causal discovery workshops. He has received senior visiting fellowships from the Nokia Foundation, the Australian Academy of Science, and the Japan Society for the Promotion of Science.

Free Short Course on R and Data Mining, University of Canberra, Fri 7 Oct 2016

posted Sep 20, 2016, 4:04 AM by Yanchang Zhao   [ updated Sep 22, 2016, 4:55 AM ]

Short Course on R and Data Mining

Information Technology and Engineering, University of Canberra

Target Audience: IT&E Project students (ICTP students, Technology Project students, ISES students), IT&E HDR students and Staff, members of the Canberra Data Scientists Group

Fees: There are no fees for the short course, but seats are limited to 60, so register early through

Presenters: Dr Yanchang Zhao (Adjunct Professor, UC), Professor Dharmendra Sharma

Time: 9:30am – 12:30pm, Fri 7 Oct 2016

Room: 2B7 (Building 2, room B7, University of Canberra)

Map and Parking:

Course Outline:

The course will cover R programming, data exploration and visualisation, and data mining with R. It will cover the four topics below in two sessions. Each 1.5-hour session will consist of presentations on two topics, followed by a lab in which students do exercises.

- R Programming and Data Exploration and Visualisation with R

- Regression and Classification with R

- Association Rule Mining with R

- Text Mining with R -- an Analysis of Twitter Data

Instructions, prerequisites and slides for the course are or will be available at

Seminar: Entity Extraction from Text Documents, University of Canberra, Tuesday 11 Oct 2016

posted Sep 6, 2016, 6:28 AM by Yanchang Zhao   [ updated Oct 5, 2016, 4:04 AM ]

Sorry, this seminar has been cancelled.

Topic: Entity Extraction from Text Documents
Speaker: Dr. Patrick Sun
Time and date: 4:30-6:30pm, Tuesday 11 Oct 2016
Location: 2B2 (building 2, room B2), University of Canberra

Identity search and bulk matching are important capabilities for law enforcement agencies. In order to search and bulk match the identities mentioned in text documents, entities such as names, dates of birth, addresses and phone numbers must be extracted from the text and stored in a structured format for Identity Resolution tools. In this presentation, we will briefly introduce our entity extraction framework and evaluation process, and present the evaluation results.

Seminar: Detecting Persistent Threats using Sequence Statistics, Canberra, 5 Sept 2016

posted Aug 29, 2016, 4:04 AM by Yanchang Zhao   [ updated Aug 29, 2016, 4:05 AM ]

Topic: Detecting Persistent Threats using Sequence Statistics

Speaker: Dr. Ted Dunning, Chief Application Architect at MapR

Time and date: 4:30-6:30pm, Monday 5 September 2016

Location: 40 Cameron Avenue in Belconnen at HP Enterprise



In a persistent threat, the attacker often penetrates a system but exploits the information captured there elsewhere, at a throttled rate, to avoid detection. In some cases, the attacker even takes measures to protect the penetrated system from other attackers to avoid the detailed inspection that often accompanies the detection of a compromise. I will describe one particular kind of situation in which a single point of compromise is used to extract consumer financial information that is then used elsewhere to commit fraud. This kind of attack can be difficult to detect and hard to trace. In fact, however, detailed examination of transaction histories across thousands to millions of accounts can provide a very sensitive indicator of such activity and can often pinpoint the original point of compromise. The detection technique that I will describe has very broad applicability across many problems that involve sequences of symbols and has produced state-of-the-art results in genomics, fraud detection, text analysis, retail recommendations and predicting attrition and profitability. The specific case that I describe in this talk is also interesting since the technique was initially developed using synthetic data which emulated real data closely enough that a fraud ring was detected the first time out.

About the speaker:

Ted Dunning is Chief Application Architect at MapR Technologies and a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects. Ted has been very active in mentoring new Apache projects and is currently serving as vice president of incubation for the Apache Software Foundation. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems. He built fraud detection systems for ID Analytics (LifeLock) and he has 24 patents issued to date and a dozen pending. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.


Pizzas and drinks will be sponsored by MapR. The venue will be sponsored by HP Enterprise.

CFP: AusDM 2016 paper submission extended to 2 Sept

posted Aug 19, 2016, 5:40 AM by Yanchang Zhao   [ updated Aug 19, 2016, 5:40 AM ]

14th Australasian Data Mining Conference (AusDM 2016)
Canberra, Australia,
6-8 December 2016
Join us on LinkedIn:

The Australasian Data Mining Conference has established itself as the premier Australasian meeting for both practitioners and researchers in data mining. AusDM'16 seeks to showcase: Research Prototypes; Industry Case Studies; Practical Analytics Technology; and Research Student Projects.

Publication and topics
We are calling for papers, both research and applications, and from both academia and industry, for presentation at the conference. Accepted papers will be published in an upcoming volume (Data Mining and Analytics 2016) of the Conferences in Research and Practice in Information Technology (CRPIT) series by the Australian Computer Society, which is also held in full text on the ACM Digital Library. AusDM invites contributions addressing current research in data mining and knowledge discovery as well as experiences, novel applications and future challenges.

Submission of papers
- Academic submissions: Regular academic submissions can be made in the Research Track, reporting on research progress, with a paper length of between 8 and 12 pages in CRPIT style.
- Industry submissions: Submissions can be made in the Application Track to report on specific data mining implementations and experiences in government and industry projects. Submissions in this category can be between 4 and 8 pages in CRPIT style.
- Industry Showcase submissions: Submissions from industry and government on an analytics solution that has raised profits, reduced costs and/or achieved other important policy and/or business outcomes can be made in this track with a one-page abstract only.

Online submission system

Important Dates
Paper Submission: extended to 6pm, Friday 2 Sept 2016, Australian Eastern Standard Time (AEST)
Authors Notified: Monday 24 October 2016
Camera Ready Submission: Monday 7 November 2016
Conference Dates: 6-8 December 2016
