Free Datasets
Many datasets are freely available online for research use. Some of them are listed below.
If you’d like to have some datasets added to the page, please feel free to send the links to me at yanchang(at)RDataMining.com. Thanks.
Geocoded National Address File (G-NAF)
more than 13 million Australian physical address records with geocodes
The GeoNames geographical database
covers all countries and contains over eight million place names, which can be used to find geocodes for countries, cities, suburbs, places and postcodes
Airport, airline and route data
6977 airports, 5888 airlines and 59036 routes spanning the globe
GDELT: Global Data on Events, Location and Tone
contains over 200 million geolocated events from 1979 to the present
The R Datasets Package:
see the list of datasets in the R datasets package with the statement below
> library(help="datasets")
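As a minimal sketch of how the bundled datasets might be browsed and loaded from code, the statements below use data() to get the catalogue as a data frame and then load one dataset. The variable names info and catalog are only illustrative, and iris is chosen simply because it ships with the datasets package.

```r
# List the datasets bundled with the 'datasets' package.
info <- data(package = "datasets")

# info$results is a character matrix; keep the name and title columns.
catalog <- as.data.frame(info$results[, c("Item", "Title")],
                         stringsAsFactors = FALSE)
head(catalog)

# Load one dataset by name and inspect its structure.
data("iris", package = "datasets")
str(iris)
```

Having the catalogue in a data frame makes it easy to search, for example with grepl() on the Title column.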
click-stream data, retail market basket data, traffic accident data and web HTML document data (large size!).
The website also provides implementations of many algorithms for frequent itemset and association rule mining.
the annual Data Mining and Knowledge Discovery competition organized by ACM SIGKDD, targeting real-world problems
an online repository of large data sets which encompasses a wide variety of data types, analysis tasks, and application areas
a collection of databases, domain theories, and data generators
spatial data
a collection of about 800 time series drawn from many different fields
A source of economic time series data from Inforum, at the University of Maryland
data for time series classification and clustering
Free GIS data at Geoscience Australia
Social network datasets:
Public datasets from governments