    Free Datasets


    Many datasets are freely available online for research use. Some of them are listed below.

    To suggest a dataset for this page, please send the link to me at yanchang(at)RDataMining.com. Thanks.
    • The GeoNames geographical database:
      covers all countries and contains over eight million place names, which can be used to look up geocodes for countries, cities, suburbs, places and postcodes.
    • The R Datasets Package:
      list the data sets in R's datasets package with the statement below:
      > library(help="datasets")
    • Frequent Itemset Mining Dataset Repository:
      click-stream data, retail market basket data, traffic accident data and web HTML document data (large files).
      The website also provides implementations of many algorithms for frequent itemset and association rule mining.
    • ACM KDD Cup:
      the annual Data Mining and Knowledge Discovery competition organized by ACM SIGKDD, targeting real-world problems.
    • UCI KDD Archive:
      an online repository of large data sets that covers a wide variety of data types, analysis tasks, and application areas.
    • EconData:
      a source of economic time series data from Inforum, at the University of Maryland.