Below is a list of big data platforms and their interfaces with R.


  • Hadoop - a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models; YARN is its resource-management layer, not a separate platform
  • RHadoop - a collection of five R packages that allow users to manage and analyze data with Hadoop, developed by Revolution Analytics
  • RHIPE - an R and Hadoop Integrated Programming Environment
  • Hortonworks HDP - the Hortonworks Data Platform, an open-source distribution of Hadoop
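
The "simple programming models" mentioned for Hadoop are most commonly MapReduce: a map step emits key/value pairs, the framework groups them by key, and a reduce step combines each group. The sketch below imitates those three phases in base R on a single machine, to show the shape of the functions that R interfaces such as RHadoop ask the user to write. It does not connect to Hadoop, and the input data and the names `map_fn` and `reduce_fn` are hypothetical, chosen only for illustration.

```r
# Toy word count in base R that mimics the map -> shuffle -> reduce phases.
# Everything runs in one R process; nothing here talks to a Hadoop cluster.

# Hypothetical input: each element stands in for one line of a large file.
lines <- c("big data with r", "r and hadoop", "big clusters")

# Map phase: emit one (word, 1) pair per word in a line.
map_fn <- function(line) {
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  lapply(words, function(w) list(key = w, value = 1L))
}

# Reduce phase: combine all values that share a key.
reduce_fn <- function(values) sum(values)

# Apply the mapper to every line and flatten the emitted pairs into one list.
pairs <- unlist(lapply(lines, map_fn), recursive = FALSE)

# Shuffle phase: group the emitted values by key.
keys <- vapply(pairs, function(p) p$key, character(1))
values <- vapply(pairs, function(p) p$value, integer(1))
grouped <- split(values, keys)

# Apply the reducer to each group, giving a named vector of word counts.
counts <- vapply(grouped, reduce_fn, integer(1))
print(counts)
```

On a real cluster the map and reduce functions would run in parallel on different nodes and the framework would handle the grouping step, so only the two user-supplied functions carry over from this sketch.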


  • Spark - a fast and general engine for large-scale data processing; its developers report speeds up to 100 times faster than Hadoop MapReduce for in-memory workloads
  • SparkR - an R frontend for Spark