This page shows a basic exploration of the iris data set with R.
Check the dimensionality
> dim(iris)
[1] 150   5

Variable names or column names
> names(iris)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

Structure
> str(iris)
'data.frame':	150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

Attributes
> attributes(iris)
$names
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

$row.names
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
 [21]  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40
 [41]  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
 [61]  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80
 [81]  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
[101] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
[121] 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140
[141] 141 142 143 144 145 146 147 148 149 150

$class
[1] "data.frame"

Get the first 5 rows
> iris[1:5,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa

Get Sepal.Length of the first 10 rows
> iris[1:10, "Sepal.Length"]
 [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9

Same as above
> iris$Sepal.Length[1:10]
 [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9

Distribution of every variable
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500

Frequency
> table(iris$Species)

    setosa versicolor  virginica
        50         50         50

Pie chart
> pie(table(iris$Species))

Variance of Sepal.Length
> var(iris$Sepal.Length)
[1] 0.6856935

Covariance of two variables
> cov(iris$Sepal.Length, iris$Petal.Length)
[1] 1.274315

Correlation of two variables
> cor(iris$Sepal.Length, iris$Petal.Length)
[1] 0.8717538

Histogram
> hist(iris$Sepal.Length)

Density
> plot(density(iris$Sepal.Length))

Scatter plot
> plot(iris$Sepal.Length, iris$Sepal.Width)

Pair plot
> plot(iris)
or
> pairs(iris)

More examples on data exploration with R and other data mining techniques can be found in my book "R and Data Mining: Examples and Case Studies", which is downloadable as a .PDF file at the link.
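
The variance, covariance, and correlation commands on this page are closely related: the Pearson correlation is the covariance divided by the product of the two standard deviations. The short sketch below checks that relationship on the same two columns. It is only an illustration, not part of the original walkthrough; the variable names (x, y, r_manual, r_builtin, cor_matrix) are made up for this example, and it assumes nothing beyond base R, where iris is available by default.

```r
# iris ships with base R (package "datasets"), so no extra packages are needed.
x <- iris$Sepal.Length
y <- iris$Petal.Length

# Pearson correlation is the covariance rescaled by both standard deviations.
r_manual  <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y)

# Compare with a tolerance rather than ==, since these are floating-point values.
same <- isTRUE(all.equal(r_manual, r_builtin))
print(same)

# sd() is the square root of var(), which ties this back to var(iris$Sepal.Length).
print(isTRUE(all.equal(sd(x)^2, var(x))))

# Correlations between all four numeric columns at once.
# Species is a factor, so it is left out by selecting columns 1 to 4.
cor_matrix <- cor(iris[, 1:4])
print(round(cor_matrix, 3))
```

Passing the whole numeric part of the data frame to cor() gives the numeric counterpart of the pair plot: each cell of the matrix corresponds to one panel of pairs(iris).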




