A data set with more than one dimension is conceptually hard to store as a vector. For two-dimensional data sets the solution is to use matrices or data frames instead. As with vectors, all values in a matrix have to be of the same type (e.g. you cannot mix characters and numerics in the same matrix). For data frames this is not a requirement and different columns can have different modes, but all columns in a data frame must have the same number of entries. In addition to these, R also has objects named lists that can store any type of data and are not restricted by types or dimensions.
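To make the difference between the three structures concrete, here is a minimal sketch. The object names (m, d, lst) are made up for illustration and are not used elsewhere in this exercise.

```r
# A matrix holds a single type: mixing numbers and text coerces everything to character
m <- matrix(c(1, "a", 2, "b"), nrow = 2)
mode(m)            # "character"

# A data frame keeps one type per column, but all columns have the same length
d <- data.frame(id = 1:2, name = c("a", "b"), stringsAsFactors = FALSE)
sapply(d, class)   # id is "integer", name is "character"

# A list places no restriction on the type or size of its elements
lst <- list(1:3, "text", m, d)
length(lst)        # 4
```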
In this exercise you will learn how to:
The command to create a matrix in R is matrix(). As input it takes a vector of values, the number of rows and the number of columns.
X <- matrix(1:12, nrow = 4, ncol = 3)
X
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
Note that if you specify only the number of rows or only the number of columns, R infers the other dimension from the length of the vector. The default is to fill the matrix column-wise, so the first values from the vector end up in column 1 of the matrix. If you instead want to fill the matrix row by row, set the byrow argument to TRUE.
X <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)
X
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
## [4,] 10 11 12
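As a small sketch of how the missing dimension is inferred and how the two fill orders relate, the lines below use the same 1:12 vector. The names X.rows, X.cols and same.fill are only chosen for this illustration.

```r
# Only nrow is given: ncol is inferred as length(1:12) / 4 = 3
X.rows <- matrix(1:12, nrow = 4)
dim(X.rows)   # 4 3

# Only ncol is given: nrow is inferred as 12 / 6 = 2
X.cols <- matrix(1:12, ncol = 6)
dim(X.cols)   # 2 6

# Filling a 4 x 3 matrix row by row gives the transpose of a 3 x 4 matrix
# that was filled column by column
same.fill <- all(matrix(1:12, nrow = 4, byrow = TRUE) == t(matrix(1:12, nrow = 3)))
same.fill     # TRUE
```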
Subsetting a matrix is done the same way as for vectors, except that you have more than one dimension to work with, so you specify both the rows and the columns needed.
X[1,2]
## [1] 2
If one wants all values in a column or a row this can be specified by leaving the other dimension empty, hence this code will print all values in the second column.
X[,2]
## [1] 2 5 8 11
Note that if the retrieved part of a matrix can be represented as a vector (e.g. one of the dimensions has length 1), R will convert it to a vector; otherwise it will still be a matrix.
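If you would rather keep the matrix structure when extracting a single row or column, the drop argument of the subsetting operator can be set to FALSE. The sketch below uses a separate example matrix M (a name chosen here only for illustration) with the same content as X.

```r
M <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)

col.vec <- M[, 2]                 # one column: simplified to a vector
col.mat <- M[, 2, drop = FALSE]   # same values, kept as a 4 x 1 matrix

is.matrix(col.vec)   # FALSE
dim(col.vec)         # NULL
dim(col.mat)         # 4 1
dim(M[1:2, 2:3])     # 2 2, both dimensions are longer than 1, so still a matrix
```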
Create a matrix containing 1:12, similar to the matrix X above.
mode(X)
length(X)
## [1] "numeric"
## [1] 12
X[X>6]
## [1] 7 10 8 11 9 12
X[,c(3,2,1)]
## [,1] [,2] [,3]
## [1,] 3 2 1
## [2,] 6 5 4
## [3,] 9 8 7
## [4,] 12 11 10
X.2 <- rbind(X, rep(0, 3))
X.2
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
## [4,] 10 11 12
## [5,] 0 0 0
Replace all values in the first two columns with NA.
X[,1:2] <- NA
X
## [,1] [,2] [,3]
## [1,] NA NA 3
## [2,] NA NA 6
## [3,] NA NA 9
## [4,] NA NA 12
X[] <- 0
as.vector(X)
## [1] 0 0 0 0 0 0 0 0 0 0 0 0
The function outer(), which generates matrices, was mentioned. Try to generate the same vector as yesterday using this function instead. The outer() function is very powerful, but can be hard to wrap your head around, so try to follow the logic, perhaps by creating a simple example to start with.
letnum <- outer(paste("Geno",letters[1:19], sep = "_"), 1:3, paste, sep = "_")
class(letnum)
sort(as.vector(letnum))
## [1] "matrix" "array"
## [1] "Geno_a_1" "Geno_a_2" "Geno_a_3" "Geno_b_1" "Geno_b_2" "Geno_b_3"
## [7] "Geno_c_1" "Geno_c_2" "Geno_c_3" "Geno_d_1" "Geno_d_2" "Geno_d_3"
## [13] "Geno_e_1" "Geno_e_2" "Geno_e_3" "Geno_f_1" "Geno_f_2" "Geno_f_3"
## [19] "Geno_g_1" "Geno_g_2" "Geno_g_3" "Geno_h_1" "Geno_h_2" "Geno_h_3"
## [25] "Geno_i_1" "Geno_i_2" "Geno_i_3" "Geno_j_1" "Geno_j_2" "Geno_j_3"
## [31] "Geno_k_1" "Geno_k_2" "Geno_k_3" "Geno_l_1" "Geno_l_2" "Geno_l_3"
## [37] "Geno_m_1" "Geno_m_2" "Geno_m_3" "Geno_n_1" "Geno_n_2" "Geno_n_3"
## [43] "Geno_o_1" "Geno_o_2" "Geno_o_3" "Geno_p_1" "Geno_p_2" "Geno_p_3"
## [49] "Geno_q_1" "Geno_q_2" "Geno_q_3" "Geno_r_1" "Geno_r_2" "Geno_r_3"
## [55] "Geno_s_1" "Geno_s_2" "Geno_s_3"
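A simple example of the kind suggested above might look like the following sketch. The object names tab and lab are made up for the illustration.

```r
# outer() applies a function to every combination of elements from two vectors.
# The default function is multiplication, which gives a multiplication table.
tab <- outer(1:3, 1:4)
dim(tab)      # 3 4
tab[2, 3]     # 2 * 3 = 6

# Any vectorised function of two arguments can be used instead; extra
# arguments (here sep) are passed on to that function.
lab <- outer(c("a", "b"), 1:3, paste, sep = "_")
lab[2, 1]        # "b_1"
as.vector(lab)   # read column by column: "a_1" "b_1" "a_2" "b_2" "a_3" "b_3"
```

Element [i, j] of the result is always the function applied to the i-th value of the first vector and the j-th value of the second, which is why sorting is needed above to get the names in alphabetical order.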
A. A * B
B. A / B
C. A %x% B
D. A + B
E. A - B
F. A == B
A <- matrix(1:4, ncol = 2, nrow = 2)
B <- matrix(5:8, ncol = 2, nrow = 2)
A
B
A * B
A / B
A %x% B
A + B
A - B
A == B
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
## [,1] [,2]
## [1,] 5 7
## [2,] 6 8
## [,1] [,2]
## [1,] 5 21
## [2,] 12 32
## [,1] [,2]
## [1,] 0.2000000 0.4285714
## [2,] 0.3333333 0.5000000
## [,1] [,2] [,3] [,4]
## [1,] 5 7 15 21
## [2,] 6 8 18 24
## [3,] 10 14 20 28
## [4,] 12 16 24 32
## [,1] [,2]
## [1,] 6 10
## [2,] 8 12
## [,1] [,2]
## [1,] -4 -4
## [2,] -4 -4
## [,1] [,2]
## [1,] FALSE FALSE
## [2,] FALSE FALSE
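Of the operators above, %x% is the only one that does not work element by element. It computes the Kronecker product, and the sketch below (reusing the same A and B, with K as an illustrative name) checks how the 4 x 4 result is built from 2 x 2 blocks.

```r
A <- matrix(1:4, ncol = 2, nrow = 2)
B <- matrix(5:8, ncol = 2, nrow = 2)

K <- A %x% B
dim(K)                            # 4 4: (2 * 2) rows and (2 * 2) columns
all(K == kronecker(A, B))         # TRUE, %x% is shorthand for kronecker()

# Each 2 x 2 block of K is one element of A multiplied by the whole of B
all(K[1:2, 1:2] == A[1, 1] * B)   # TRUE, top-left block
all(K[3:4, 3:4] == A[2, 2] * B)   # TRUE, bottom-right block
```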
e <- rnorm(n = 100)
E <- matrix(e, nrow = 10, ncol = 10)
colnames(E) <- LETTERS[1:10]
rownames(E) <- colnames(E)
E.means <- rowMeans(E)
E.medians <- apply(E, MARGIN = 1, median)
E.mm <- rbind(E.means, E.medians)
E.mm
## A B C D E F
## E.means 0.2392437 0.1359768 -0.318607 -0.2821697 -0.13268083 -0.3527741
## E.medians -0.1147574 0.2144866 -0.242131 -0.3021635 -0.02672345 -0.4196171
## G H I J
## E.means -0.3133560 -0.5890435 0.0266760 0.007809547
## E.medians -0.2738144 -0.6997814 0.2964951 -0.173624840
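The MARGIN argument of apply() decides whether the function is applied over rows (1) or columns (2). A minimal sketch, assuming a small random matrix M and an arbitrary seed that are not part of the exercise:

```r
set.seed(1)                       # arbitrary seed, only to make the sketch reproducible
M <- matrix(rnorm(20), nrow = 4)  # hypothetical 4 x 5 example matrix

row.med <- apply(M, MARGIN = 1, median)  # one value per row
col.med <- apply(M, MARGIN = 2, median)  # one value per column
length(row.med)  # 4
length(col.med)  # 5

# rowMeans() is a shortcut for applying mean() over the rows
all.equal(rowMeans(M), apply(M, 1, mean))  # TRUE
```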
Even though vectors are at the very base of R usage, data frames are central to R, as the most common way to import data into R (read.table()) creates a data frame. Even though a data frame can itself contain another data frame, by far the most common data frames consist of a set of equally long vectors. As data frames can contain several different data types, the command str() is very useful to run on them.
vector1 <- 1:10
vector2 <- letters[1:10]
vector3 <- rnorm(10, sd = 10)
dfr <- data.frame(vector1, vector2, vector3)
str(dfr)
## 'data.frame': 10 obs. of 3 variables:
## $ vector1: int 1 2 3 4 5 6 7 8 9 10
## $ vector2: chr "a" "b" "c" "d" ...
## $ vector3: num 0.504 -3.111 4.443 -11.02 9.375 ...
In the above example, we can see that the data frame dfr contains 10 observations of three variables that all have different types: column 1 is an integer vector, column 2 a character vector and column 3 a numeric vector. Note that in R versions before 4.0.0, data.frame() converted character vectors to factors by default, so with an older version the second column would be a factor even though we just gave it a character vector.
Check the help for data.frame to find an argument that turns off the factor conversion.
dfr <- data.frame(vector1, vector2, vector3, stringsAsFactors = FALSE)
str(dfr)
## 'data.frame': 10 obs. of 3 variables:
## $ vector1: int 1 2 3 4 5 6 7 8 9 10
## $ vector2: chr "a" "b" "c" "d" ...
## $ vector3: num 0.504 -3.111 4.443 -11.02 9.375 ...
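Setting stringsAsFactors explicitly makes the result independent of the R version in use. The sketch below uses a made-up vector chars and the illustrative names df.chr and df.fac.

```r
chars <- c("a", "b", "a")   # hypothetical example vector

df.chr <- data.frame(x = chars, stringsAsFactors = FALSE)
df.fac <- data.frame(x = chars, stringsAsFactors = TRUE)

class(df.chr$x)    # "character"
class(df.fac$x)    # "factor"
levels(df.fac$x)   # "a" "b"
```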
dfr[,2:3]
dfr[,c("vector2", "vector3")]
## vector2 vector3
## 1 a 0.5044867
## 2 b -3.1112610
## 3 c 4.4430804
## 4 d -11.0200372
## 5 e 9.3749115
## 6 f -9.7971504
## 7 g -6.7159918
## 8 h -3.3661281
## 9 i -3.2607665
## 10 j 7.8692568
## vector2 vector3
## 1 a 0.5044867
## 2 b -3.1112610
## 3 c 4.4430804
## 4 d -11.0200372
## 5 e 9.3749115
## 6 f -9.7971504
## 7 g -6.7159918
## 8 h -3.3661281
## 9 i -3.2607665
## 10 j 7.8692568
dfr[dfr$vector3>0,2]
dfr$vector2[dfr$vector3>0]
## [1] "a" "c" "e" "j"
## [1] "a" "c" "e" "j"
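The two forms above return the same vector because a single column selected with [ , ] is simplified, just as for matrices. A small sketch with a made-up data frame d shows the variants side by side.

```r
d <- data.frame(id = 1:4, grp = c("a", "b", "a", "b"), val = c(2.5, -1, 0.3, 4),
                stringsAsFactors = FALSE)   # hypothetical example data

keep <- d$val > 0                  # logical vector: TRUE FALSE TRUE TRUE
d[keep, "grp"]                     # "a" "a" "b", a single column is returned as a vector
d$grp[keep]                        # same result
one.col <- d[keep, "grp", drop = FALSE]    # still a data frame with one column
two.col <- d[keep, c("id", "grp")]         # several columns: always a data frame
dim(one.col)   # 3 1
dim(two.col)   # 3 2
```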
paste(dfr$vector1, dfr$vector2, dfr$vector3, sep = "_")
## [1] "1_a_0.50448665078382" "2_b_-3.11126098244682" "3_c_4.44308038050025"
## [4] "4_d_-11.0200371807646" "5_e_9.37491149605183" "6_f_-9.79715041183984"
## [7] "7_g_-6.71599175633236" "8_h_-3.36612809265744" "9_i_-3.2607665137178"
## [10] "10_j_7.86925683848526"
Look at the built-in data set mtcars. How many rows and columns does it have?
dim(mtcars)
ncol(mtcars)
nrow(mtcars)
## [1] 32 11
## [1] 11
## [1] 32
car.names <- sample(row.names(mtcars))
random1 <- rnorm(length(car.names))
random2 <- rnorm(length(car.names))
mtcars2 <- data.frame(car.names, random1, random2)
mtcars2
## car.names random1 random2
## 1 Mazda RX4 0.53459312 0.46119208
## 2 Honda Civic -0.63517797 0.38909594
## 3 Mazda RX4 Wag -2.64318967 2.08763579
## 4 Pontiac Firebird 1.63522063 1.03393445
## 5 Volvo 142E -0.31238340 -0.19329538
## 6 Merc 280 0.35407135 -0.92960114
## 7 Merc 280C -1.47532809 0.47123238
## 8 Porsche 914-2 1.62734897 1.34271701
## 9 Chrysler Imperial 1.17438999 -1.68081962
## 10 Valiant 0.92249893 0.44603916
## 11 Cadillac Fleetwood 2.17916616 1.07810348
## 12 Fiat X1-9 -0.55393602 -1.29725411
## 13 Merc 450SE -1.53166747 1.15754997
## 14 Datsun 710 0.48651157 2.05527741
## 15 Maserati Bora 0.06466485 -0.85258263
## 16 AMC Javelin -0.01075047 0.56435896
## 17 Merc 450SL 1.16034984 -0.93292118
## 18 Fiat 128 -0.43842195 0.43170158
## 19 Camaro Z28 -1.10200938 -0.97997879
## 20 Dodge Challenger -0.50923490 -0.60382001
## 21 Lincoln Continental 0.80240229 -1.10110625
## 22 Hornet 4 Drive 0.87986209 -0.52873625
## 23 Duster 360 -0.97982140 0.16185770
## 24 Toyota Corolla -0.23863300 -0.40280791
## 25 Merc 230 1.64110298 0.42983730
## 26 Merc 450SLC -0.45183263 -0.07326107
## 27 Hornet Sportabout 1.34839913 0.40572716
## 28 Lotus Europa 0.04288574 0.58922798
## 29 Ford Pantera L -0.54416521 1.90327435
## 30 Merc 240D -0.73653460 -2.18088752
## 31 Toyota Corona -0.98152254 -1.72588278
## 32 Ferrari Dino -1.16427039 0.15284506
mt.merged <- merge(mtcars, mtcars2, by.x = "row.names", by.y = "car.names")
mt.merged
## Row.names mpg cyl disp hp drat wt qsec vs am gear carb
## 1 AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## 2 Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## 3 Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## 4 Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## 5 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## 6 Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## 7 Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## 8 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## 9 Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## 10 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## 11 Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## 12 Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## 13 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## 14 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## 15 Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## 16 Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## 17 Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## 18 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## 19 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## 20 Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## 21 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## 22 Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## 23 Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## 24 Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## 25 Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## 26 Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## 27 Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## 28 Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## 29 Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## 30 Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## 31 Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## 32 Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## random1 random2
## 1 -0.01075047 0.56435896
## 2 2.17916616 1.07810348
## 3 -1.10200938 -0.97997879
## 4 1.17438999 -1.68081962
## 5 0.48651157 2.05527741
## 6 -0.50923490 -0.60382001
## 7 -0.97982140 0.16185770
## 8 -1.16427039 0.15284506
## 9 -0.43842195 0.43170158
## 10 -0.55393602 -1.29725411
## 11 -0.54416521 1.90327435
## 12 -0.63517797 0.38909594
## 13 0.87986209 -0.52873625
## 14 1.34839913 0.40572716
## 15 0.80240229 -1.10110625
## 16 0.04288574 0.58922798
## 17 0.06466485 -0.85258263
## 18 0.53459312 0.46119208
## 19 -2.64318967 2.08763579
## 20 1.64110298 0.42983730
## 21 -0.73653460 -2.18088752
## 22 0.35407135 -0.92960114
## 23 -1.47532809 0.47123238
## 24 -1.53166747 1.15754997
## 25 1.16034984 -0.93292118
## 26 -0.45183263 -0.07326107
## 27 1.63522063 1.03393445
## 28 1.62734897 1.34271701
## 29 -0.23863300 -0.40280791
## 30 -0.98152254 -1.72588278
## 31 0.92249893 0.44603916
## 32 -0.31238340 -0.19329538
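To see what merge() does with row names on a scale that is easy to check by eye, here is a minimal sketch with two made-up data frames (left and right are hypothetical names). The rows are matched on the key regardless of their original order, and the result is sorted by the key.

```r
left <- data.frame(score = c(10, 20, 30), row.names = c("c", "a", "b"))
right <- data.frame(key = c("b", "c", "a"), label = c("B", "C", "A"),
                    stringsAsFactors = FALSE)

m <- merge(left, right, by.x = "row.names", by.y = "key")
names(m)                   # "Row.names" "score" "label"
as.character(m$Row.names)  # sorted by key: "a" "b" "c"
m$score                    # 20 30 10
m$label                    # "A" "B" "C"
```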
Calculate the mean of the two random columns using colMeans().
colMeans(mtcars2[, c("random1", "random2")])
## random1 random2
## 0.01701839 0.05245791
Try to modify this so that you get the mean by cylinder instead. Check out the function aggregate().
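# Use mt.merged so that random1 and cyl come from the same row (mtcars2 is in shuffled order)
aggregate(mt.merged$random1, by = list(mt.merged$cyl), FUN = mean)
## Group.1 x
## 1 4 -0.008978202
## 2 6 -0.370251809
## 3 8 0.231079388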
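aggregate() also has a formula interface, which keeps the grouping variable and the values in one data frame so that rows cannot get out of step. A minimal sketch on the built-in mtcars data (the name mpg.by.cyl is chosen only for this example):

```r
# Mean miles per gallon for each number of cylinders
mpg.by.cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
mpg.by.cyl
names(mpg.by.cyl)   # "cyl" "mpg"
nrow(mpg.by.cyl)    # 3, one row per cylinder group (4, 6 and 8)

# Several columns can be summarised at once with cbind() on the left-hand side
aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = median)
```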
The last data structure that we will explore is the list, which is a very flexible structure. Lists can combine different data structures, and these do not have to be of equal dimensions or meet other restrictions. The drawback of a flexible structure is that it requires a bit more work to interact with.
The syntax to create a list is similar to creation of the other data structures in R.
l <- list(1, 2, 3)
As with data frames, the str() command is very useful for the sometimes fairly complex list instances.
str(l)
## List of 3
## $ : num 1
## $ : num 2
## $ : num 3
This example, containing only numeric vectors, is not very exciting given the flexibility a list structure offers, so let’s create a more complex example.
vec1 <- letters
vec2 <- 1:4
mat1 <- matrix(1:100, nrow = 5)
df1 <- as.data.frame(cbind(10:1, 91:100))
u.2 <- list(vec1, vec2, mat1, df1, l)
As you can see a list can not only contain other data structures, but can also contain other lists.
Looking at the output of the str() command reveals many of the details of a list.
str(u.2)
## List of 5
## $ : chr [1:26] "a" "b" "c" "d" ...
## $ : int [1:4] 1 2 3 4
## $ : int [1:5, 1:20] 1 2 3 4 5 6 7 8 9 10 ...
## $ :'data.frame': 10 obs. of 2 variables:
## ..$ V1: int [1:10] 10 9 8 7 6 5 4 3 2 1
## ..$ V2: int [1:10] 91 92 93 94 95 96 97 98 99 100
## $ :List of 3
## ..$ : num 1
## ..$ : num 2
## ..$ : num 3
With this more complex object, subsetting is slightly trickier than with the more homogeneous objects we have looked at so far.
To look at the first entry of a list one can use the same syntax as for the simpler structures, but note that this will give you a list of length 1 irrespective of the actual type of data structure found.
u.2[1]
str(u.2[1])
## [[1]]
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
##
## List of 1
## $ : chr [1:26] "a" "b" "c" "d" ...
If one instead wants to extract the list entry as the structure that is stored, one needs to “dig” deeper in the object.
u.2[[1]]
str(u.2[[1]])
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
## chr [1:26] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" ...
This means that the syntax to extract a specific value from a data structure stored in a list can be daunting. Below we extract the second column of a dataframe stored at position 4 in the list u.2.
u.2[[4]][,2]
## [1] 91 92 93 94 95 96 97 98 99 100
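When the list elements have names, the same extraction can be written in several equivalent ways. The following sketch uses a small made-up list lst, not the u.2 object above.

```r
lst <- list(vec = letters[1:3], df = data.frame(V1 = 1:3, V2 = 4:6))  # hypothetical list

class(lst[1])         # "list", single brackets return a shorter list
class(lst[[1]])       # "character", double brackets return the element itself

# Three equivalent ways to reach column V2 of the data frame stored in the list
lst[[2]][, 2]         # 4 5 6
lst[["df"]][, "V2"]   # 4 5 6
lst$df$V2             # 4 5 6
```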
list.2 <- list(vec1 = c("hi", "ho", "merry", "christmas"),
vec2 = 4:19,
mat1 = matrix(as.character(100:81),nrow = 4))
list.2
## $vec1
## [1] "hi" "ho" "merry" "christmas"
##
## $vec2
## [1] 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
##
## $mat1
## [,1] [,2] [,3] [,4] [,5]
## [1,] "100" "96" "92" "88" "84"
## [2,] "99" "95" "91" "87" "83"
## [3,] "98" "94" "90" "86" "82"
## [4,] "97" "93" "89" "85" "81"
dfr <- data.frame(letters, LETTERS, letters == LETTERS)
Add this dataframe to the list created above.
list.2[[4]] <- dfr
list.2[-2]
## $vec1
## [1] "hi" "ho" "merry" "christmas"
##
## $mat1
## [,1] [,2] [,3] [,4] [,5]
## [1,] "100" "96" "92" "88" "84"
## [2,] "99" "95" "91" "87" "83"
## [3,] "98" "94" "90" "86" "82"
## [4,] "97" "93" "89" "85" "81"
##
## [[3]]
## letters LETTERS letters....LETTERS
## 1 a A FALSE
## 2 b B FALSE
## 3 c C FALSE
## 4 d D FALSE
## 5 e E FALSE
## 6 f F FALSE
## 7 g G FALSE
## 8 h H FALSE
## 9 i I FALSE
## 10 j J FALSE
## 11 k K FALSE
## 12 l L FALSE
## 13 m M FALSE
## 14 n N FALSE
## 15 o O FALSE
## 16 p P FALSE
## 17 q Q FALSE
## 18 r R FALSE
## 19 s S FALSE
## 20 t T FALSE
## 21 u U FALSE
## 22 v V FALSE
## 23 w W FALSE
## 24 x X FALSE
## 25 y Y FALSE
## 26 z Z FALSE
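Note that the data frame shows up as [[3]] rather than under a name, because it was added by position. A short sketch with a made-up list lst illustrates the difference between adding by position and by name, and between dropping an element from a copy and removing it from the list itself.

```r
lst <- list(vec1 = c("hi", "ho"), vec2 = 4:6)   # hypothetical list

lst[[3]] <- 1:2                       # added by position: the new element has an empty name
names(lst)                            # "vec1" "vec2" ""

lst[["dfr"]] <- data.frame(x = 1:2)   # added by name
names(lst)                            # "vec1" "vec2" "" "dfr"

names(lst[-2])                        # "vec1" "" "dfr", lst itself is unchanged
lst$vec2 <- NULL                      # assigning NULL removes the element from lst
length(lst)                           # 3
```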
vec1 <- rnorm(1000)
list.a <- split(vec1, 1:20)
length(list.a)
lapply(list.a, FUN = "length")
## [1] 20
## $`1`
## [1] 50
##
## $`2`
## [1] 50
##
## $`3`
## [1] 50
##
## $`4`
## [1] 50
##
## $`5`
## [1] 50
##
## $`6`
## [1] 50
##
## $`7`
## [1] 50
##
## $`8`
## [1] 50
##
## $`9`
## [1] 50
##
## $`10`
## [1] 50
##
## $`11`
## [1] 50
##
## $`12`
## [1] 50
##
## $`13`
## [1] 50
##
## $`14`
## [1] 50
##
## $`15`
## [1] 50
##
## $`16`
## [1] 50
##
## $`17`
## [1] 50
##
## $`18`
## [1] 50
##
## $`19`
## [1] 50
##
## $`20`
## [1] 50
Find out what lapply() and sapply() are and use both of them with the function summary on your newly created list. What are the pros and cons of the two approaches to calculate the same summary statistics?
lapply(X = list.a, FUN = "summary")
sapply(X = list.a, FUN = "summary")
## $`1`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.3683 -0.4782 0.0781 0.0595 0.6381 2.2660
##
## $`2`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.33868 -0.78550 0.06743 0.05640 0.79328 2.53786
##
## $`3`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.39560 -0.75838 0.08951 0.12089 1.00574 2.38010
##
## $`4`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.0775 -0.6552 -0.2771 -0.1205 0.4292 2.3249
##
## $`5`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.4736 -0.3782 0.1752 0.1435 0.6121 2.1695
##
## $`6`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.723138 -0.733417 -0.103525 0.006396 0.582746 2.629850
##
## $`7`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.3301 -0.4459 0.2215 0.1650 0.8269 1.9632
##
## $`8`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.38196 -0.94819 -0.04618 -0.05498 0.74134 2.18197
##
## $`9`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.4241 -0.4625 0.1993 0.2125 0.8014 2.4994
##
## $`10`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.82781 -0.63188 -0.14842 -0.00897 0.58044 2.24535
##
## $`11`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.4025 -0.6302 -0.1137 -0.1069 0.5011 1.9112
##
## $`12`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.54703 -0.48070 -0.06611 -0.08685 0.68491 2.19497
##
## $`13`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.53644 -0.43088 0.13923 0.07016 0.55103 2.28019
##
## $`14`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.04788 -0.75911 0.19366 0.04286 0.66156 2.43265
##
## $`15`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.15746 -0.64446 -0.06132 -0.10994 0.43457 2.16547
##
## $`16`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.45847 -0.80817 -0.03905 -0.08744 0.77012 1.69362
##
## $`17`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.03657 -0.74295 -0.03151 0.09679 1.01788 2.32435
##
## $`18`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.54407 -0.94034 -0.04604 -0.12183 0.50687 2.41017
##
## $`19`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.8847313 -0.7548314 0.1245368 -0.0001576 0.6521594 2.4379093
##
## $`20`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.9016 -0.3990 0.2114 0.1931 0.8370 2.3298
##
## 1 2 3 4 5 6
## Min. -2.36832685 -3.33867941 -2.39559577 -2.0775318 -1.4736126 -1.72313815
## 1st Qu. -0.47816773 -0.78550040 -0.75838124 -0.6552028 -0.3782123 -0.73341660
## Median 0.07810308 0.06742578 0.08951032 -0.2770848 0.1751734 -0.10352519
## Mean 0.05949669 0.05639814 0.12088745 -0.1205242 0.1434694 0.00639569
## 3rd Qu. 0.63806542 0.79327596 1.00573549 0.4292251 0.6120980 0.58274564
## Max. 2.26599240 2.53786161 2.38010432 2.3248596 2.1695467 2.62985036
## 7 8 9 10 11 12
## Min. -2.3301395 -2.38195570 -2.4240766 -1.827809637 -2.4024936 -3.54702552
## 1st Qu. -0.4459406 -0.94819488 -0.4625079 -0.631883953 -0.6302401 -0.48070287
## Median 0.2215094 -0.04618142 0.1992922 -0.148420853 -0.1136624 -0.06610789
## Mean 0.1650000 -0.05498193 0.2125476 -0.008969903 -0.1069425 -0.08685141
## 3rd Qu. 0.8269343 0.74134269 0.8013771 0.580438541 0.5010607 0.68490559
## Max. 1.9632437 2.18197140 2.4994430 2.245350987 1.9112174 2.19496835
## 13 14 15 16 17 18
## Min. -2.53643683 -2.04787804 -2.15745881 -2.45846908 -2.03656925 -2.54407140
## 1st Qu. -0.43088352 -0.75911269 -0.64446070 -0.80816910 -0.74294556 -0.94034254
## Median 0.13923281 0.19366192 -0.06131966 -0.03904625 -0.03150553 -0.04603601
## Mean 0.07015853 0.04286108 -0.10994191 -0.08744271 0.09679183 -0.12182987
## 3rd Qu. 0.55102923 0.66155737 0.43456757 0.77012010 1.01788287 0.50687222
## Max. 2.28019111 2.43264581 2.16547133 1.69361821 2.32434733 2.41017474
## 19 20
## Min. -1.8847312788 -1.9015610
## 1st Qu. -0.7548314230 -0.3990480
## Median 0.1245367867 0.2114276
## Mean -0.0001576233 0.1931489
## 3rd Qu. 0.6521594489 0.8369640
## Max. 2.4379092957 2.3298456
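The practical difference is the shape of the result: lapply() always returns a list, while sapply() simplifies to a matrix when it can, which makes it easy to pick out one statistic for one group. The sketch below uses a smaller made-up list (parts) and an arbitrary seed. vapply() is shown as a stricter alternative and is not part of the exercise.

```r
set.seed(1)                        # arbitrary seed for reproducibility
parts <- split(rnorm(40), 1:4)     # hypothetical list of 4 vectors of length 10

res.l <- lapply(parts, summary)    # always a list, one element per input
res.s <- sapply(parts, summary)    # simplified to a matrix when all results have equal length

class(res.l)           # "list"
dim(res.s)             # 6 4: six summary statistics for each of the four groups
res.s["Median", "2"]   # direct access by statistic and group name

# vapply() works like sapply() but checks the type and length of each result
res.v <- vapply(parts, function(v) as.numeric(summary(v)), numeric(6))
dim(res.v)             # 6 4
```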
Create this hypothetical S3 object in R.
Look at the built-in data set iris in R. Explore this data set and calculate some useful summary statistics, like SD, mean and median, for the parts of the data where this makes sense. Calculate the same statistics for any grouping that you can find in the data.
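One possible starting point, offered as a sketch rather than the intended solution, is to group by the Species column, which is the only factor in the data. The name iris.means is made up for this example.

```r
str(iris)   # 150 observations of 5 variables; Species is a factor with 3 levels

# Mean of every numeric column within each species
# (formula interface, where "." stands for all other columns)
iris.means <- aggregate(. ~ Species, data = iris, FUN = mean)
iris.means

# Standard deviation of one column per species, using tapply() as an alternative
tapply(iris$Sepal.Length, iris$Species, sd)
```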