In programming languages, loop structures, with or without conditions, are used to repeat commands over multiple entities. For and while loops as well as if-else statements are also used in R, but less often than in many other programming languages, because many tasks that would otherwise need a loop are handled by vectorization or by the apply functions.
This means that we can multiply all values in a vector in R by two by calling
vec.a <- c(1, 2, 3, 4)
vec.a * 2
## [1] 2 4 6 8
In R, as in many other languages, you can also do this with a loop instead
for (i in seq_along(vec.a)) {
  vec.a[i] <- vec.a[i] * 2
}
vec.a
## [1] 2 4 6 8
This is far less efficient and no easier to type, so we tend to avoid loops when possible.
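Since apply-style functions are mentioned above as an alternative to loops, here is a minimal sketch of the same doubling done three ways. The object names (vec.demo, res.vec, res.loop, res.apply) are illustrative and not part of the exercise.

```r
vec.demo <- c(1, 2, 3, 4)

# Vectorized: the operation is applied to every element at once
res.vec <- vec.demo * 2

# Explicit loop over the indices, writing into a preallocated vector
res.loop <- numeric(length(vec.demo))
for (i in seq_along(vec.demo)) {
  res.loop[i] <- vec.demo[i] * 2
}

# apply-style: vapply() calls the function once per element
res.apply <- vapply(vec.demo, function(v) v * 2, numeric(1))

identical(res.vec, res.loop)   # TRUE
identical(res.vec, res.apply)  # TRUE
```

All three give the same values; the vectorized form is simply the shortest to write.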
Let us compare the execution time of the vectorized version (a vector with 1,000,000 elements):
vec <- c(1:1e6)
ptm <- proc.time()
vec <- vec + 1
proc.time() - ptm # vectorized
## user system elapsed
## 0.001 0.000 0.002
to the loop version:
vec <- c(1:1e6)
ptm <- proc.time()
for (i in seq_along(vec)) {
  vec[i] <- vec[i] + 1
}
proc.time() - ptm # for-loop
## user system elapsed
## 0.059 0.000 0.059
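The same comparison can be sketched with system.time(), which wraps the proc.time() bookkeeping in a single call. The object names below are illustrative, and timings depend on the machine, so none are shown here.

```r
n <- 1e6
vec.demo <- 1:n

# Time the vectorized addition
t.vec <- system.time(res.vec <- vec.demo + 1)

# Time the equivalent loop, writing into a preallocated vector
res.loop <- numeric(n)
t.loop <- system.time(
  for (i in seq_along(vec.demo)) {
    res.loop[i] <- vec.demo[i] + 1
  }
)

identical(res.vec, res.loop)  # TRUE: both approaches give the same values
t.vec["elapsed"]
t.loop["elapsed"]
```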
After this exercise you should know:
Calculate the row sums of a matrix with a for loop and compare the result with the apply() function as well as with the built-in rowSums() function. These functions were discussed in the lecture Elements of the programming language - part 2.
X <- matrix(1:1000000, nrow = 100000, ncol = 10)
for.sum <- vector()
# Note that this loop is much faster if you create an empty vector of the right size before the loop:
# for.sum <- vector('integer', 100000)
for (i in seq_len(nrow(X))) {
  for.sum[i] <- sum(X[i, ])
}
head(for.sum)
## [1] 4500010 4500020 4500030 4500040 4500050 4500060
app.sum <- apply(X, MARGIN = 1, sum)
head(app.sum)
## [1] 4500010 4500020 4500030 4500040 4500050 4500060
rowSums.sum <- rowSums(X)
head(rowSums.sum)
## [1] 4500010 4500020 4500030 4500040 4500050 4500060
identical(for.sum, app.sum)
## [1] TRUE
identical(for.sum, rowSums.sum)
## [1] FALSE
identical(for.sum, as.integer(rowSums.sum))
## [1] TRUE
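The FALSE above comes from a difference in storage type rather than in value: sum() over integers returns an integer, while rowSums() returns doubles. A minimal sketch on a small matrix (X.small and the other names are illustrative):

```r
X.small <- matrix(1:20, nrow = 4, ncol = 5)

# Row sums with a loop over a preallocated integer vector
loop.sum <- vector("integer", nrow(X.small))
for (i in seq_len(nrow(X.small))) {
  loop.sum[i] <- sum(X.small[i, ])
}
rs <- rowSums(X.small)

typeof(loop.sum)          # "integer"
typeof(rs)                # "double"
identical(loop.sum, rs)   # FALSE: same values, different types
all.equal(loop.sum, rs)   # TRUE: compares values with a tolerance
```

identical() is strict about type, so all.equal() or an explicit conversion such as as.integer() is the usual way to compare values only.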
x <- 1
while.sum <- vector("integer", 100000)
while (x <= nrow(X)) {
  while.sum[x] <- sum(X[x, ])
  x <- x + 1
}
head(while.sum)
## [1] 4500010 4500020 4500030 4500040 4500050 4500060
For character columns, the number of characters can be counted with the nchar function.
vector1 <- 1:10
vector2 <- c("Odd", "Loop", letters[1:8])
vector3 <- rnorm(10, sd = 10)
dfr1 <- data.frame(vector1, vector2, vector3, stringsAsFactors = FALSE)
sum.vec <- vector()
for (i in 1:ncol(dfr1)) {
  if (is.numeric(dfr1[, i])) {
    sum.vec[i] <- sum(dfr1[, i])
  }
  if (is.character(dfr1[, i])) {
    sum.vec[i] <- sum(nchar(dfr1[, i]))
  }
}
sum.vec
## [1] 55.000000 15.000000 4.655531
sum.vec <- vector()
for (i in 1:ncol(dfr1)) {
  if (is.numeric(dfr1[, i])) {
    sum.vec[i] <- sum(dfr1[, i])
  } else {
    sum.vec[i] <- sum(nchar(dfr1[, i]))
  }
}
sum.vec
## [1] 55.000000 15.000000 4.655531
dfr.info <- function(dfr) {
  sum.vec <- vector()
  for (i in 1:ncol(dfr)) {
    if (is.numeric(dfr[, i])) {
      sum.vec[i] <- sum(dfr[, i])
    } else {
      sum.vec[i] <- sum(nchar(dfr[, i]))
    }
  }
  sum.vec
}
# Execute the function
dfr.info(dfr1)
## [1] 55.000000 15.000000 4.655531
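As a sketch of how an apply-style function could replace the loop inside dfr.info(), the version below uses vapply() over the columns. The names dfr.info.apply and dfr.demo are hypothetical, and the demo data uses fixed values instead of rnorm() so the result is reproducible.

```r
dfr.demo <- data.frame(
  num = 1:10,
  chr = c("Odd", "Loop", letters[1:8]),
  dbl = seq(0.5, 5, by = 0.5),
  stringsAsFactors = FALSE
)

# Hypothetical apply-style variant of dfr.info()
dfr.info.apply <- function(dfr) {
  vapply(dfr, function(column) {
    if (is.numeric(column)) {
      as.numeric(sum(column))         # sum of a numeric column
    } else {
      as.numeric(sum(nchar(column)))  # total number of characters
    }
  }, numeric(1))
}

res <- dfr.info.apply(dfr.demo)
res  # named vector: num = 55, chr = 15, dbl = 27.5
```

Unlike the loop version, the result keeps the column names, which makes the output easier to read.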
Report the number of TRUEs when the column is logical and the total number of characters if it is a character vector. To count characters, you can use the nchar function. To count TRUE values, you can use the sum function.
vector1 <- 1:10
vector2 <- c("Odd", "Loop", letters[1:8])
vector3 <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
dfr2 <- data.frame(vector1, vector2, vector3, stringsAsFactors = FALSE)
sum.vec <- vector()
for (i in 1:ncol(dfr2)) {
  if (is.numeric(dfr2[, i])) {
    sum.vec[i] <- sum(dfr2[, i])
  } else if (is.logical(dfr2[, i])) {
    sum.vec[i] <- sum(dfr2[, i])
  } else {
    sum.vec[i] <- sum(nchar(dfr2[, i]))
  }
}
sum.vec
## [1] 55 15 6
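Because the numeric and logical branches do the same thing, they could also be merged with the || operator. This is only one possible variant; dfr.demo is an illustrative data frame, not the one from the exercise.

```r
dfr.demo <- data.frame(
  num = 1:10,
  chr = c("Odd", "Loop", letters[1:8]),
  lgl = rep(c(TRUE, FALSE), 5),
  stringsAsFactors = FALSE
)

sum.vec <- vector("numeric", ncol(dfr.demo))
for (i in seq_len(ncol(dfr.demo))) {
  column <- dfr.demo[[i]]
  if (is.numeric(column) || is.logical(column)) {
    # sum() adds numbers and counts TRUE values
    sum.vec[i] <- sum(column)
  } else {
    # total number of characters in a character column
    sum.vec[i] <- sum(nchar(column))
  }
}
sum.vec  # 55 15 5
```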
You can look up the stop() function by typing ?stop in the R console.
hours_to_mins <- function(hours) {
  if (hours < 0) {
    stop("Hours cannot be negative")
  }
  minutes <- hours * 60
  return(minutes)
}
hours_to_mins(3.2)
## [1] 192
# Now test it with a negative hour value
hours_to_mins(-3.26)
## Error in hours_to_mins(-3.26): Hours cannot be negative
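An error raised by stop() halts execution unless it is caught. As a minimal sketch, tryCatch() can intercept the error and return a fallback value. The wrapper safe_hours_to_mins is a hypothetical name, and returning NA is just one possible choice.

```r
hours_to_mins <- function(hours) {
  if (hours < 0) {
    stop("Hours cannot be negative")
  }
  hours * 60
}

# Hypothetical wrapper: report the error message and return NA instead of failing
safe_hours_to_mins <- function(hours) {
  tryCatch(
    hours_to_mins(hours),
    error = function(e) {
      message("Caught: ", conditionMessage(e))
      NA_real_
    }
  )
}

ok <- safe_hours_to_mins(3.2)     # 192
bad <- safe_hours_to_mins(-3.26)  # message "Caught: Hours cannot be negative", value NA
```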
Do you want to expand on loops, if-else clauses, and functions? Here is a bit of extra material!
If-else clauses operate on logical values. What if we want to make decisions based on non-logical values? If-else will still work by evaluating a number of comparisons, but we can also use switch:
switch.demo <- function(x) {
  switch(class(x),
    logical = cat('logical\n'),
    numeric = cat('Numeric\n'),
    factor = cat('Factor\n'),
    cat('Undefined\n')
  )
}
switch.demo(x=TRUE)
switch.demo(x=15)
switch.demo(x=factor('a'))
switch.demo(data.frame())
## logical
## Numeric
## Factor
## Undefined
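With character input, switch also lets several values share one result: an alternative with an empty right-hand side falls through to the next non-empty one. A small sketch, where type.group is a hypothetical helper name:

```r
type.group <- function(x) {
  switch(class(x)[1],
    integer = ,            # empty: falls through to the next alternative
    numeric = "number",
    character = "text",
    logical = "logical",
    "other"                # unnamed last argument is the default
  )
}

type.group(15)            # "number"
type.group(3L)            # "number"
type.group("a")           # "text"
type.group(TRUE)          # "logical"
type.group(data.frame())  # "other"
```

Using class(x)[1] keeps the call safe for objects that carry more than one class.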
What if the authors of, e.g., a plot.something wrapper forgot about the ... argument?
my.plot <- function(x, y) { # Passing downstream
  plot(x, y, las=1, cex.axis=.8, ...)
}
formals(my.plot) <- c(formals(my.plot), alist(... = ))
my.plot(1, 1, col='red', pch=19)
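The same trick can be tried without a graphics device. In this sketch, my.paste is a hypothetical wrapper around paste() that uses ... in its body but does not declare it, so the extra argument only works after the formals are patched.

```r
# Hypothetical wrapper that forgot to declare ... in its argument list
my.paste <- function(x, y) {
  paste(x, y, ...)
}

# Append ... to the formal arguments of the existing function
formals(my.paste) <- c(formals(my.paste), alist(... = ))

names(formals(my.paste))                # "x" "y" "..."
res <- my.paste("a", "b", sep = "-")    # sep is passed on to paste()
res                                     # "a-b"
```

When you write the wrapper yourself, declaring ... directly in the argument list is the simpler route.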
Operators like +, - or * are so-called infix functions, where the function name is placed between the arguments. We can define our own:
`%p%` <- function(x, y) {
  paste(x, y)
}
'a' %p% 'b'
## [1] "a b"
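Because infix operators are ordinary functions, they can also be called in prefix form by quoting the name with backticks, or passed to other functions. A short sketch:

```r
# Built-in operator called like a regular function
three <- `+`(1, 2)                  # 3

# Operator passed as a function argument: adds 10 to each element
shifted <- sapply(1:3, `+`, 10)     # 11 12 13

# A user-defined infix operator works the same way
`%p%` <- function(x, y) {
  paste(x, y)
}
joined <- `%p%`("a", "b")           # "a b"
```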