class: center, middle, inverse, title-slide

.title[
# Replication, Control Structures & Functions
]
.subtitle[
## Elements of the R language
]
.author[
### Marcin Kierczak, Nima Rafati, Miguel Redondo
]

---
exclude: true
count: false

<link href="https://fonts.googleapis.com/css?family=Roboto|Source+Sans+Pro:300,400,600|Ubuntu+Mono&subset=latin-ext" rel="stylesheet">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.3.1/css/all.css" integrity="sha384-mzrmE5qonljUremFsqc01SB46JvROS7bZs3IO2EmfFsd15uHvIt+Y8vEf7N7fWAU" crossorigin="anonymous">

---
name: contents

# Contents of the lecture

- variables and their types
- operators
- vectors
- matrices
- data frames
- lists
- **repeating actions: loops**
- **decision making: `if` control structures**
- **functions**

---
name: repeating_actions_1

# Repeating actions

Sometimes you want to repeat a certain action several times. There are a few alternatives in R, for example:

- `for` loop
- `while` loop

---
name: for_loop_0

# Repeating actions: for loop

One way to repeat an action is to use the **for-loop**. This is the general syntax:

```
for (var in seq) {
  expr
}
```

Where:

- var = variable that takes successive values from the sequence
- seq = sequence of values
- expr = expression to be executed

---
name: for_loop_1

# Repeating actions: for loop, an example

Example.

``` r
for (i in 1:5) {
  print(paste('Performing operation on no.', i))
}
```

```
## [1] "Performing operation on no. 1"
## [1] "Performing operation on no. 2"
## [1] "Performing operation on no. 3"
## [1] "Performing operation on no. 4"
## [1] "Performing operation on no. 5"
```

--

A slight modification of the above example.

``` r
for (i in c(2,4,6,8,10)) {
  print(paste('Performing operation on no.', i))
}
```

```
## [1] "Performing operation on no. 2"
## [1] "Performing operation on no. 4"
## [1] "Performing operation on no. 6"
## [1] "Performing operation on no. 8"
## [1] "Performing operation on no.
10"
```

The variable `i` <u>takes values</u> from the sequence.

---
name: for_loop_example

# Repeating actions: for loop, another example

Say we want to add 1 to every element of a vector:

``` r
vec <- c(1:5)
vec
```

```
## [1] 1 2 3 4 5
```

``` r
for (i in seq_along(vec)) {  # iterate over positions, not values
  vec[i] <- vec[i] + 1
}
vec
```

```
## [1] 2 3 4 5 6
```

--

The above can be achieved in R by means of **vectorization**. **Vectorization** means applying an operation to an entire vector at once, element by element, without writing an explicit loop.

``` r
vec <- c(1:5)
vec + 1
```

```
## [1] 2 3 4 5 6
```

---
name: vectorization_benchmark
exclude:true

# Repeating actions: vectorization

Let us compare the execution time of the vectorized version (vector with 1,000,000 elements):

``` r
vec <- c(1:1e6)
ptm <- proc.time()
vec <- vec + 1
proc.time() - ptm # vectorized
```

```
##    user  system elapsed
##   0.002   0.000   0.003
```

to the loop version:

``` r
vec <- c(1:1e6)
ptm <- proc.time()
for (i in seq_along(vec)) {  # iterate over positions, not values
  vec[i] <- vec[i] + 1
}
proc.time() - ptm # for-loop
```

```
##    user  system elapsed
##   0.054   0.004   0.058
```

---
name: for_loop_counter

# Repeating actions: for loop with a counter

To keep track of the current iteration number in the loop, we can use an external counter:

``` r
cnt <- 1
for (i in c(2,4,6,8,10)) {
  cat(paste('Iteration', cnt, 'Performing operation on no.', i), '\n')
  cnt <- cnt + 1
}
```

```
## Iteration 1 Performing operation on no. 2
## Iteration 2 Performing operation on no. 4
## Iteration 3 Performing operation on no. 6
## Iteration 4 Performing operation on no. 8
## Iteration 5 Performing operation on no. 10
```

---
name: loops_avoid_growing

# Repeating actions: avoid growing data

Avoid changing the dimensions of an object inside a loop, since every `c(v, i)` call copies the whole vector:

``` r
v <- c() # Initialize
for (i in 1:100) {
  v <- c(v, i)
}
cat(head(v), " ... ", tail(v))
```

```
## 1 2 3 4 5 6 ... 95 96 97 98 99 100
```

--

It is much better to pre-allocate the object and fill it in:

``` r
v <- rep(NA, 100) # Initialize with length
for (i in 1:100) {
  v[i] <- i
}
cat(head(v), " ... ", tail(v))
```

```
## 1 2 3 4 5 6 ... 95 96 97 98 99 100
```

--

Always try to know the size of the object you are going to create!

---
name: while_loop

# Repeating actions: the while loop

There is also another type of loop in R, the **while loop**, which is executed as long as some condition is true.

``` r
x <- 1
while (x < 5) {
  cat("x equals",x, "\n")
  x <- x + 1
}
```

```
## x equals 1
## x equals 2
## x equals 3
## x equals 4
```

---
name: recursion
exclude: true

# Any questions so far?

<!--
# Recursion

When we explicitly repeat an action using a loop, we talk about **iteration**. We can also repeat actions by means of **recursion**, i.e. when a function calls itself. Let us implement a factorial `\(!\)`:

``` r
factorial.rec <- function(x) {
  if (x == 0 || x == 1)
    return(1)
  else
    return(x * factorial.rec(x - 1)) # Recursive call!
}
factorial.rec(5)
```

```
## [1] 120
```

# Recursion = iteration?

Yes, every iteration can be converted to recursion (Church-Turing conjecture) and vice versa. It is not always obvious, but theoretically it is doable. Let's see how to implement *factorial* in an iterative manner:

``` r
factorial.iter <- function(x) {
  if (x == 0 || x == 1)
    return(1)
  else {
    tmp <- 1
    for (i in 2:x) {
      tmp <- tmp * i
    }
    return(tmp)
  }
}
factorial.iter(5)
```

```
## [1] 120
```

# Recursion == iteration, really?

More writing for the iterative version, right? What about the time efficiency? The recursive version:

``` r
ptm <- proc.time()
factorial.rec(20)
proc.time() - ptm
```

```
## [1] 2.432902e+18
##    user  system elapsed
##   0.001   0.000   0.001
```

And the iterative one:

``` r
ptm <- proc.time()
factorial.iter(20)
proc.time() - ptm
```

```
## [1] 2.432902e+18
##    user  system elapsed
##   0.005   0.000   0.005
```
-->

---
name: if_clause

# Decisions, if-clause

Often, one has to take a different course of action depending on the flow of the algorithm. In R, we use the `if` clause for this purpose.
This is the general syntax:

```
if (condition) {
  expr
}
```

--

A simple example:

``` r
temp <- -2

if (temp < 0) {
  print("It's freezing!")
}
```

```
## [1] "It's freezing!"
```

---
name: if examples

# Decisions, if-clause

Two more examples of using `if` inside a loop:

Let's display only the numbers that are greater than 5 in the sequence `\([1, 10]\)`:

``` r
v <- 1:10
for (i in v) {
  if (i > 5) { # if clause
    cat(i, ' ')
  }
}
```

```
## 6 7 8 9 10
```

--

Let's display only odd numbers in the sequence `\([1, 10]\)`:

``` r
v <- 1:10
for (i in v) {
  if (i %% 2 != 0) { # if clause
    cat(i, ' ')
  }
}
```

```
## 1 3 5 7 9
```

---
name:if_else

# Decisions, if-else

What if we want to perform a different action when the `if` condition is not met? To print 'o' for an odd number and 'e' for an even one, we could write either of:

.pull-left-50[

Only `if` clauses

``` r
v <- 1:10
for (i in v) {
  if (i %% 2 != 0) { # if clause
    cat('o ')
  }
  if (i %% 2 == 0) { # another if-clause
    cat('e ')
  }
}
```

```
## o e o e o e o e o e
```

]

--

.pull-right-50[

Using `if-else`:

``` r
v <- 1:10
for (i in v) {
  if (i %% 2 != 0) { # if clause
    cat('o ')
  } else { # else clause
    cat('e ')
  }
}
```

```
## o e o e o e o e o e
```

]

---
name: elif
exclude: true

# Decisions, if-else-if for more alternatives

So far, so good, but we were only dealing with 2 alternatives. Let's say that we want to print '?' for zero, 'e' for an even and 'o' for an odd number. We can use the **if-else-if** clause for this!

``` r
v <- c(0:10)
for (i in v) {
  if (i == 0) { # if clause
    cat('? ')
  } else if (i %% 2 != 0) { # else-if clause
    cat('o ')
  } else { # else clause
    cat('e ')
  }
}
```

```
## ? o e o e o e o e o e
```

---
name: switch
exclude: true

# Switch

If-else clauses operate on logical values. What if we want to make decisions based on non-logical values?
Well, if-else will still work by evaluating a number of comparisons, but we can also use **switch**:

``` r
switch.demo <- function(x) {
  switch(class(x),
         logical = cat('logical\n'),
         numeric = cat('Numeric\n'),
         factor = cat('Factor\n'),
         cat('Undefined\n')
  )
}
switch.demo(x=TRUE)
switch.demo(x=15)
switch.demo(x=factor('a'))
switch.demo(data.frame())
```

```
## logical
## Numeric
## Factor
## Undefined
```

---
name: fns

# Functions

Often, it is handy to reuse code we have written or to package the code that performs a given task. Functions are a good way to do this in R. This is the general syntax:

```
function_name <- function(arg1, arg2, ...) {
  expr
  return(something)
}
```

--

Let's see a simple example of a function that adds one to a number:

``` r
add.one <- function(arg1) {
  result <- arg1 + 1
  return(result)
}
add.one(1)
```

```
## [1] 2
```

---
name: fns_defaults

# Functions: arguments with default values

Sometimes, it is good to use default values for some arguments:

``` r
add.a.num <- function(arg, num=1) {
  result <- arg + num
  return(result)
}
add.a.num(1) # skip the num argument
```

```
## [1] 2
```

--

``` r
add.a.num(1, 5) # override the num argument
add.a.num(1, num=5) # override the num argument
```

```
## [1] 6
## [1] 6
```

--

``` r
add.a.num(num=1) # skip the first argument
```

```
## Error in add.a.num(num = 1): argument "arg" is missing, with no default
```

---
name:fns_args

# Functions: order of arguments

``` r
args.demo <- function(x, y, arg3) {
  print(paste('x =', x, 'y =', y, 'arg3 =', arg3))
}
args.demo(1,2,3)
```

```
## [1] "x = 1 y = 2 arg3 = 3"
```

--

``` r
args.demo(x=1, 2, 3)
```

```
## [1] "x = 1 y = 2 arg3 = 3"
```

--

``` r
args.demo(x=1, y=2, arg3=3)
```

```
## [1] "x = 1 y = 2 arg3 = 3"
```

--

``` r
args.demo(arg3=3, x=1, y=2)
```

```
## [1] "x = 1 y = 2 arg3 = 3"
```

---
name: variable_scope

# Functions: variable scope

.pull-left-50[

Functions 'see' not only what has been passed to them as arguments:

``` r
x <- 7
y <-
3
xyplus <- function(x) {
  x <- x + y
  return(x)
}
xyplus(x)
x
```

```
## [1] 10
## [1] 7
```

]

--

.pull-right-50[

Variables defined outside of any function live in the **global environment**. There is a special operator `<<-` for modifying variables outside the function, here in the global environment:

``` r
x <- 1
xplus <- function(x) {
  x <<- x + 1
}
xplus(x)
x
xplus(x)
x
```

```
## [1] 2
## [1] 3
```

]

---
name: fns_ellipsis

# Functions: the `...` argument

There is a special argument **...** (ellipsis), which lets a function accept any number of arguments and pass them downstream:

``` r
# Any number of arguments
my.plot <- function(x, y, ...) {
  # Passing downstream
  plot(x, y, las=1, cex.axis=.8, ...)
}

par(mfrow=c(1,2),mar=c(4,4,1,1))
my.plot(1,1)
my.plot(1, 1, col='red', pch=19)
```

<img src="slide_r_elements_4_files/figure-html/fns.3dots-1.png" width="432" style="display: block; margin: auto auto auto 0;" />

- A function that wraps a call to another function is a **wrapper function**.

---
name: ellipsis_trick
exclude:true

# Functions: the ellipsis argument trick

What if the authors of a wrapper, e.g. `plot.something`, forgot about the `...`?

``` r
my.plot <- function(x, y) {
  # Passing downstream
  plot(x, y, las=1, cex.axis=.8, ...)
}
formals(my.plot) <- c(formals(my.plot), alist(... = ))
my.plot(1, 1, col='red', pch=19)
```

<img src="slide_r_elements_4_files/figure-html/fns.3dots.trick-1.png" width="360" style="display: block; margin: auto auto auto 0;" />

---
exclude:true

<!--
name: lazy_eval

# R is lazy!

In R, arguments are evaluated as late as possible, i.e. when they are needed. This is **lazy evaluation**:

``` r
h <- function(a = 1, b = d) {
  d <- (a + 1) ^ 2
  c(a, b)
}
#h()
```

> The above would not be possible in, e.g., C, where the values of both arguments have to be known before the function is called (**eager evaluation**).
-->

---
name: everything_is_a_fn
exclude:true

# In R everything is a function

Because in R even operators are functions

``` r
`+`
```

```
## function (e1, e2)  .Primitive("+")
```

we can re-define things like this:

``` r
`+` <- function(e1, e2) { e1 - e2 }
2 + 2
```

```
## [1] 0
```

and, finally, clean up the mess...

``` r
rm("+")
2 + 2
```

```
## [1] 4
```

---
name: infix_fns
exclude:true

# Infix notation

Operators like `+`, `-` or `*` are so-called **infix** functions, where the function name is placed between the arguments. We can define our own:

``` r
`%p%` <- function(x, y) {
  paste(x,y)
}
'a' %p% 'b'
```

```
## [1] "a b"
```

---
name:anatomy_of_a_fn

# Anatomy of a function

A function consists of *formal arguments*, a *function body* and an *environment*:

``` r
formals(add.one)
```

```
## $arg1
```

--

``` r
body(add.one)
```

```
## {
##     result <- arg1 + 1
##     return(result)
## }
```

--

``` r
environment(add.one)
environment(sd)
```

```
## <environment: R_GlobalEnv>
## <environment: namespace:stats>
```

---
name: base_fns

# Base functions

When we start R, a set of core packages (e.g. *base*, *stats*, *graphics*, *utils*) is attached automatically. `search()` lists everything currently attached, including packages loaded later in the session:

``` r
# .libPaths() # get library location
# library() # see all packages installed
search() # see packages currently loaded
```

```
##  [1] ".GlobalEnv"               "package:vcd"
##  [3] "package:grid"             "package:patchwork"
##  [5] "package:nycflights13"     "package:readxl"
##  [7] "package:formattable"      "package:kableExtra"
##  [9] "package:manipulateWidget" "package:leaflet"
## [11] "package:yaml"             "package:fontawesome"
## [13] "package:bookdown"         "package:knitr"
## [15] "package:lubridate"        "package:forcats"
## [17] "package:stringr"          "package:purrr"
## [19] "package:readr"            "package:tidyr"
## [21] "package:tibble"           "package:ggplot2"
## [23] "package:tidyverse"        "package:reshape2"
## [25] "package:dplyr"            "package:stats"
## [27] "package:graphics"         "package:grDevices"
## [29] "package:utils"            "package:datasets"
## [31] "package:methods"          "Autoloads"
## [33] "package:base"
```

Check what basic
functions are offered by the packages *base* and *utils*; we will soon work with the package *graphics*. If you want to see what statistical functions are in your arsenal, check out the package *stats*.

<!-- --------------------- Do not edit this and below --------------------- -->

---
name: end_slide
class: end-slide, middle
count: false

# See you at the next lecture!

.end-text[
<p class="smaller">
<span class="small" style="line-height: 1.2;">Graphics from </span><img src="./assets/freepik.jpg" style="max-height:20px; vertical-align:middle;"><br>
Created: 31-Oct-2024 • <a href="https://www.scilifelab.se/">SciLifeLab</a> • <a href="https://nbis.se/">NBIS</a>
</p>
]