1 Introduction

R uses several data modes. The mode of a variable determines, for example, which operations can be performed on it. At the end of this exercise you should know:

  • Which data types are commonly used in R and how to create them
  • How to use some basic operators in R
  • How R coerces data when needed
  • How to do basic text manipulation

1.1 Data types

From the lecture you might remember that all elements in any data structure found in R are of a certain type (or have a certain mode). The four most commonly used data types in R are: logical, integer, double (often called numeric), and character. The names hint at what they are.

  • Logical = TRUE or FALSE (or NA)
  • Integer = Numbers that can be represented without a fractional component
  • Numeric (double) = Any real number, with or without a fractional component (that is, any number that is not complex)
  • Character = Text

In many cases the mode of an entry is determined by its content, so if you save the value 5.1 as a variable in R, the variable will automatically be recognised as numeric. If you instead have a text string like “hello world”, it will have the mode character. Below you will also see examples of how you can specify the mode yourself rather than rely on R inferring the right mode from the content.
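As a minimal sketch of the inference described above, typeof() can be used to inspect what R chose. The variable names a, b, d and e are illustrative only and are not used elsewhere in the exercises.

```r
# R infers the type from the content of the value
a <- 5.1             # has a fractional part
b <- "hello world"   # quoted text
d <- TRUE            # logical constant
typeof(a)
## [1] "double"
typeof(b)
## [1] "character"
typeof(d)
## [1] "logical"

# A whole number is still stored as a double unless the L suffix is used
typeof(5)
## [1] "double"
typeof(5L)
## [1] "integer"

# The type can also be requested explicitly with a conversion function
e <- as.integer(5)
typeof(e)
## [1] "integer"
```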

2 Exercises

In all exercises during this course, try to work out the expected result before running the commands, then verify it by running them. If there is a discrepancy between your expectation and the actual output, make sure you understand why before you move on. If you cannot figure out how to solve a task or which command to run, you can click the key to reveal example code including the expected output. If you are trying things out on your own and have a hard time understanding what is going on, ask the TAs or someone sitting next to you who might have wrapped their head around the issue.

Also note that in many cases there are multiple solutions that solve the problem equally well.

We recommend writing all code in an R Markdown document in RStudio, as that document will, by the end of the course, be your own R tutorial with comments and code solutions.

2.1 Working with variables

Open RStudio and make sure to set your working directory. Double-check that you do not have stored objects in your current session with the following command, which lists all objects in your current R session.

ls()
##  [1] "A"             "address"       "asst"          "B"            
##  [5] "car.names"     "char_vec"      "cities"        "cl"           
##  [9] "cn"            "cn_vec"        "cnames"        "cntr"         
## [13] "color_primary" "color_text"    "cor_result"    "crashes"      
## [17] "df"            "df1"           "dfr"           "doctor"       
## [21] "drinks"        "e"             "E"             "E.means"      
## [25] "E.medians"     "E.mm"          "eng_output"    "fa"           
## [29] "fa1"           "fa2"           "gapminder"     "i"            
## [33] "l"             "lego"          "letnum"        "linkoping"    
## [37] "list.2"        "list.a"        "loc"           "lund"         
## [41] "mat1"          "mt.merged"     "mtcars2"       "numeric_vec"  
## [45] "p"             "p1"            "p2"            "penguins"     
## [49] "r"             "random1"       "random2"       "s"            
## [53] "sel_cn"        "sharks"        "stackoverflow" "tmp_cn"       
## [57] "tmp_count"     "tmp_data"      "tmp_df"        "tmp_max"      
## [61] "tmp_mean"      "tmp_min"       "tmp_missing"   "tmp_sd"       
## [65] "tmp_var"       "u.2"           "umea"          "uppsala"      
## [69] "uppsala2"      "vars"          "vec1"          "vec2"         
## [73] "vector1"       "vector2"       "vector3"       "videogames"   
## [77] "X"             "X.2"

If you have objects that you want to remove from the current session, you can do so with the rm() function. The following command removes all objects in your current environment.

rm(list = ls())

This command uses functions that we have not talked about yet. If you do not understand how it works now, you will after tomorrow's lectures and exercises.
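For the curious, the sketch below shows the two pieces of that command at work. It runs inside a scratch environment (an assumption made here, with the hypothetical name scratch) so that your own session is left untouched.

```r
# Work in a scratch environment so the global workspace is not modified
scratch <- new.env()
assign("a", 1, envir = scratch)
assign("b", "two", envir = scratch)

# ls() returns the object names as a character vector
ls(envir = scratch)
## [1] "a" "b"

# rm() accepts such a character vector through its list argument
rm(list = ls(envir = scratch), envir = scratch)
length(ls(envir = scratch))
## [1] 0
```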

  1. Create variables var1 and var2 and initialize them with two integers of choice.

var1 <- 11
var2 <- 34
  1. Add the two variables, save the sum as a new variable named var3, and print the result.

var3 <- var1 + var2
var3
## [1] 45
  1. Check the class, mode, and type of var1, var2, var3 and π (available in R as the built-in constant pi).

mode(var1)
class(var1)
typeof(var1)
## [1] "numeric"
## [1] "numeric"
## [1] "double"

mode(pi)
class(pi)
typeof(pi)
## [1] "numeric"
## [1] "numeric"
## [1] "double"
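The three functions agree for doubles, which can hide the difference between them. A short sketch with an integer literal (the variable n is an illustrative assumption) shows where they diverge:

```r
n <- 11L   # the L suffix asks for an integer instead of a double
mode(n)
## [1] "numeric"
class(n)
## [1] "integer"
typeof(n)
## [1] "integer"

# Both integers and doubles count as numeric
is.numeric(n)
## [1] TRUE
is.numeric(11)
## [1] TRUE
```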
  1. Create two character variables containing a text of choice. Check the mode, class, and type of the first one.

text1 <- "test1"
text2 <- "test2"

mode(text1)
class(text1)
typeof(text1)
## [1] "character"
## [1] "character"
## [1] "character"

Add var1 to text1. What is the result and why?

text1+var1
## Error in text1 + var1: non-numeric argument to binary operator
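Arithmetic operators refuse character input, but other functions coerce silently. The sketch below redefines text1 and var1 so that it is self-contained; mixed and res are illustrative names.

```r
text1 <- "test1"
var1 <- 11

# c() coerces its arguments to the most flexible type, here character
mixed <- c(text1, var1)
mixed
## [1] "test1" "11"
typeof(mixed)
## [1] "character"

# paste() converts its arguments to character before joining them
paste(text1, var1)
## [1] "test1 11"

# The arithmetic error can be caught and inspected with tryCatch()
res <- tryCatch(text1 + var1, error = function(e) conditionMessage(e))
res
## [1] "non-numeric argument to binary operator"
```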
  1. Convert var3 to an integer, cast an integer variable to double, cast a string to a double.

as.integer(var3)
i <- 175L          # the L suffix makes i a true integer
as.double(i)
as.double(text1)   # not a number, so NA is returned with a warning
## [1] 45
## [1] 175
## [1] NA
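Conversions can lose information without raising an error. A small sketch of the cases worth knowing (x is an illustrative name):

```r
# as.integer() truncates toward zero, it does not round
as.integer(3.9)
## [1] 3
as.integer(-3.9)
## [1] -3
round(3.9)
## [1] 4

# Text that is not a number becomes NA; the warning is silenced here
x <- suppressWarnings(as.double("test1"))
is.na(x)
## [1] TRUE

# Logical values convert to 1 and 0
as.integer(TRUE)
## [1] 1
as.integer(FALSE)
## [1] 0
```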
  1. Report floor and ceiling of π and round π to 3 decimal places.

floor(pi)
ceiling(pi)
round(pi, digits=3)
## [1] 3
## [1] 4
## [1] 3.142
  1. Is the floor of π an integer?

is.integer(floor(pi))
## [1] FALSE
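The result is FALSE because is.integer() tests the storage type, not whether the value happens to be whole. A brief sketch (f is an illustrative name):

```r
f <- floor(pi)
typeof(f)        # floor() returns a double
## [1] "double"
f == 3L          # the value still equals the integer 3
## [1] TRUE

# An explicit conversion yields a true integer
is.integer(as.integer(f))
## [1] TRUE

# One way to test for a whole value regardless of storage type
f %% 1 == 0
## [1] TRUE
```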
  1. Treat the string "3.56437" as a number.

as.numeric('3.56437')
## [1] 3.56437
  1. Divide ∞ by -∞.

Inf/-Inf
## [1] NaN
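NaN ("not a number") marks a result that is mathematically undefined, and it is distinct from both Inf and NA. A few related cases as a sketch:

```r
1 / 0
## [1] Inf
-1 / 0
## [1] -Inf
Inf - Inf
## [1] NaN
Inf * 0
## [1] NaN

# Helper predicates for telling the special values apart
is.nan(Inf / -Inf)
## [1] TRUE
is.na(NaN)       # NaN also counts as missing
## [1] TRUE
is.nan(NA)       # but NA is not NaN
## [1] FALSE
is.finite(Inf)
## [1] FALSE
```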
  1. Create two freely chosen complex numbers.
  • Check that they are complex indeed.
  • Add, multiply and divide one by another.
  • Add an integer to their sum.

c1 <- 23 + 4i
c2 <- -15 - 7i
is.complex(c1)
is.complex(c2)
c1 + c2
c1 * c2
c1 / c2
c1 + c2 + 7
## [1] TRUE
## [1] TRUE
## [1] 8-3i
## [1] -317-221i
## [1] -1.361314+0.3686131i
## [1] 15-3i
  1. Print a truth table for OR over the three distinct logical values TRUE, FALSE, and NA.

x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, "|")
##       <NA> FALSE TRUE
## <NA>    NA    NA TRUE
## FALSE   NA FALSE TRUE
## TRUE  TRUE  TRUE TRUE
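NA stands for an unknown value, so the result is NA only when the unknown value could change the answer. The same approach works for AND; the sketch below builds that table (and_table is an illustrative name) and checks a few individual cases.

```r
x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
and_table <- outer(x, x, "&")   # same construction as for OR
and_table

# With OR, one TRUE settles the result whatever the other value is
NA | TRUE
## [1] TRUE
NA | FALSE
## [1] NA

# With AND, one FALSE settles the result whatever the other value is
NA & FALSE
## [1] FALSE
NA & TRUE
## [1] NA
```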
  1. Multiply a logical TRUE by a logical FALSE. Raise the logical TRUE to the 7th power.

TRUE * FALSE
TRUE^7   # spelled out, because the shorthand T can be reassigned
## [1] 0
## [1] 1
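These results follow from coercion: in arithmetic, TRUE is treated as 1 and FALSE as 0. A sketch of how this is commonly used (the vector votes is made up for illustration):

```r
votes <- c(TRUE, FALSE, TRUE, TRUE)

# Summing a logical vector counts the TRUE values
sum(votes)
## [1] 3

# The mean gives the proportion of TRUE values
mean(votes)
## [1] 0.75

# The result of the arithmetic is no longer logical
typeof(TRUE * FALSE)
## [1] "integer"
typeof(TRUE^7)
## [1] "double"
```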
  1. Create two character variables containing two verses of your favorite song.
  • Concatenate the two variables.
  • Paste the variables with ‘*’ as separator.
  • Check whether ‘and’ occurs in the second line.
  • Substitute one word for another.
  • Extract a substring that starts at the 5th character and is 5 characters long.

line1 <- "Hello darkness my old friend"
line2 <- "I've come to talk to you again"
paste(line1, line2, sep = "")
paste(line1, line2, sep = "*")
grep('and', line2)
sub('Hello', 'Goodbye', line1)
substr(line1, 5, 5 + 5 - 1)   # stop is inclusive, so 5 characters end at position 9
## [1] "Hello darkness my old friendI've come to talk to you again"
## [1] "Hello darkness my old friend*I've come to talk to you again"
## integer(0)
## [1] "Goodbye darkness my old friend"
## [1] "o dar"
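grep() returns the positions of matching elements, which is why a failed search prints integer(0). A few related base R functions are sketched below as alternatives, not as the only correct solutions:

```r
line1 <- "Hello darkness my old friend"
line2 <- "I've come to talk to you again"

# grepl() answers the yes/no question directly
grepl("and", line2)
## [1] FALSE
grepl("again", line2)
## [1] TRUE

# sub() replaces only the first match, gsub() replaces all matches
sub("to", "2", line2)
## [1] "I've come 2 talk to you again"
gsub("to", "2", line2)
## [1] "I've come 2 talk 2 you again"

# nchar() counts the characters in a string
nchar(line1)
## [1] 28
```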

2.2 R Environment

  • Get help for the t.test, table, locator and identify functions.
  • Check for all occurrences of fisher.test in the docs.
  • Which package contains the plot.ecdf function? What does it do?
  • Find package ‘reshape’-related questions on StackOverflow.
  • Search the internet for how to load an XML file into R.
  • Install the ‘cgmisc’ package from GitHub.
  • Look up the ‘cgmisc’ vignette.
  • See all the demos available to you and run one you like.
  • Run the examples for fisher.test.
  • Check out CRAN's task view for genetics.
  • Install a CRAN package of your choice.
  • Install the R-Forge package ‘bigRR’.
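As a starting point for the tasks that need no internet access, the sketch below lists base R helpers you could try. It is one possible route, not the full solution; the installation tasks are left out because they depend on repository details not given here. The help calls are wrapped in interactive() on the assumption that you run them from the RStudio console.

```r
# Locate a function: getAnywhere() reports where an object is defined
hit <- utils::getAnywhere("plot.ecdf")
hit$where

if (interactive()) {
  help("t.test")                # same as ?t.test
  help.search("fisher.test")    # same as ??fisher.test
  example("fisher.test")        # run the examples from the help page
  demo()                        # list the demos in attached packages
}
```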