Object-Oriented Programming Models in R

RaukR 2024 • Advanced R for Bioinformatics

Marcin Kierczak

21-Jun-2024

Object Oriented Programming

  • A programming paradigm – we work with objects.
  • Plato’s cave: ideal objects – classes, reflections – instances.
  • Reflects many real-life systems – easy to model things.

OOP Systems in R

  • S3 – basic R OOP system,
  • S4 – a bit more advanced OOP, common in, e.g. Bioconductor,
  • RC – Reference Classes, the most advanced and close to, e.g. Java OOP, extension of S4,
  • R6 – simplified RC, extension of S3, require(R6)
  • S7 – best of S3 + S4 = S7 require(S7). Still in dev phase.

S4 classes

S4 classes are more advanced than S3 as you actually define the structure of the data within the object of your particular class:

# declare class
setClass('gene', representation(name='character', coords='numeric'))
# instantiate class
my.gene <- new("gene", name = "ANK3", coords = c(1.4e6, 1.412e6))
# declare class
.intragenic_region <- setClass(Class = 'intragenic_region', 
                               slots = c(name='character', coords='numeric'))

# instantiate class
# but since SetClass returns a low-level constructor
my.intra1 <- .intragenic_region(name = 'int1', 
                  coords = c(1.7e6, 1.717e6))

Inheritance in S4

.ext_gene <- setClass(Class = 'ext_gene',contains = "gene",
  slots = c(gene = 'gene', feature_name = 'character', feature_value = 'character')
)
ANK3 <- .ext_gene(name = 'ANK3', coords = c(1.4e6, 1.412e6),
                  feature_name = 'num_introns', feature_value = '5')
str(ANK3)
Formal class 'ext_gene' [package ".GlobalEnv"] with 5 slots
  ..@ gene         :Formal class 'gene' [package ".GlobalEnv"] with 2 slots
  .. .. ..@ name  : chr(0) 
  .. .. ..@ coords: num(0) 
  ..@ feature_name : chr "num_introns"
  ..@ feature_value: chr "5"
  ..@ name         : chr "ANK3"
  ..@ coords       : num [1:2] 1400000 1412000

Sealed Class

Preventing double class definition:

setClass('Not_Sealed')
setClass('Not_Sealed')

But to prevent this:

setClass('Sealed', sealed = T)
setClass('Sealed')
Error in setClass("Sealed"): "Sealed" has a sealed class definition and cannot be redefined

S4 class – slots

The variables within an S4 class are stored in the so-called slots. In the above example, we have 2 such slots: name and coords. Here is how to access them:

my.gene@name # access using @ operator
my.gene@coords[2] # access the 2nd element in slot coords
[1] "ANK3"
[1] 1412000

S4 class – methods

The power of classes lies in the fact that they define both the data types in particular slots and operations (functions) we can perform on them. Let us define a generic print function for an S4 class:

setMethod('print', 'gene', 
          function(x) {
              cat('GENE: ', x@name, ' --> ')
              cat('[', x@coords, ']')
          })
print(my.gene) # and we use the newly defined print
GENE:  ANK3  --> [ 1400000 1412000 ]

S3 Classes

An S3 class object is one of R base types (e.g. integer) with class attribute set:

obj <- factor(c("a", "b", "c"))
typeof(obj)
[1] "integer"
class(obj)
[1] "factor"
attributes(obj)
$levels
[1] "a" "b" "c"

$class
[1] "factor"
str(obj)
 Factor w/ 3 levels "a","b","c": 1 2 3
print(obj)
[1] a b c
Levels: a b c
print(unclass(obj))
[1] 1 2 3
attr(,"levels")
[1] "a" "b" "c"

Custom str Methods

Some S3 classes provide a custom str, e.g.:

time <- strptime("2018-06-07", "%Y-%m-%d")
str(time)
 POSIXlt[1:1], format: "2018-06-07"
str(unclass(time))
List of 11
 $ sec   : num 0
 $ min   : int 0
 $ hour  : int 0
 $ mday  : int 7
 $ mon   : int 5
 $ year  : int 118
 $ wday  : int 4
 $ yday  : int 157
 $ isdst : int 1
 $ zone  : chr "CEST"
 $ gmtoff: int NA
 - attr(*, "tzone")= chr [1:3] "" "CET" "CEST"
 - attr(*, "balanced")= logi TRUE

Generic Methods and Method Dispatch

Have you ever wondered why print() or summary() work on many types (classes) of data?

They are so-called generics, i.e. functions and methods that operate on classes. They know which method to apply to which class thanks to the process of method dispatch.

The naming scheme for generics is: generic.class() i.e. a generic that applies to the class class.

Examples:

  • print.factor(),
  • print.default(),
  • print.data.frame().

To see the code of a method:

getS3method('summary', 'lm') %>% 
  head(n = 5)

Creating S3 Classes

To create an S3 class, simply give a name to a data structure:

gf <- structure(list(), class = 'genomic_features')
class(gf)
[1] "genomic_features"

OR

gf <- list()
class(gf) <- 'genomic_features'
class(gf)
[1] "genomic_features"

You can use some inheritance too:

egf <- list()
class(egf) <- c('genomic_features', 'extended_genomic_features')
class(egf)
[1] "genomic_features"          "extended_genomic_features"

Checking for Correctness

linmod <- with(mtcars, lm(log(mpg) ~ log(disp)))
linmod

Call:
lm(formula = log(mpg) ~ log(disp))

Coefficients:
(Intercept)    log(disp)  
     5.3810      -0.4586  
class(linmod) <- "data.frame"
linmod
Error in `[<-.data.frame`(`*tmp*`, , j, value = c(`(Intercept)` = 5.3809725, : replacement has 2 rows, data has 0

S C A R Y !

Software engineering

Margaret Hamilton, source: Wikimedia Commons

Design patterns

Source: Wikimedia Commons

Source: Wikimedia Commons

R and Design Pattern

  • strategyapply()
  • decoratorsystem_time()
system.time(x <- runif(n = 100000))
x[1:5]
   user  system elapsed 
  0.002   0.000   0.003 
[1] 0.3319210 0.7142757 0.6870337 0.5222928 0.4551353
  • wrapper
my_plot <- function(...) {
  plot(..., col = 'blue', las = 1, cex = .5, pch = 19, cex.axis = .7)
}
  • Singleton see this gist
  • Fluent function interfacetidyverse functions take data x as the very first argument and return object similar to x so that they can be chained by %>%

Even more patterns here.

Safe S3 Design Pattern – Constructor

new_Animal <- function(species, age) {
  stopifnot(is.character(species))
  stopifnot(is.numeric(age))
  
  structure(
    species,
    age = age,
    class = "Animal"
  )
}

Safe S3 Design Pattern – Validator

validate_Animal <- function(x) {
  species <- x[1]
  age <- attr(x, 'age')
  
  if (is.na(species) || species == "") {
    stop('Species name is missing!', call. = FALSE)
  }
  if (!is.numeric(age) || age < 1 || age >= 100) {
    stop("Invalid age!", call. = FALSE)
  }
  return(x)
}

Safe S3 Design Pattern – Helper

Animal <- function(x) {
  species <- x[[1]] 
  age <- x[[2]]
  validate_Animal(new_Animal(species, age))
}
dog <- Animal(list('Canis familiaris', 7))
class(dog)
[1] "Animal"
cat <- Animal(list('Felis felis', '9'))
Error in new_Animal(species, age): is.numeric(age) is not TRUE
cat <- Animal(list('Felis felis', 9))
class(cat)
[1] "Animal"

Building S3 Classes – Styles

One can build an S3 class on top of any existing base type, e.g. a named list:

point_in_space_class <- function(x, y, z) {
  structure(
    list(
      x = x, 
      y = y, 
      z = z
    ),
    class = "point_in_space_class"
  )
}

Introduction to R6 classes

  • require(R6),
  • Do not rely on S4 like RC, but on S3,
  • Are faster than RC,
  • Do not do copy-on-modify,
  • Thus provide OO model similar to C++ or Java.
  • Methods belong to objects, not to generics.

R6 Class Example

library(R6)

Person <- R6Class("Person",
  public = list(
    name = NULL,
    hair = NULL,
    initialize = function(name = NA, hair = NA) {
      stopifnot(is.character(name), is.character(hair))
      self$name <- name
      self$hair <- hair
      self$greet()
    },
    set_hair = function(val) {
      self$hair <- val
    },
    greet = function() {
      cat(paste0("Hello, my name is ", self$name, ".\n"))
    }
  )
)

R6 in Action

kate <- Person$new(name = 'Kate', hair = 'blond')
str(kate)
Hello, my name is Kate.
Classes 'Person', 'R6' <Person>
  Public:
    clone: function (deep = FALSE) 
    greet: function () 
    hair: blond
    initialize: function (name = NA, hair = NA) 
    name: Kate
    set_hair: function (val)  

R6 – methods

kate$greet()
kate$set_hair('red')
kate
Hello, my name is Kate.
<Person>
  Public:
    clone: function (deep = FALSE) 
    greet: function () 
    hair: red
    initialize: function (name = NA, hair = NA) 
    name: Kate
    set_hair: function (val) 

R6 copy-on-modify

kate$hair
[1] "red"
ann <- kate
ann$set_hair('blue')
ann$hair
[1] "blue"
kate$hair
[1] "blue"

R6 – clone()

kate$set_hair('brown')
kate$hair
[1] "brown"
ann <- kate$clone()
kate$hair
ann$hair
[1] "brown"
[1] "brown"
ann$set_hair('blond')
kate$hair
ann$hair
[1] "brown"
[1] "blond"

Thank you! Questions?

         _                    
platform x86_64-apple-darwin20
os       darwin20             
major    4                    
minor    4.0                  

2024 • SciLifeLabNBISRaukR