Debugging, Profiling, and Optimization

RaukR 2023 • Advanced R for Bioinformatics

Marcin Kierczak

27-Jun-2023

Run Forrest, run!




  • My code does not run! → debugging

  • Now it does run but… out of memory! → profiling

  • It runs! It says it will finish in 5 minutes… make that years. → optimization

Types of bugs

  • 🔣 Syntax errors
prin(var1) 
mean(sum(seq((x + 2) * (y - 9 * b)))
  • 🔢 Arithmetic
y <- 7 / 0

Not in R, though! R happily returns y = Inf.

  • 🍎🍊 Type
mean('a')
  • 🧩 Logic

Everything works and produces seemingly valid output that is WRONG!
IMHO those are the hardest 💀 to debug!
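As a minimal illustration (a made-up example, not from the slides): the code below runs without any error, returns a plausible number, and is still wrong.

```r
# Intent: per-row (per-sample) means of a 2 x 3 matrix.
m <- matrix(1:6, nrow = 2)

# Logic bug: mean() silently flattens the matrix --
# the code runs fine, but this is not what we meant.
wrong <- mean(m)            # 3.5, the grand mean of all six values

# Correct: compute the mean over rows.
right <- apply(m, 1, mean)  # 3 and 4
```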

How to avoid bugs




  • Encapsulate your code in smaller units 🍱 (functions) that you can test.

  • Use classes and type checking 🆗.

  • Test 🧪 at the boundaries, e.g. loops at min and max value.

  • Feed your functions with test data 💾 that should result in a known output.

  • Use antibugging 🕸: stopifnot(y <= 75)
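A small sketch of antibugging in practice (the function set_dose() is hypothetical): stopifnot() aborts with an informative error as soon as an assumption is violated.

```r
# Hypothetical function guarded by stopifnot(): invalid input fails fast,
# with an error message pointing at the violated assertion.
set_dose <- function(y) {
  stopifnot(is.numeric(y), y <= 75)
  y / 2
}

set_dose(50)    # 25
# set_dose(100) # Error: y <= 75 is not TRUE
```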

Floating confusion



(vec <- seq(0.1, 0.9, by = 0.1))
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
vec == 0.7
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
vec == 0.5
[1] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE


(0.5 + 0.1) - 0.6
[1] 0
(0.7 + 0.1) - 0.8
[1] -1.110223e-16

💀 Beware of floating point arithmetic! 💀

How to float 🏊



round((0.7 + 0.1) , digits = 2) - 0.8
[1] 0

Comparing floating point numbers:

(vec <- seq(0.1, 0.9, by = 0.1))
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
vec == 0.7
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
epsilon <- 0.001
abs(vec - 0.7) <= epsilon
[1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
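Instead of a hand-rolled epsilon, base R's all.equal() compares with a tolerance (about 1.5e-8 by default):

```r
vec <- seq(0.1, 0.9, by = 0.1)

# all.equal() returns TRUE or a description of the difference,
# so wrap it in isTRUE() for logical tests.
isTRUE(all.equal(vec[7], 0.7))                      # TRUE

# For a whole vector, apply it element-wise:
sapply(vec, function(x) isTRUE(all.equal(x, 0.7)))  # TRUE only at position 7
```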

Final thoughts on floating

head(unlist(.Machine))
    double.eps double.neg.eps    double.xmin    double.xmax    double.base 
  2.220446e-16   1.110223e-16  2.225074e-308  1.797693e+308   2.000000e+00 
 double.digits 
  5.300000e+01 
head(unlist(.Platform))
   OS.type   file.sep dynlib.ext        GUI     endian    pkgType 
    "unix"        "/"   ".dylib"  "unknown"   "little"   "source" 

Handling Errors

Let us generate some errors:

input <- c(1, 10, -7, -2/5, 0, 'char', 100, pi, NaN) # coerced to character!
for (val in input) {
  print(paste0('Log of ', val, ' is ', log10(val)))
}
Error in log10(val): non-numeric argument to mathematical function





So, how to handle this mess?

Handling Errors – try



try(
  print(
    paste0('Log of ', input, ' is ', log10(as.numeric(input)))
  )
)
[1] "Log of 1 is 0"                               
[2] "Log of 10 is 1"                              
[3] "Log of -7 is NaN"                            
[4] "Log of -0.4 is NaN"                          
[5] "Log of 0 is -Inf"                            
[6] "Log of char is NA"                           
[7] "Log of 100 is 2"                             
[8] "Log of 3.14159265358979 is 0.497149872694133"
[9] "Log of NaN is NaN"                           

Handling Errors – tryCatch block:

for (val in input) {
  val <- suppressWarnings(as.numeric(val)) # 'char' becomes NA
  result <- tryCatch(log10(val),
              warning = function(w) {
                print('Warning! Negative argument supplied. Negating.')
                log10(-val) },
              error = function(e) {
                print('ERROR! Not a number!')
                NaN
              }
            )
  print(paste0('Log of ', val, ' is ', result))
}
[1] "Log of 1 is 0"
[1] "Log of 10 is 1"
[1] "Warning! Negative argument supplied. Negating."
[1] "Log of -7 is 0.845098040014257"
[1] "Warning! Negative argument supplied. Negating."
[1] "Log of -0.4 is -0.397940008672038"
[1] "Log of 0 is -Inf"
[1] "Log of NA is NA"
[1] "Log of 100 is 2"
[1] "Log of 3.14159265358979 is 0.497149872694133"
[1] "Log of NaN is NaN"

Debugging – errors and warnings

  • An error in your code results in a call to the stop() function that:
    • breaks the execution of the program (loop, if-statement, etc.),
    • performs the action defined by the global option error.
  • A warning just prints out the warning message (or reports it in another way).
  • The global option error defines what R should do when an error occurs, e.g.:
options(error = browser)
  • You can use simpleError() and simpleWarning() to generate errors and warnings in your code:
f <- function(x) {
  if (x < 0) {
    x <- abs(x)
    w <- simpleWarning("Value less than 0. Taking abs(x)")
    w # the condition object is only created and returned here, not signalled
  }
}
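Note that the condition object above is never actually signalled. A sketch of raising it with warning() (the function g() is ours, for illustration; stop() works the same way for errors):

```r
g <- function(x) {
  if (x < 0) {
    # warning() accepts a condition object and signals it,
    # so handlers (e.g. tryCatch) can catch it by class.
    warning(simpleWarning("Value less than 0. Taking abs(x)"))
    x <- abs(x)
  }
  sqrt(x)
}

suppressWarnings(g(-4))  # 2
```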

Debugging – what are my options?

  • Old-school debugging: a lot of print statements
    • print values of your variables at some checkpoints,
    • sometimes fine but often laborious,
    • need to remove/comment out manually after debugging.
  • Dumping frames
    • on error, R state will be saved to a file,
    • file can be read into debugger,
    • values of all variables can be checked,
    • can debug on another machine, e.g. send dump to your colleague!
  • Traceback
    • a list of the recent function calls with values of their parameters
  • Step-by-step debugging
    • execute code line by line within the debugger

Option 1: dumping frames

f <- function(x) { sin(x) }
options(error = quote(dump.frames(dumpto = "assets/testdump", to.file = T)))
f('test')
options(error = NULL) # reset the behavior
load('assets/testdump.rda')
# debugger(testdump)

Hint: entering an empty line brings you back to the environments menu.

Option 2: traceback

f <- function(x) { 
  log10(x) 
}
g <- function(x) { 
  f(x) 
}
g('test')
Error in log10(x): non-numeric argument to mathematical function
> traceback()
2: f(x) at #2
1: g("test")

traceback() shows the stack of function calls, and the arguments passed to them, at the moment the error occurred.

Option 3: step-by-step debugging

Let us define a new function h(x, y):

h <- function(x, y) { 
  f(x) 
  f(y) 
}

Now, we can use debug() to debug the function in a step-by-step manner:

debug(h)
h('text', 7)
undebug(h)
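A base-R alternative worth knowing is a temporary browser() call inside the function body; in an interactive session, execution pauses at that point (n steps to the next line, c continues, Q quits):

```r
f <- function(x) { log10(x) }

h <- function(x, y) {
  browser()  # pauses here in interactive sessions -- remove after debugging!
  f(x)
  f(y)
}

h(100, 10)
```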

Profiling – proc.time()

Profiling is the process of identifying memory and time bottlenecks 🍶 in your code.

proc.time()
   user  system elapsed 
  1.426   0.147   2.113 
  • user time – CPU time charged for the execution of user instructions of the calling process,
  • system time – CPU time charged for execution by the system on behalf of the calling process,
  • elapsed time – real (wall-clock) time elapsed since the process was started.
pt1 <- proc.time()
tmp <- runif(n =  10e5)
pt2 <- proc.time()
pt2 - pt1
   user  system elapsed 
  0.006   0.001   0.007 

Profiling – system.time()

system.time(runif(n = 10e6))
   user  system elapsed 
  0.125   0.006   0.132 
system.time(rnorm(n = 10e6))
   user  system elapsed 
  0.230   0.005   0.235 

An alternative approach is to use the tic() and toc() functions from the tictoc package.

library(tictoc)
tic()
tmp1 <- runif(n = 10e6)
toc()
0.062 sec elapsed

Profiling in action

These 4 functions fill a large vector with values supplied by function f.

1 – loop without memory allocation.

fun_fill_loop1 <- function(n = 10e6, f) {
  result <- NULL
  for (i in 1:n) {
    result <- c(result, eval(call(f, 1)))
  }
  return(result)
}

2 – loop with memory allocation.

fun_fill_loop2 <- function(n = 10e6, f) {
  result <- vector(length = n)
  for (i in 1:n) {
    result[i] <- eval(call(f, 1))
  }
  return(result)
}

Profiling in action cted.

But it may be better to use…

vectorization!

3 – vectorized loop without memory allocation.

fun_fill_vec1 <- function(n = 10e6, f) {
  result <- NULL
  result <- eval(call(f, n))
  return(result)
}

4 – vectorized with memory allocation.

fun_fill_vec2 <- function(n = 10e6, f) {
  result <- vector(length = n)
  result <- eval(call(f, n))
  return(result)
}

Profiling our functions


p1 <- system.time(fun_fill_loop1(n = 10e4, "runif")) # 1 - loop, no alloc
p2 <- system.time(fun_fill_loop2(n = 10e4, "runif")) # 2 - loop, alloc 
p3 <- system.time(fun_fill_vec1(n = 10e4, "runif"))  # 3 - vector, no alloc
p4 <- system.time(fun_fill_vec2(n = 10e4, "runif"))  # 4 - vector, alloc
fn   user.self  sys.self  elapsed
fn1      7.007     1.563    8.570
fn2      0.262     0.015    0.276
fn3      0.001     0.000    0.001
fn4      0.001     0.000    0.000

The system.time() function is not the most accurate, though. During the lab, we will experiment with the microbenchmark package.
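A minimal sketch of microbenchmark usage (assuming the package is installed): each expression is evaluated many times, so we get a distribution of timings rather than one noisy measurement.

```r
library(microbenchmark)

# Growing a vector vs. pre-allocating vs. vectorized call
# (n kept small here so the benchmark itself is quick).
mb <- microbenchmark(
  grow  = { r <- c();           for (i in 1:1000) r <- c(r, runif(1)) },
  alloc = { r <- numeric(1000); for (i in 1:1000) r[i] <- runif(1) },
  vec   = runif(1000),
  times = 20L
)
print(mb)  # min / median / max per expression
```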

More advanced profiling

We can also do a bit more advanced profiling, including memory profiling, using e.g. the Rprof() function.

Rprof('profiler_test.out', interval = 0.01, memory.profiling = T)
for (i in 1:5) {
  result <- fun_fill_loop2(n = 10e4, "runif")
  print(result)
}
Rprof(NULL)

And let us summarise:

summary <- summaryRprof('profiler_test.out', memory='both')
knitr::kable(summary$by.self)
                   self.time self.pct total.time total.pct mem.total
"print.default"         2.84    59.92       2.84     59.92       2.8
"eval"                  0.82    17.30       1.64     34.60     703.2
"runif"                 0.63    13.29       0.63     13.29     275.9
"fun_fill_loop2"        0.24     5.06       1.89     39.87     872.6
"parent.frame"          0.08     1.69       0.08      1.69      48.5
"is.list"               0.04     0.84       0.04      0.84      10.5
"is.pairlist"           0.04     0.84       0.04      0.84       0.0
"baseenv"               0.03     0.63       0.03      0.63       6.4
"close.connection"      0.01     0.21       0.01      0.21       0.0
"findCenvVar"           0.01     0.21       0.01      0.21       1.5

Profiling – profr package

There are also packages available that enable even more advanced profiling:

library(profr)
Rprof("profiler_test2.out", interval = 0.01)
tmp <- table(sort(rnorm(1e5)))
Rprof(NULL)
profile_df <- parse_rprof('profiler_test2.out')

This returns a table that can be visualised:

level g_id t_id f start end n leaf time source
1 1 1 table 0.00 0.46 1 FALSE 0.46 base
1 2 1 cb 0.46 0.48 1 FALSE 0.02 NA
2 1 1 sort 0.00 0.04 1 FALSE 0.04 base
2 2 1 factor 0.04 0.46 1 FALSE 0.42 base
2 3 1 tryCatch 0.46 0.48 1 FALSE 0.02 base
3 1 1 rnorm 0.00 0.02 1 TRUE 0.02 stats
3 2 1 sort.default 0.02 0.04 1 FALSE 0.02 base
3 3 1 unique 0.04 0.28 1 FALSE 0.24 base
3 4 2 tryCatchList 0.46 0.48 1 FALSE 0.02 NA
4 1 1 sort.int 0.02 0.04 1 FALSE 0.02 base
4 2 1 unique.default 0.04 0.28 1 TRUE 0.24 base
4 3 2 tryCatchOne 0.46 0.48 1 FALSE 0.02 NA
5 1 1 order 0.02 0.04 1 TRUE 0.02 base
5 2 2 doTryCatch 0.46 0.48 1 FALSE 0.02 NA
6 1 1 inspect_env 0.46 0.48 1 FALSE 0.02 NA
7 1 1 lapply 0.46 0.48 1 FALSE 0.02 base
8 1 1 FUN 0.46 0.48 1 FALSE 0.02 NA
9 1 1 scalar 0.46 0.48 1 FALSE 0.02 NA
10 1 1 trimws 0.46 0.48 1 FALSE 0.02 base
11 1 1 mysub 0.46 0.48 1 FALSE 0.02 NA
12 1 1 sub 0.46 0.48 1 FALSE 0.02 base
13 1 1 is.factor 0.46 0.48 1 FALSE 0.02 base
14 1 1 mysub 0.46 0.48 1 FALSE 0.02 NA
15 1 1 sub 0.46 0.48 1 FALSE 0.02 base
16 1 1 is.factor 0.46 0.48 1 FALSE 0.02 base
17 1 1 try_capture_str 0.46 0.48 1 FALSE 0.02 NA
18 1 1 tryCatch 0.46 0.48 1 FALSE 0.02 base
19 1 1 tryCatchList 0.46 0.48 1 FALSE 0.02 NA
20 1 1 tryCatchOne 0.46 0.48 1 FALSE 0.02 NA
21 1 1 doTryCatch 0.46 0.48 1 FALSE 0.02 NA
22 1 1 capture_str 0.46 0.48 1 FALSE 0.02 NA
23 1 1 paste0 0.46 0.48 1 FALSE 0.02 base
24 1 1 utils::capture.output 0.46 0.48 1 FALSE 0.02 NA
25 1 1 withVisible 0.46 0.48 1 FALSE 0.02 base
26 1 1 …elt 0.46 0.48 1 FALSE 0.02 NA
27 1 1 utils::str 0.46 0.48 1 FALSE 0.02 NA
28 1 1 str.default 0.46 0.48 1 FALSE 0.02 NA
29 1 1 cat 0.46 0.48 1 FALSE 0.02 base
30 1 1 formObj 0.46 0.48 1 FALSE 0.02 NA
31 1 1 maybe_truncate 0.46 0.48 1 FALSE 0.02 NA
32 1 1 nchar.w 0.46 0.48 1 FALSE 0.02 NA
33 1 1 paste 0.46 0.48 1 FALSE 0.02 base
34 1 1 format.fun 0.46 0.48 1 FALSE 0.02 NA
35 1 1 format 0.46 0.48 1 FALSE 0.02 base
36 1 1 format.default 0.46 0.48 1 FALSE 0.02 base
37 1 1 prettyNum 0.46 0.48 1 FALSE 0.02 base
38 1 1 is.na 0.46 0.48 1 TRUE 0.02 base

Profiling – profr package cted.

We can also plot the results using the proftools package:

library(Rgraphviz)
library(proftools) # depends on "graph" and "Rgraphviz" packages
profile_df2 <- readProfileData("profiler_test2.out")
plotProfileCallGraph(profile_df2, style = google.style, score = "total")

Profiling with profvis

Yet another nice way to profile your code is with the profvis package:

library(profvis)
profvis({fun_fill_loop2(1e4, 'runif')
  fun_fill_vec2(1e4, 'runif')
})

Profiling with profvis cted.

Optimizing your code

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.

– Donald Knuth


source: https://xkcd.com/1319


source: https://xkcd.com/1205

Ways to optimize the code

  • write it in a more efficient way, e.g. use vectorization or the *apply family instead of loops,
  • allocate memory up front to avoid copy-on-modify,
  • use an optimized BLAS library for linear algebra,
  • use the bigmemory package for data that do not fit in RAM,
  • move computations to the GPU,
  • use multicore support, e.g. the parallel or snow packages,
  • use futures,
  • use data.table or tibble instead of data.frame.
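For completeness, a minimal futures sketch (this assumes the future.apply package is installed; plan() selects the backend, everything else stays sequential-looking):

```r
library(future.apply)

plan(multisession, workers = 2)  # run futures in two background R sessions
res <- future_lapply(1:4, function(x) x^2)
plan(sequential)                 # back to ordinary evaluation

unlist(res)  # 1 4 9 16
```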

Copy-on-modify

library(pryr)
order <- 1024
matrix_A <- matrix(rnorm(order^2), nrow = order)
matrix_B <- matrix_A

Check where the objects are in the memory:

address(matrix_A)
[1] "0x118000000"
address(matrix_B)
[1] "0x118000000"

What happens if we modify a value in one of the matrices?

matrix_B[1,1] <- 1
address(matrix_A)
[1] "0x118000000"
address(matrix_B)
[1] "0x118804000"

Avoid copying by allocating memory

No memory allocation

f1 <- function(to = 3, silent=F) {
  tmp <- c()
  for (i in 1:to) {
    a1 <- address(tmp)
    tmp <- c(tmp, i)
    a2 <- address(tmp)
    if (!silent) { print(paste0(a1, " --> ", a2)) } 
  }
}
f1()
[1] "0x15a80f4e0 --> 0x13aab2158"
[1] "0x13aab2158 --> 0x13aaa05c8"
[1] "0x13aaa05c8 --> 0x159317708"

Avoid copying by allocating memory cted.

With memory allocation

f2 <- function(to = 3, silent = FALSE) {
  tmp <- vector(length = to, mode='numeric')
  for (i in 1:to) {
    a1 <- address(tmp)
    tmp[i] <- i
    a2 <- address(tmp)
    if(!silent) { print(paste0(a1, " --> ", a2)) }
  }
}
f2()
[1] "0x129363bb8 --> 0x129363bb8"
[1] "0x129363bb8 --> 0x129363bb8"
[1] "0x129363bb8 --> 0x129363bb8"

Allocating memory – benchmark.

library(microbenchmark)
benchmrk <- microbenchmark(f1(to = 1e3, silent = T), 
                           f2(to = 1e3, silent = T), 
                           times = 100L)
ggplot2::autoplot(benchmrk)

GPU

library(gpuR) # gpuMatrix() and vclMatrix() come from the gpuR package
A = matrix(rnorm(1000^2), nrow=1000) # stored: RAM, computed: CPU
B = matrix(rnorm(1000^2), nrow=1000) 
gpuA = gpuMatrix(A, type = "float") # stored: RAM, computed: GPU
gpuB = gpuMatrix(B, type = "float")
vclA = vclMatrix(A, type = "float") # stored: GPU, computed: GPU
vclB = vclMatrix(B, type = "float")
bch <- microbenchmark(
  cpu_ram = A %*% B,
  gpu_ram = gpuA %*% gpuB,
  gpu_vcl = vclA %*% vclB, 
  times = 10L) 

More on Charles Determan’s Blog.

GPU cted.

ggplot2::autoplot(bch)

Parallelization using package parallel

The easiest construct to parallelize is lapply():

result <- lapply(1:2, function(x) { c(x, x^2, x^3) })
result
[[1]]
[1] 1 1 1

[[2]]
[1] 2 4 8
library(parallel)
num_cores <- detectCores() - 1
cl <- makeCluster(num_cores) # Init cluster
parLapply(cl, 1:2, function(x) { c(x, x^2, x^3)} )
stopCluster(cl)
[[1]]
[1] 1 1 1

[[2]]
[1] 2 4 8
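On Unix-alikes there is an even shorter route: mclapply() forks the current process, so no cluster setup is needed (this sketch assumes a Unix-like OS; on Windows, use the parLapply() approach above):

```r
library(parallel)

# mc.cores forks that many workers; on Windows only mc.cores = 1 works.
res <- mclapply(1:2, function(x) { c(x, x^2, x^3) }, mc.cores = 2)
res[[2]]  # 2 4 8
```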

Thank you! Questions?

platform aarch64-apple-darwin20.0.0
os       darwin20.0.0              
major    4                         
minor    2.2                       

2023 • SciLifeLab • NBIS • RaukR