The objective of this lab is to improve your coding skills by focusing on coding style, code benchmarking and optimization. Below, you will find a number of tasks connected to the topics covered in the Best Coding Practices lecture. Some tasks extend lectures content and require you to find some more information online. Please, note that while we are providing example solutions to many tasks, these are only examples. If you solve a task in a different way it does not matter your solution is wrong. In fact, it may be better than our solution. If in doubt, ask PI for help. We are here for you!
Which of the following are valid/good variable names in R. What is wrong with the ones that are invalid/bad?
var1
, 3way_handshake
, .password
, __test__
, my-matrix-M
, three.dimensional.array
, 3D.distance
, .2objects
, wz3gei92
, next
, P
, Q
, R
, S
, T
, X
, is.larger?
The code below works, but can be improved. Do improve it!
myIterAtoR.max <- 5
second_iterator.max<-7
col.NUM= 10
row.cnt =10
fwzy45 <- matrix(rep(1, col.NUM*row.cnt),nrow=row.cnt)
for(haystack in (2-1):col.NUM){
for(needle in 1:row.cnt) {
if(haystack>=myIterAtoR.max){
fwzy45[haystack, needle]<-NA}
}}
iter_max <- 5
col_num <- 10
row_num <- 10
A <- matrix(rep(1, col_num * row_num), nrow = row_num)
for (i in 1:col_num) {
for (j in 1:row_num) {
if (i >= iter_max) {
A[i, j] <- NA
}
}
}
# Can you improve the code more by eliminating loops or at least one of them?
Improve formatting and style of the following code:
simulate_genotype <- function( q, N=100 ) {
if( length(q)==1 ){
p <- (1 - q)
f_gt <- c(p^2, 2*p*q, q^2) # AA, AB, BB
}else{
f_gt<-q
}
tmp <- sample( c('AA','AB','BB'), size =N, prob=f_gt, replace=T )
return(tmp)
}
simulate_genotype <- function(q, N = 100) {
if (length(q) == 1) {
p <- (1 - q)
f_gt <- c(p^2, 2*p*q, q^2) # AA, AB, BB
} else {
f_gt <- q
}
genotype <- sample(c('AA', 'AB', 'BB'),
size = N,
prob = f_gt,
replace = T)
return(genotype)
}
Modify the function below so that it works with R pipes %>%
:
my_filter <- function(threshold = 1, data, scalar = 5) {
data[data >= threshold] <- NA
data <- data * scalar
return(data)
}
Note you need to have the magrittr
or tidyverse
package loaded in order to be able to use the pipe %>%
!
my_filter <- function(data, threshold = 1, scalar = 5) {
data[data >= threshold] <- NA
data <- data * scalar
return(data)
}
# Test:
c(-5, 5) %>% my_filter()
Is the code below correct? Can it be improved?
simulate_phenotype <- function(pop_params, gp_map, gtype) {
pop_mean <- pop_params[1]
pop_var <- pop_params[2]
pheno <- rnorm(n = N, mean = pop_mean, sd = sqrt(pop_var))
effect <- rep(0, times = length(N))
for (gt_iter in c('AA', 'AB', 'BB')) {
effect[gtype == gt_iter] <- rnorm(n = sum(gtype == gt_iter),
mean = gp_map[gt_iter, 'mean_eff'],
sd = sqrt(gp_map[gt_iter, 'var_eff']))
}
dat <- data.frame(gt = gtype, raw_pheno = pheno, effect = effect, pheno = pheno + effect)
return(dat)
}
Maybe some small improvements can be done, but in principle the code is clean! Except that... the N is not initialized anywhere.
Write a modular code (function or functions) that computes the sample standard deviation given a vector of numbers. Decide how to logically structure the code. Assume there are no built-in R functions for computing mean and variance available. The formula for variance is: \(SD = \sqrt{\frac{\Sigma_{i=1}^{N}(x_i - \bar{x})^2}{N-1}}\). Standard deviation is \(Var=SD^2\).
Hint: consider that you may want to re-use some computed values in future, e.g. variance.
sample_mean <- function(x) {
mean <- sum(x) / length(x)
return(mean)
}
sum_squared_deviations <- function(x) {
tmp <- (x - sample_mean(x)) ^ 2
sum_sq_dev <- sum(tmp)
return(sum_sq_dev)
}
std_dev <- function(x) {
variance <- sqrt(sum_squared_deviations(x) / (length(x) - 1))
return(variance)
}
variance <- function(x) {
return(std_dev(x) ^ 2)
}
You found two functions in two different packages: the randomSampleInt
function that generates a random sample of integer numbers and the randomSampleLetter
function for generating a random sample of letters. Unfortunately, the functions are called in different ways which you want to unify in order to use them interchangeably in your code. Write a wrapper function around the randomSampleLetter
that will provide the same interface to the function as the randomSampleInt
. Also, the randomSampleLetter
cannot handle the seed. Can you add this feature to your wrapper?
randomSampleInt <- function(x, verbose, length, seed = 42) {
if (verbose) {
print(paste0('Generating random sample of ', length, ' integers using seed ', seed))
}
set.seed(seed)
sampleInt <- sample(x = x, size = length, replace = TRUE)
return(sampleInt)
}
randomSampleLetter <- function(N, silent=T, lett) {
if (!silent) {
print(paste0('Generating random sample of ', N, ' letters.'))
}
sample <- sample(x = lett, size = N, replace = TRUE)
return(sample)
}
randomSampleLetterWrapper <- function(x, verbose, length, seed = 42) {
set.seed(seed)
result <- randomSampleLetter(N = length, silent = !verbose, lett = x)
return(result)
}
plot
.Write a wrapper around the graphics::plot
function that modifies its default behaviour so that it plots red crosses instead of black points. Do it in a way that enables the user to modify other function arguments. Hint: you may want to have a look at graphics::plot.default
.
my_plot <- function(x, ...) {
plot(x, pch = 3, color = 'red', ...)
}
What if you want to pass some additional parameters to a function and, sadly, the authors forgot to add ...
to the list of function arguments. There is a way out – you can bind extra arguments supplied as alist
structure to the original function arguments retrieved by formals
. Try to fix the function below, so that the call red_plot(1, 1, col='red', pch=19)
will result in points being represented by red circles. Do use alist
and formals
and do not edit the red_plot
itself! Hint: read help for alist
and formals
. Original function:
red_plot <- function(x, y) {
plot(x, y, las=1, cex.axis=.8, ...)
}
red_plot <- function(x, y) {
plot(x, y, las=1, cex.axis=.8, ...)
}
red_plot(1, 1, col='red', pch=19) # Does not work.
formals(red_plot) <- c(formals(red_plot), alist(... = )) # Fix.
red_plot(1, 1, col='red', pch=19) # Works!
options
.Use options
to change the default prompt in R to hello :-) >
.
Check what options are stored in the hidden variable called Options.
options(prompt = "hello :-) > ")
.Options
options(prompt = "> ") # restoring the default
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] bsplus_0.1.2 fontawesome_0.2.1 captioner_2.2.3 bookdown_0.22
## [5] knitr_1.33
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.6 lubridate_1.7.10 digest_0.6.27 magrittr_2.0.1
## [5] evaluate_0.14 rlang_0.4.11 stringi_1.6.2 generics_0.1.0
## [9] rmarkdown_2.8 tools_4.0.3 stringr_1.4.0 xfun_0.23
## [13] yaml_2.2.1 compiler_4.0.3 htmltools_0.5.1.1
Page built on: 11-Jun-2021 at 14:54:03.
2018 | SciLifeLab > NBIS > RaukR