Reticulate

RaukR 2023 • Advanced R for Bioinformatics

Nina Norgren

27-Jun-2023

Learning outcomes


In this session we will learn to:

  • Understand the concepts needed for running Python in R
  • Understand the different object classes in Python and their equivalents in R
  • Apply our knowledge to:
    • Import Python functions into R
    • Use R objects as input to Python functions
    • Translate between Python and R objects

Introduction



R versus Python: the ultimate fight!




Not anymore!

Introducing reticulate

  • Combine Python and R code
  • Use R classes in Python functions and vice versa
  • Import Python functions into R code and run from R
  • Add Python code chunks to markdown documents
library(reticulate)

Importing Python modules

datetime <- import("datetime")
todays_r_date <- datetime$datetime$now()
todays_r_date
class(todays_r_date)
[1] "2023-08-04 11:00:22 UTC"
[1] "POSIXct" "POSIXt" 

Objects are automatically converted to R types, unless otherwise specified

datetime <- import("datetime", convert = FALSE)
todays_py_date <- datetime$datetime$now()
todays_py_date
class(todays_py_date)
datetime.datetime(2023, 8, 4, 11, 0, 22, 351864)
[1] "datetime.datetime"     "datetime.date"         "python.builtin.object"
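
The class vector printed for the unconverted object appears to mirror the Python class hierarchy. A minimal Python-side sketch, using only the standard library, lists that hierarchy through the method resolution order. That the last R entry, "python.builtin.object", corresponds to Python's base object class is an assumption here.

```python
import datetime

# Same call as datetime$datetime$now() in the R code above
now = datetime.datetime.now()

# Walk the class hierarchy (method resolution order) of the returned object
hierarchy = [cls.__name__ for cls in type(now).__mro__]
print(hierarchy)  # ['datetime', 'date', 'object']
```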

Importing built-in Python functions

Access Python’s built-in functions directly in R

builtins <- import_builtins()
r_vec <- c(1, 5, 3, 4, 2, 2, 3, 2)
str(r_vec)
 num [1:8] 1 5 3 4 2 2 3 2

r_vec is an R object.

builtins$len(r_vec); builtins$max(r_vec)
[1] 8
[1] 5

Python built-in functions still work on R objects

max(r_vec)
[1] 5

The usual R way gives the same result
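
For comparison, a sketch of what the two built-ins see on the Python side. It assumes, in line with the conversion table in this deck, that the numeric R vector arrives as a Python list of floats; the name py_vec is made up for illustration.

```python
# Assumed Python-side view of r_vec: a list of floats
py_vec = [1.0, 5.0, 3.0, 4.0, 2.0, 2.0, 3.0, 2.0]

length = len(py_vec)   # number of elements
largest = max(py_vec)  # largest element, still a float

print(length)   # 8
print(largest)  # 5.0
```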

Sourcing scripts

Import your own Python functions for use in R. File python_functions.py:

def add(x, y):
  return x + y

R code:

source_python("python_functions.py")
class(4)
res <- add(4,5)
res
class(res)
[1] "numeric"
[1] 9
[1] "numeric"

Type numeric in and type numeric out. But what happens in between?

Sourcing scripts

But what happens in between?

File python_functions.py:

def add_with_print(x, y):
  print(x, 'is of the python type ', type(x))
  return x + y

R code:

res2 <- add_with_print(4,5)
py_capture_output(add_with_print(4,5))
str(res2)
[1] "4.0 is of the python type  <class 'float'>\n\n"
 num 9
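
The same function can be tried directly in Python to see why the output reads 4.0 rather than 4. This is a sketch, assuming an R numeric arrives as a Python float and an R integer (4L) as a Python int. Capturing stdout with contextlib is used here only as a rough stand-in for py_capture_output().

```python
import contextlib
import io

def add_with_print(x, y):
    print(x, 'is of the python type ', type(x))
    return x + y

# Capture stdout, similar in spirit to py_capture_output() in R
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    res_float = add_with_print(4.0, 5.0)  # what R's 4 and 5 are assumed to arrive as
    res_int = add_with_print(4, 5)        # what R's 4L and 5L are assumed to arrive as

captured = buffer.getvalue().splitlines()
print(captured[0])  # 4.0 is of the python type  <class 'float'>
print(captured[1])  # 4 is of the python type  <class 'int'>
```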

Execute Python code

Run a Python string:

py_run_string("result = [1,2,3]*2")
py$result
[1] 1 2 3 1 2 3

All objects created by Python are accessible through the py object exported by reticulate
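
The result may surprise R users: multiplying a Python list by 2 repeats it, whereas c(1, 2, 3) * 2 in R doubles each element. A small sketch of the difference, where a plain dict stands in for the Python namespace (an assumption for illustration; reticulate manages the real one itself):

```python
# A plain dict stands in for the namespace the string is run in
namespace = {}
exec("result = [1,2,3]*2", namespace)

# Multiplying a Python list repeats it; it does not multiply element-wise
print(namespace["result"])  # [1, 2, 3, 1, 2, 3]

# Element-wise doubling, the equivalent of c(1, 2, 3) * 2 in R
doubled = [value * 2 for value in [1, 2, 3]]
print(doubled)  # [2, 4, 6]
```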

Execute Python code

Run the Python script my_python_script.py:

def add(x, y):
  return x + y

def multiply_by_3(x):
  return x*3

def run_all():
  x = 5
  y = 8
  added = add(x, y)
  final = multiply_by_3(added)
  return final

final = run_all()

R code:

py_run_file("my_python_script.py")
py$final
[1] 39
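
To check the value 39 without R, the script can be run from plain Python. The sketch below uses the standard-library runpy module as a stand-in; it is not a description of how py_run_file() works internally. The script text is written to a temporary file only to keep the example self-contained.

```python
import os
import runpy
import tempfile

script = '''
def add(x, y):
  return x + y

def multiply_by_3(x):
  return x*3

def run_all():
  x = 5
  y = 8
  added = add(x, y)
  final = multiply_by_3(added)
  return final

final = run_all()
'''

# Write the script to a temporary file so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_python_script.py")
    with open(path, "w") as handle:
        handle.write(script)
    # run_path executes the file and returns its top-level names as a dict
    script_globals = runpy.run_path(path)

print(script_globals["final"])  # 39
```

The top-level variable final plays the same role here as py$final does on the R side.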

Python in R Markdown

In R Markdown, it is possible to mix in Python chunks:

```{python}
import pandas as pd

movies = get_all_movies()
print(type(movies))
```
<class 'pandas.core.frame.DataFrame'>

Python in R Markdown

Access the movies object through the py object, which converts it to an R data frame:

library(dplyr)  # provides as_tibble(), select() and %>%

movies_r <- py$movies
movies_r <- as_tibble(movies_r)
movies_subset <- movies_r %>% select(5:6, 8:10)
knitr::kable(movies_subset[1:7,], 'html')
originalTitle                startYear  runtimeMinutes  genres                   averageRating
Kate & Leopold               2001       118             Comedy,Fantasy,Romance   6.4
The Brain That Wouldn't Die  1962       82              Horror,Sci-Fi            4.4
The Fugitive Kind            1960       119             Drama,Romance            7.1
Les yeux sans visage         1960       90              Drama,Horror             7.7
À bout de souffle            1960       90              Crime,Drama              7.8
13 Ghosts                    1960       85              Horror,Mystery           6.1
The Alamo                    1960       162             Adventure,Drama,History  6.8

Python in R Markdown

Continue working with the converted object in R:

library(ggplot2)

ggplot(movies_r, aes(x = startYear)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 90)) +
  ggtitle('Number of movies per year')

Type conversions

When Python code is called from R, R data types are converted to Python types. When values are returned from Python to R, they are converted back to R types.

Conversion table

R                       Python             Examples
Single-element vector   Scalar             1, 1L, TRUE, "foo"
Multi-element vector    List               c(1.0, 2.0, 3.0), c(1L, 2L, 3L)
List of multiple types  Tuple              list(1L, TRUE, "foo")
Named list              Dict               list(a = 1L, b = 2.0), dict(x = x_data)
Matrix/Array            NumPy ndarray      matrix(c(1,2,3,4), nrow=2, ncol=2)
Data Frame              Pandas DataFrame   data.frame(x = c(1,2,3), y = c("a","b","c"))
Function                Python function    function(x) x + 1
Raw                     Python bytearray   as.raw(c(1:10))
NULL, TRUE, FALSE       None, True, False  NULL, TRUE, FALSE
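
For readers less familiar with Python, the built-in types named in the Python column look like this. It is a sketch only: the values are chosen to echo the examples in the table, the dictionary name examples is made up, and NumPy arrays and pandas DataFrames are left out to avoid extra dependencies.

```python
# One Python value for each built-in type named in the table's Python column
examples = {
    "Scalar": 1.0,
    "List": [1.0, 2.0, 3.0],
    "Tuple": (1, True, "foo"),
    "Dict": {"a": 1, "b": 2.0},
    "Python function": lambda x: x + 1,
    "Python bytearray": bytearray(range(1, 11)),
    "None": None,
}

# Print each label next to the name of the Python type of its value
for label, value in examples.items():
    print(f"{label:17} {type(value).__name__}")
```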

Type conversions

python_functions.py:

def check_python_type(x):
  print(type(x))
  return x

R code:

source_python("python_functions.py")

r_var <- matrix(c(1,2,3,4),nrow=2, ncol=2)
class(r_var)
py_capture_output(check_python_type(r_var))
r_var2 <- check_python_type(r_var)
class(r_var2)
[1] "matrix" "array" 
[1] "<class 'numpy.ndarray'>\n\n"
[1] "matrix" "array" 

Type conversions

source_python("python_functions.py", convert=FALSE)

r_var <- matrix(c(1,2,3,4),nrow=2, ncol=2)
class(r_var)
py_capture_output(check_python_type(r_var))
r_var2 <- check_python_type(r_var)
class(r_var2)
r_var3 <- py_to_r(r_var2)
class(r_var3)
[1] "matrix" "array" 
[1] "<class 'numpy.ndarray'>\n\n"
[1] "numpy.ndarray"         "python.builtin.object"
[1] "matrix" "array" 

Type conversions

  • 42 in R is a floating-point number, whereas 42 in Python is an integer; write 42L in R to pass an integer
str(42)
check_python_type(42)
py_capture_output(check_python_type(42))
 num 42
42.0
[1] "<class 'float'>\n\n"
str(42L)
check_python_type(42L)
py_capture_output(check_python_type(42L))
 int 42
42
[1] "<class 'int'>\n\n"
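
The distinction matters because some Python APIs accept only integers. A short sketch using the built-in range(), assuming a bare R number arrives as a float and an R integer such as 3L arrives as an int:

```python
# Both values compare equal, but they are different types
print(42 == 42.0)           # True
print(type(42).__name__)    # int
print(type(42.0).__name__)  # float

# Integer-only APIs accept an int (what R's 3L is assumed to arrive as) ...
indices = list(range(3))
print(indices)              # [0, 1, 2]

# ... but reject a float (what R's plain 3 is assumed to arrive as)
try:
    list(range(3.0))
    float_accepted = True
except TypeError as error:
    float_accepted = False
    print("TypeError:", error)
```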

Type conversions

  • Single-element vectors are automatically converted to Python scalars; wrap the value in list() to pass a Python list instead
str(c(24))
check_python_type(c(24))
py_capture_output(check_python_type(c(24)))
 num 24
24.0
[1] "<class 'float'>\n\n"
str(list(24))
check_python_type(list(24))
py_capture_output(check_python_type(list(24)))
List of 1
 $ : num 24
[24.0]
[1] "<class 'list'>\n\n"
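
Whether a value arrives as a scalar or a list matters as soon as the Python function treats its argument as a sequence. A minimal sketch, where count_items is a hypothetical helper and not part of python_functions.py:

```python
def count_items(x):
    # Hypothetical helper: works only on objects that have a length
    return len(x)

# A one-element list, as list(24) arrives in the output above
n_items = count_items([24.0])
print(n_items)  # 1

# A bare float, as c(24) arrives in the output above, has no length
try:
    count_items(24.0)
    scalar_has_length = True
except TypeError as error:
    scalar_has_length = False
    print("TypeError:", error)
```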

Thank you! Questions?

2023 • SciLifeLab • NBIS • RaukR