Reticulate
RaukR 2023 • Advanced R for Bioinformatics
27-Jun-2023
Learning outcomes
In this session we will learn to:
Understand the concepts needed for running Python in R
Understand the different object classes in Python and their equivalent in R
Apply our knowledge to:
Import Python functions into R
Use R objects as input to Python functions
Translate between Python and R objects
Introduction
R versus Python: the ultimate fight!
Introducing reticulate
Combine Python and R code
Use R classes in Python functions and vice versa
Import Python functions into R code and run from R
Add Python code chunks to markdown documents
Importing Python modules
datetime <- import("datetime")
todays_r_date <- datetime$datetime$now()
todays_r_date
class(todays_r_date)
[1] "2023-08-04 11:00:22 UTC"
[1] "POSIXct" "POSIXt"
Objects are automatically converted to R types unless convert = FALSE is specified:
datetime <- import("datetime", convert = FALSE)
todays_py_date <- datetime$datetime$now()
todays_py_date
class(todays_py_date)
datetime.datetime(2023, 8, 4, 11, 0, 22, 351864)
[1] "datetime.datetime" "datetime.date" "python.builtin.object"
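For comparison, the same call written directly in Python might look like the sketch below. It uses only the standard library, and the variable names are illustrative. The class names along the method resolution order appear to correspond to the class vector R reports for the unconverted object, with reticulate showing Python's builtins.object as python.builtin.object.

```python
import datetime

# Python equivalent of datetime$datetime$now() in R
todays_py_date = datetime.datetime.now()

# Fully qualified class names along the method resolution order
mro_names = [
    f"{cls.__module__}.{cls.__qualname__}"
    for cls in type(todays_py_date).__mro__
]

print(repr(todays_py_date))
print(mro_names)  # ['datetime.datetime', 'datetime.date', 'builtins.object']
```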
Importing built-in Python functions
Access Python’s built-in functions directly in R
builtins <- import_builtins()
r_vec <- c(1, 5, 3, 4, 2, 2, 3, 2)
str(r_vec)
num [1:8] 1 5 3 4 2 2 3 2
r_vec is an R object.
builtins$len(r_vec); builtins$max(r_vec)
Python's built-in functions still work on R objects.
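A minimal sketch of what the two built-ins receive on the Python side, assuming the numeric vector arrives as a list of floats (the usual conversion for a multi-element vector). The list below is written by hand to stand in for the converted r_vec.

```python
# Hand-written stand-in for r_vec after conversion: a list of Python floats
r_vec = [1.0, 5.0, 3.0, 4.0, 2.0, 2.0, 3.0, 2.0]

length = len(r_vec)   # number of elements
largest = max(r_vec)  # largest element

print(length, largest)  # 8 5.0
```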
Sourcing scripts
Import your own Python functions for use in R. File python_functions.py:
def add(x, y):
    return x + y
R code:
source_python("python_functions.py")
class(4)
res <- add(4, 5)
res
class(res)
[1] "numeric"
[1] 9
[1] "numeric"
Type numeric in and type numeric out. But what happens in between?
File python_functions.py:
def add_with_print(x, y):
    print(x, 'is of the python type', type(x))
    return x + y
R code:
res2 <- add_with_print(4, 5)
py_capture_output(add_with_print(4, 5))
str(res2)
[1] "4.0 is of the python type <class 'float'>\n\n"
num 9
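The float in the captured output comes from R's default numeric type. The sketch below reproduces the effect in plain Python by calling a copy of add_with_print with floats (what R's 4 and 5 become) and with integers (what 4L and 5L would become). The capture helper is hypothetical and only mimics the idea of py_capture_output using the standard library.

```python
import contextlib
import io


def add_with_print(x, y):
    print(x, 'is of the python type', type(x))
    return x + y


def capture(func, *args):
    """Hypothetical helper: run func and return (result, printed text)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args)
    return result, buffer.getvalue()


# R's 4 and 5 are doubles, so Python receives floats
res_float, out_float = capture(add_with_print, 4.0, 5.0)

# R's 4L and 5L are integers, so Python would receive ints
res_int, out_int = capture(add_with_print, 4, 5)

print(out_float, end="")  # 4.0 is of the python type <class 'float'>
print(out_int, end="")    # 4 is of the python type <class 'int'>
```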
Execute Python code
Run a Python string:
py_run_string("result = [1,2,3]*2")
py$result
All objects created by Python are accessible through the py object exported by reticulate.
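Note that * on a Python list repeats the list rather than multiplying each element, unlike c(1, 2, 3) * 2 in R. The sketch below runs the same string with exec into a plain dictionary, which is only a rough stand-in for py_run_string and the py object, and contrasts it with an element-wise version.

```python
# Rough stand-in for py_run_string(): execute the string in its own namespace
namespace = {}
exec("result = [1,2,3]*2", namespace)

# Rough stand-in for py$result: look the variable up by name
result = namespace["result"]
print(result)  # [1, 2, 3, 1, 2, 3]

# Element-wise multiplication needs an explicit loop or comprehension
elementwise = [value * 2 for value in [1, 2, 3]]
print(elementwise)  # [2, 4, 6]
```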
Execute Python code
Run the Python script my_python_script.py:
def add(x, y):
    return x + y
def multiply_by_3(x):
    return x * 3
def run_all():
    x = 5
    y = 8
    added = add(x, y)
    final = multiply_by_3(added)
    return final
final = run_all()
R code:
py_run_file("my_python_script.py")
py$final
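To see which value ends up in final, the script can also be run from Python itself. The sketch below writes the script to a temporary file and executes it with runpy.run_path from the standard library, which returns the script's global variables as a dictionary; this is an analogy to py_run_file and py$final, not how reticulate is implemented.

```python
import os
import runpy
import tempfile

# Contents of my_python_script.py, as shown on the slide
script = '''
def add(x, y):
    return x + y

def multiply_by_3(x):
    return x * 3

def run_all():
    x = 5
    y = 8
    added = add(x, y)
    final = multiply_by_3(added)
    return final

final = run_all()
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = os.path.join(tmp_dir, "my_python_script.py")
    with open(script_path, "w") as handle:
        handle.write(script)
    # Execute the file and collect its global variables
    script_globals = runpy.run_path(script_path)

print(script_globals["final"])  # 39, i.e. (5 + 8) * 3
```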
Python in R Markdown
In R Markdown, it is possible to mix in Python chunks:
```{python}
import pandas as pd
movies = get_all_movies()
print(type(movies))
```
<class 'pandas.core.frame.DataFrame'>
Python in R Markdown
Access the movies object through the py object, which converts it to an R object:
movies_r <- py$movies
movies_r <- as_tibble(movies_r)
subset <- movies_r %>% select(5:6, 8:10)
knitr::kable(subset[1:7, ], 'html')
Kate & Leopold | 2001 | 118 | Comedy,Fantasy,Romance | 6.4
The Brain That Wouldn't Die | 1962 | 82 | Horror,Sci-Fi | 4.4
The Fugitive Kind | 1960 | 119 | Drama,Romance | 7.1
Les yeux sans visage | 1960 | 90 | Drama,Horror | 7.7
À bout de souffle | 1960 | 90 | Crime,Drama | 7.8
13 Ghosts | 1960 | 85 | Horror,Mystery | 6.1
The Alamo | 1960 | 162 | Adventure,Drama,History | 6.8
Python in R Markdown
Continue working with the now converted R object in R
ggplot(movies_r, aes(x = startYear)) + geom_bar() +
  theme(axis.text.x = element_text(angle = 90)) +
  ggtitle('Number of movies per year')
Type conversions
When Python code is called from R, R data types are converted to their Python equivalents. When values are returned from Python to R, they are converted back to R types.
Conversion table
| R | Python | Examples |
|---|---|---|
| Single-element vector | Scalar | 1, 1L, TRUE, "foo" |
| Multi-element vector | List | c(1.0, 2.0, 3.0), c(1L, 2L, 3L) |
| List of multiple types | Tuple | list(1L, TRUE, "foo") |
| Named list | Dict | list(a = 1L, b = 2.0), dict(x = x_data) |
| Matrix/Array | NumPy ndarray | matrix(c(1,2,3,4), nrow=2, ncol=2) |
| Data Frame | Pandas DataFrame | data.frame(x = c(1,2,3), y = c("a","b","c")) |
| Function | Python function | function(x) x + 1 |
| Raw | Python bytearray | as.raw(c(1:10)) |
| NULL, TRUE, FALSE | None, True, False | NULL, TRUE, FALSE |
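As a quick reference for the Python column, the sketch below lists hand-written Python values that correspond to some of the R examples in the table and prints their type names. The values are stand-ins typed in by hand, not objects produced by reticulate, and the NumPy and pandas rows are left out so that the example needs only the standard library.

```python
# Hand-written Python counterparts of some R examples from the table
examples = {
    "single-element vector, 1": 1.0,
    "single-element vector, 1L": 1,
    "multi-element vector, c(1.0, 2.0, 3.0)": [1.0, 2.0, 3.0],
    "named list, list(a = 1L, b = 2.0)": {"a": 1, "b": 2.0},
    "raw, as.raw(c(1:10))": bytearray(range(1, 11)),
    "NULL": None,
    "TRUE": True,
}

# Map each label to the name of the Python type of its value
type_names = {label: type(value).__name__ for label, value in examples.items()}

for label, name in type_names.items():
    print(f"{label}: {name}")
```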
Type conversions
python_functions.py:
def check_python_type(x):
    print(type(x))
    return x
R code:
source_python("python_functions.py")
r_var <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
class(r_var)
py_capture_output(check_python_type(r_var))
r_var2 <- check_python_type(r_var)
class(r_var2)
[1] "matrix" "array"
[1] "<class 'numpy.ndarray'>\n\n"
[1] "matrix" "array"
Type conversions
source_python("python_functions.py", convert = FALSE)
r_var <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
class(r_var)
py_capture_output(check_python_type(r_var))
r_var2 <- check_python_type(r_var)
class(r_var2)
r_var3 <- py_to_r(r_var2)
class(r_var3)
[1] "matrix" "array"
[1] "<class 'numpy.ndarray'>\n\n"
[1] "numpy.ndarray" "python.builtin.object"
[1] "matrix" "array"
Type conversions
In R, 42 is a floating-point number; in Python, the literal 42 is an integer. Use the L suffix in R to pass an integer:
str(42)
check_python_type(42)
py_capture_output(check_python_type(42))
num 42
42.0
[1] "<class 'float'>\n\n"
str(42L)
check_python_type(42L)
py_capture_output(check_python_type(42L))
int 42
42
[1] "<class 'int'>\n\n"
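The distinction matters whenever a Python function insists on an integer. The sketch below uses a hypothetical helper, repeat_label, that passes its argument to range(), which rejects floats; calling it with 42.0 (what R's 42 becomes) fails, while 42 (what R's 42L becomes) works.

```python
def repeat_label(label, times):
    """Hypothetical helper: range() only accepts integers."""
    return [label for _ in range(times)]


# Integer input, as produced by 42L in R
ok = repeat_label("x", 42)

# Float input, as produced by a plain 42 in R
try:
    repeat_label("x", 42.0)
    float_error = None
except TypeError as error:
    float_error = str(error)

print(len(ok))      # 42
print(float_error)  # 'float' object cannot be interpreted as an integer
```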
Type conversions
Single-element vectors are automatically translated to Python scalars. Wrapping the value in list() yields a Python list instead:
str(c(24))
check_python_type(c(24))
py_capture_output(check_python_type(c(24)))
num 24
24.0
[1] "<class 'float'>\n\n"
str(list(24))
check_python_type(list(24))
py_capture_output(check_python_type(list(24)))
List of 1
$ : num 24
[24.0]
[1] "<class 'list'>\n\n"
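This is worth keeping in mind for Python functions that expect a sequence. In the sketch below, mean_of is a hypothetical helper that iterates over its argument: it works on the list that list(24) turns into, but raises a TypeError on the bare float that c(24) turns into.

```python
def mean_of(values):
    """Hypothetical helper that expects an iterable of numbers."""
    return sum(values) / len(values)


# A one-element list, as produced by list(24) in R
list_result = mean_of([24.0])

# A bare scalar, as produced by c(24) in R
try:
    mean_of(24.0)
    scalar_error = None
except TypeError as error:
    scalar_error = str(error)

print(list_result)   # 24.0
print(scalar_error)  # 'float' object is not iterable
```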