RaukR 2024 • Advanced R for Bioinformatics
Nina Norgren
21-Jun-2024
In this session we will learn to call Python code from R using the reticulate package.
R versus Python: the ultimate fight!
Not anymore!
Objects are automatically converted to R types, unless otherwise specified
Access Python’s built-in functions directly in R
num [1:8] 1 5 3 4 2 2 3 2
r_vec is an R object.
Python built-in functions still work on R objects.
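reticulate's import_builtins() gives direct access to Python's built-in functions from R; a small sketch (not the original slide code), applied to r_vec:

builtins <- import_builtins()   # Python's builtins module as an R object
builtins$len(r_vec)             # Python len() on the R vector: 8 elements
builtins$max(r_vec)             # Python max(): 5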
Import your own Python functions for use in R: define them in a file such as python_functions.py, source that file with reticulate, and call the functions from R code (see the sketch below).
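As a minimal hypothetical sketch (the function name add2 and its body are assumptions), python_functions.py could define a simple function that is then sourced with reticulate and called like any R function:

# python_functions.py might contain, for example:
#   def add2(x, y):
#       return x + y

library(reticulate)
source_python("python_functions.py")  # exposes add2() in the R session
add2(3, 4)                            # numeric in, numeric out: 7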
Type numeric in and type numeric out. But what happens in between?
File python_functions.py:
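A plausible reconstruction of the helper used in the conversion example further down is a function that prints the Python type of its argument and returns it unchanged (this exact definition is an assumption):

def check_python_type(var):
    print(type(var))
    return var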
Run a string of Python code:
All objects created by Python are accessible using the py object exported by reticulate.
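A minimal sketch (the Python code in the string is only an illustration):

py_run_string("x = [1, 2, 3]")  # execute a string of Python code in the embedded session
py$x                            # the Python variable x is now reachable from R via py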
Run the Python script my_python_script.py:
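A sketch, assuming my_python_script.py simply creates some objects when executed:

py_run_file("my_python_script.py")   # run the whole script in the Python session
# objects created by the script are again available through py, e.g. py$some_result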
In R Markdown, it is possible to mix in Python chunks:
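As a minimal sketch (the input file name movies.csv is an assumption), such a chunk could read the movie table into a pandas DataFrame and report its class:

```{python}
import pandas as pd
# the file name is a placeholder for the course data set
movies = pd.read_csv("movies.csv")
print(type(movies))
```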
<class 'pandas.core.frame.DataFrame'>
Access the movies object using the py object, which will convert movies to an R object:
library(dplyr)                            # provides as_tibble(), select() and the pipe
movies_r <- py$movies                     # py$movies converts the pandas DataFrame to an R data frame
movies_r <- as_tibble(movies_r)
subset <- movies_r %>% select(5:6, 8:10)  # keep a few columns for display
knitr::kable(subset[1:7, ], 'html')
| originalTitle | startYear | runtimeMinutes | genres | averageRating |
|---|---|---|---|---|
| Kate & Leopold | 2001 | 118 | Comedy,Fantasy,Romance | 6.4 |
| The Brain That Wouldn't Die | 1962 | 82 | Horror,Sci-Fi | 4.4 |
| The Fugitive Kind | 1960 | 119 | Drama,Romance | 7.1 |
| Les yeux sans visage | 1960 | 90 | Drama,Horror | 7.7 |
| À bout de souffle | 1960 | 90 | Crime,Drama | 7.8 |
| 13 Ghosts | 1960 | 85 | Horror,Mystery | 6.1 |
| The Alamo | 1960 | 162 | Adventure,Drama,History | 6.8 |
Continue working with the now-converted object as a regular R object.
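For instance (a small illustration, not from the original material), standard dplyr verbs work directly on movies_r:

movies_r %>%
  filter(startYear == 1960) %>%          # keep titles from 1960
  arrange(desc(averageRating)) %>%       # highest-rated first
  select(originalTitle, averageRating)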
When calling Python code from R, R data types are converted to the corresponding Python types. When values are returned from Python to R, they are converted back to R types.
| R | Python | Examples |
|---|---|---|
| Single-element vector | Scalar | 1, 1L, TRUE, "foo" |
| Multi-element vector | List | c(1.0, 2.0, 3.0), c(1L, 2L, 3L) |
| List of multiple types | Tuple | list(1L, TRUE, "foo") |
| Named list | Dict | list(a = 1L, b = 2.0), dict(x = x_data) |
| Matrix/Array | NumPy ndarray | matrix(c(1,2,3,4), nrow=2, ncol=2) |
| Data Frame | Pandas DataFrame | data.frame(x = c(1,2,3), y = c("a","b","c")) |
| Function | Python function | function(x) x + 1 |
| Raw | Python bytearray | as.raw(c(1:10)) |
| NULL, TRUE, FALSE | None, True, False | NULL, TRUE, FALSE |
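These conversions can be inspected directly with reticulate's r_to_py(), and reversed with py_to_r(); a small sketch:

r_to_py(42)                    # an R double becomes a Python float
r_to_py(42L)                   # an R integer becomes a Python int
r_to_py(data.frame(x = 1:3))   # an R data frame becomes a pandas DataFrame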
File python_functions.py:
source_python("python_functions.py", convert = FALSE)  # keep return values as Python objects
r_var <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
class(r_var)                          # an R matrix/array
r_var2 <- check_python_type(r_var)    # inside Python the matrix is a NumPy ndarray
class(r_var2)                         # with convert = FALSE it stays a Python object in R
r_var3 <- py_to_r(r_var2)             # convert back to R explicitly
class(r_var3)                         # an R matrix/array again
[1] "matrix" "array"
<class 'numpy.ndarray'>
[1] "numpy.ndarray" "python.builtin.object"
[1] "matrix" "array"
In R, the literal 42 is a floating-point number; in Python, 42 is an integer. Use R's L suffix (for example, 100L below) when a Python function expects an integer.

# Import scikit-learn's random forest classifier
sklearn <- import("sklearn.ensemble")
RandomForestClassifier <- sklearn$RandomForestClassifier
# Create a random forest classifier
clf <- RandomForestClassifier(n_estimators=100L)
# Training data (example)
X_train <- matrix(runif(1000), ncol=10)
y_train <- sample(c(0, 1), 100, replace=TRUE)
# Train the model
clf$fit(X_train, y_train)
# Predict on new data
X_test <- matrix(runif(200), ncol=10)
predictions <- clf$predict(X_test)
predictions
RandomForestClassifier()
[1] 0 0 1 1 1 0 0 1 0 0 1 1 1 0 1 0 1 0 1 1
# Load the ensembl_rest library
ensembl_rest <- import("ensembl_rest")
# Fetch gene information for a given gene ID
gene_info <- ensembl_rest$symbol_lookup(species='homo sapiens', symbol='BRCA2')
# Print gene information
gene_info$description
[1] "BRCA2 DNA repair associated [Source:HGNC Symbol;Acc:HGNC:1101]"
Text to summarize:
Hugging Face: Revolutionizing Natural Language Processing
In the rapidly evolving field of Natural Language Processing (NLP), Hugging Face has emerged as a prominent and innovative force. This article will explore the story and significance of Hugging Face, a company that has made remarkable contributions to NLP and AI as a whole. From its inception to its role in democratizing AI, Hugging Face has left an indelible mark on the industry.
Transformative Innovations: Hugging Face is best known for its open-source contributions, particularly the “Transformers” library. This library has become the de facto standard for NLP and enables researchers, developers, and organizations to easily access and utilize state-of-the-art pre-trained language models, such as BERT, GPT-3, and more. These models have countless applications, from chatbots and virtual assistants to language translation and sentiment analysis.
# Create a summarization pipeline via transformers (setup assumed; not shown in the original code)
transformers <- import("transformers")
summarizer <- transformers$pipeline("summarization")

# Example text to summarize
text <- "Hugging Face: Revolutionizing Natural Language Processing. In the rapidly evolving field of Natural Language Processing (NLP), Hugging Face has emerged as a prominent and innovative force. This article will explore the story and significance of Hugging Face, a company that has made remarkable contributions to NLP and AI as a whole. From its inception to its role in democratizing AI, Hugging Face has left an indelible mark on the industry. Transformative Innovations: Hugging Face is best known for its open-source contributions, particularly the Transformers library. This library has become the de facto standard for NLP and enables researchers, developers, and organizations to easily access and utilize state-of-the-art pre-trained language models, such as BERT, GPT-3, and more. These models have countless applications, from chatbots and virtual assistants to language translation and sentiment analysis. 1. Transformers Library: The Transformers library provides a unified interface for more than 50 pre-trained models, simplifying the development of NLP applications. It allows users to fine-tune these models for specific tasks, making it accessible to a wider audience. 2. Model Hub: Hugging Face's Model Hub is a treasure trove of pre-trained models, making it simple for anyone to access, experiment with, and fine-tune models. Researchers and developers around the world can collaborate and share their models through this platform. 3. Hugging Face Transformers Community: Hugging Face has fostered a vibrant online community where developers, researchers, and AI enthusiasts can share their knowledge, code, and insights. This collaborative spirit has accelerated the growth of NLP."
# Perform summarization
summary <- summarizer(text, max_length=100L, min_length=10L)
summary
[[1]]
[[1]]$summary_text
[1] "Hugging Face: Revolutionizing Natural Language Processing (NLP) has emerged as a prominent and innovative force . This article will explore the story and significance of Hugging face . The Transformers library has become the de facto standard for NLP and enables researchers, developers, and organizations to easily access and utilize state-of-the-art pre-trained language models ."
2024 • SciLifeLab • NBIS • RaukR