class: center, middle, inverse, title-slide .title[ # Introduction to R ] .subtitle[ ## Workshop on Data Visualization in R ] .author[ ###
Lokesh Mano
• 02-Feb-2024 ]
.institute[
### NBIS, SciLifeLab
]

---
exclude: true
count: false

<link href="https://fonts.googleapis.com/css?family=Roboto|Source+Sans+Pro:300,400,600|Ubuntu+Mono&subset=latin-ext" rel="stylesheet">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.3.1/css/all.css" integrity="sha384-mzrmE5qonljUremFsqc01SB46JvROS7bZs3IO2EmfFsd15uHvIt+Y8vEf7N7fWAU" crossorigin="anonymous">

---
name: content
class: spaced

## Contents

* [Course and webpage](#demo)
* [Overview of R](#r-intro)
* [Data formats](#data)
* [Data frames](#data-frame)
* [Important functions](#func)
* [Tips](#tips)

---
name: demo

## Quick checkups

.pull-center[
<img src="assets/images/inst_check.png" alt="drawing" width="500"/>
]

--

* Coffee breaks (is 15 minutes fine?)
* Webpage structure
* Plots from drop-down
* Times mentioned in the schedule are **super** arbitrary

---
name: r-intro

## R

* Derived from a statistical programming language called **S**
* You can write your own functions
* Powerful and flexible.
* Available for all platforms

--

* `GUI` with **RStudio**

--

* **RMarkdown**: Embedding code and results together

--

.pull-center[
<img src="assets/images/free.png" alt="drawing" width="300"/>
]

---
name: data

## Data Formats

--

- Wide format

<table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;font-weight: bold;color: blue !important;"> </th> <th style="text-align:right;font-weight: bold;color: blue !important;"> Sample_1 </th> <th style="text-align:right;font-weight: bold;color: blue !important;"> Sample_2 </th> <th style="text-align:right;font-weight: bold;color: blue !important;"> Sample_3 </th> <th style="text-align:right;font-weight: bold;color: blue !important;"> Sample_4 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: orange !important;color: red !important;"> ENSG00000000003 </td> <td style="text-align:right;color: orange !important;"> 321 </td> <td style="text-align:right;color: orange !important;"> 303 </td> <td style="text-align:right;color: orange !important;"> 204 </td> <td style="text-align:right;color: orange !important;"> 492 </td> </tr> <tr> <td style="text-align:left;color: orange !important;color: red !important;"> ENSG00000000005 </td> <td style="text-align:right;color: orange !important;"> 0 </td> <td style="text-align:right;color: orange !important;"> 0 </td> <td style="text-align:right;color: orange !important;"> 0 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> <tr> <td style="text-align:left;color: orange !important;color: red !important;"> ENSG00000000419 </td> <td style="text-align:right;color: orange !important;"> 696 </td> <td style="text-align:right;color: orange !important;"> 660 </td> <td style="text-align:right;color: orange !important;"> 472 </td> <td style="text-align:right;color: orange !important;"> 951 </td> </tr> <tr> <td style="text-align:left;color: orange !important;color: red 
!important;"> ENSG00000000457 </td> <td style="text-align:right;color: orange !important;"> 59 </td> <td style="text-align:right;color: orange !important;"> 54 </td> <td style="text-align:right;color: orange !important;"> 44 </td> <td style="text-align:right;color: orange !important;"> 109 </td> </tr> <tr> <td style="text-align:left;color: orange !important;color: red !important;"> ENSG00000000460 </td> <td style="text-align:right;color: orange !important;"> 399 </td> <td style="text-align:right;color: orange !important;"> 405 </td> <td style="text-align:right;color: orange !important;"> 236 </td> <td style="text-align:right;color: orange !important;"> 445 </td> </tr> <tr> <td style="text-align:left;color: orange !important;color: red !important;"> ENSG00000000938 </td> <td style="text-align:right;color: orange !important;"> 0 </td> <td style="text-align:right;color: orange !important;"> 0 </td> <td style="text-align:right;color: orange !important;"> 0 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> </tbody> </table>

--

* Familiar layout
* Convenient to read
* You see more data at a glance

---
name: data-2

## Data Formats

- Long format

--

<table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Sample_ID </th> <th style="text-align:left;"> Gene </th> <th style="text-align:right;"> count </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: red !important;"> ENSG00000000003 </td> <td style="text-align:right;color: orange !important;"> 321 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: red !important;"> ENSG00000000005 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: red !important;"> 
ENSG00000000419 </td> <td style="text-align:right;color: orange !important;"> 696 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: red !important;"> ENSG00000000457 </td> <td style="text-align:right;color: orange !important;"> 59 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: red !important;"> ENSG00000000460 </td> <td style="text-align:right;color: orange !important;"> 399 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: red !important;"> ENSG00000000938 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> </tbody> </table> -- <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Sample_ID </th> <th style="text-align:left;"> Sample_Name </th> <th style="text-align:left;"> Time </th> <th style="text-align:left;"> Replicate </th> <th style="text-align:left;"> Cell </th> <th style="text-align:left;"> Gene </th> <th style="text-align:right;"> count </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000003 </td> <td style="text-align:right;color: orange !important;"> 321 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue 
!important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000005 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000419 </td> <td style="text-align:right;color: orange !important;"> 696 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000457 </td> <td style="text-align:right;color: orange !important;"> 59 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000460 </td> <td style="text-align:right;color: orange !important;"> 399 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000938 </td> <td 
style="text-align:right;color: orange !important;"> 0 </td> </tr> </tbody> </table> --- name: data-3 ## Data Formats - Long format <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Sample_ID </th> <th style="text-align:left;"> Sample_Name </th> <th style="text-align:left;"> Time </th> <th style="text-align:left;"> Replicate </th> <th style="text-align:left;"> Cell </th> <th style="text-align:left;"> Gene </th> <th style="text-align:right;"> count </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000003 </td> <td style="text-align:right;color: orange !important;"> 321 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000005 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000419 </td> <td style="text-align:right;color: orange !important;"> 696 </td> </tr> <tr> <td 
style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000457 </td> <td style="text-align:right;color: orange !important;"> 59 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000460 </td> <td style="text-align:right;color: orange !important;"> 399 </td> </tr> <tr> <td style="text-align:left;color: blue !important;"> Sample_1 </td> <td style="text-align:left;color: blue !important;"> t0_A </td> <td style="text-align:left;color: blue !important;"> t0 </td> <td style="text-align:left;color: blue !important;"> A </td> <td style="text-align:left;color: blue !important;"> A431 </td> <td style="text-align:left;color: red !important;"> ENSG00000000938 </td> <td style="text-align:right;color: orange !important;"> 0 </td> </tr> </tbody> </table>

--

* Easier to add new data to an existing table
* Most databases store and maintain data in long format because it is efficient
* R tools **like ggplot** expect data in long format.

---
name: data-frame

## Data Frames

- Let us take a quick look at `data.frame` in `R`:

.pull-center[
<img src="assets/images/df.png" alt="drawing" width="600"/>
]

* Imported files are usually read in as a `data.frame`
* Matrix-like table with `row.names` and `colnames`, where each column can hold a different type
* Probably the most used data type in biology!
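
As a minimal sketch of the points above, the wide count table from the earlier slide can be built by hand as a `data.frame` and reshaped into long format using base R only. The object names `counts_wide` and `counts_long` are made up for this illustration, and only the first three genes are included to keep it short.

```r
# Wide format: one row per gene, one column per sample.
# Values are copied from the wide-format slide; object names are illustrative.
counts_wide <- data.frame(
  Sample_1 = c(321, 0, 696),
  Sample_2 = c(303, 0, 660),
  Sample_3 = c(204, 0, 472),
  Sample_4 = c(492, 0, 951),
  row.names = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419")
)

class(counts_wide)     # type of the object
rownames(counts_wide)  # gene IDs
colnames(counts_wide)  # sample IDs
dim(counts_wide)       # number of rows and columns

# Long format: one row per gene/sample combination.
# unlist() walks the data frame column by column, so each sample ID is
# repeated once per gene and the gene IDs are recycled once per sample.
counts_long <- data.frame(
  Sample_ID = rep(colnames(counts_wide), each = nrow(counts_wide)),
  Gene      = rep(rownames(counts_wide), times = ncol(counts_wide)),
  count     = unlist(counts_wide, use.names = FALSE)
)

head(counts_long)
```

The resulting table has the same three columns (`Sample_ID`, `Gene`, `count`) as the long-format slide. Packages such as tidyr provide `pivot_longer()` for the same reshaping; the base R version here is only meant to show what happens to the rows and columns.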
---
name: func

## Vectors

```r
n <- c(2, 3, 4, 2, 1, 2, 4, 5, 10, 11, 8, 9)
print(n)
```

```
##  [1]  2  3  4  2  1  2  4  5 10 11  8  9
```

--

```r
z <- n + 3
print(z)
```

```
##  [1]  5  6  7  5  4  5  7  8 13 14 11 12
```

--

```r
z <- n + 3
mean(z)
```

```
## [1] 8.083333
```

--

```r
s <- c("I", "love", "Batman")
print(s)
```

```
## [1] "I"      "love"   "Batman"
```

---
name: vec-typ

## Vector types

* `int` stands for *integers*
* `dbl` stands for *doubles* or real numbers
* `chr` stands for *character* vectors or strings
* `dttm` stands for *date and time*
* `lgl` stands for *logical* with just TRUE or FALSE
* `fct` stands for *factors*, which R uses to represent categorical variables
* `date` stands for *dates*

You can check what kind of vector you have created or imported with the function `class()`

---
name: tips

## Important tips

- `?` and `??`
    * `?` opens the help manual for a particular function
    * `??` searches the help pages of all installed packages for a term
    * `vignette("ggplot2")` opens a package's long-form guide

--

- TAB completion
    * Probably the most useful way to avoid unnecessary error messages (and/or frustration)!

--

- Case sensitive

```r
print(N)
```

```
## Error in print(N): object 'N' not found
```

```r
print(n)
```

```
##  [1]  2  3  4  2  1  2  4  5 10 11  8  9
```

---
name: end_slide
class: end-slide, middle
count: false

# Thank you. Questions?

.end-text[
<p>R version 4.1.3 (2022-03-10)<br><p>Platform: x86_64-pc-linux-gnu (64-bit)</p><p>OS: Ubuntu 22.04.3 LTS</p><br>

<hr>

<span class="small">Built on : <i class='fa fa-calendar' aria-hidden='true'></i> 02-Feb-2024 at <i class='fa fa-clock-o' aria-hidden='true'></i> 11:06:41</span>

<b>2024</b> • [SciLifeLab](https://www.scilifelab.se/) • [NBIS](https://nbis.se/)
]