This is the accompanying exercise to the plotly lecture. Throughout the exercise we will use a very simple data set of daily temperature recordings from Uppsala covering the last 295 years. The exercise has two main parts. In the first part we make and modify simple plots. All code needed to generate these plots can be found on this page but, as with the other exercises, it is hidden by default. Click on the ‘Code’ button on the right side to show the code, but first try to solve the problems yourself by looking at the lecture material or by visiting https://plot.ly/r/ for a more comprehensive overview.

The second part is more of a challenge, where you get a chance to think about how best to present the data using the plotly functionality. The main goal here is to generate a nice-looking animated plotly graphic. NB! No solution is offered for this part. Instead we encourage you to check out the code examples at https://plotly-book.cpsievert.me, which include time series data and show how to generate nice graphics. We encourage you to alter parameters, try different kinds of plots, and make sure that colors, symbols, marker sizes etc. are the way you prefer.


1 Introduction to plotly

The plotly R package is open source and, once installed, can be used to generate any kind of plot. Some functionality, however, such as easy sharing of plots and their underlying objects, is more easily available if you create an account at plotly. If you want to create an account, go to https://plot.ly/accounts/login/. Once you receive the confirmation email, make sure to validate your account and follow the instructions to pass your username and API key to R.

Once this is done, the free tier lets you post graphics to plot.ly and share the plots and associated data with other users, who can view, modify and download both plots and data.
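As a minimal sketch of the credential step, assuming the environment-variable approach described in the plotly R documentation: the username and key below are placeholders, and the upload call is left commented out because it needs a real account and network access.

```r
# Placeholder credentials (hypothetical values): replace with your own
Sys.setenv("plotly_username" = "your_plotly_username")
Sys.setenv("plotly_api_key" = "your_api_key")

# Confirm that both variables are visible in the current R session
Sys.getenv(c("plotly_username", "plotly_api_key"))

# With valid credentials a plot object could then be uploaded, for example:
# library(plotly)
# p <- plot_ly(x = 1:10, y = (1:10)^2, type = "scatter", mode = "lines")
# api_create(p, filename = "my-first-plot")  # "my-first-plot" is an arbitrary name
```

Setting the variables with `Sys.setenv()` only lasts for the current session; storing them in your `.Rprofile` or `.Renviron` is one way to make them persistent.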

2 Temperature data set

In this exercise a time series of daily temperature measurements from Uppsala will be used to try, in more depth, a few different ways of visualising data and to discover some of the modifications that are available when using the plotly library.

The data can be downloaded from the Swedish Meteorological and Hydrological Institute (SMHI) at this page: http://www.smhi.se/klimatdata/meteorologi/temperatur/uppsalas-temperaturserie-1.2855 (the page is only in Swedish, but you just need to download the data to your computer). Unzip the file; the text file that accompanies the data contains a description (in English) of the data.

2.1 Import data and arrange data

Import the data into R.

uppsala_t <- read.table("~/Downloads/uppsala_tm_1722-2017/uppsala_tm_1722-2017.dat")
head(uppsala_t)

2.2 Sanity check

Add column names and make sure the content is what you expect based on the information in the metadata.

colnames(uppsala_t) <- c("year", "month","day", "temp", "tempcorr", "place")
str(uppsala_t)
table(uppsala_t$place)
table(uppsala_t$month)
## 'data.frame':    108101 obs. of  6 variables:
##  $ year    : int  1722 1722 1722 1722 1722 1722 1722 1722 1722 1722 ...
##  $ month   : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ day     : int  12 13 14 15 16 17 18 19 20 21 ...
##  $ temp    : num  1.9 2.3 1.8 0.9 -1.8 0.5 0.1 -1.8 0.5 1.8 ...
##  $ tempcorr: num  1.8 2.2 1.7 0.8 -1.9 0.4 0 -1.9 0.4 1.6 ...
##  $ place   : int  1 1 1 1 1 1 1 1 1 1 ...
## 
##      1      2      3      4      5      6 
## 102942   2325     90    470   2122    152 
## 
##    1    2    3    4    5    6    7    8    9   10   11   12 
## 9165 8360 9176 8880 9176 8880 9176 9176 8880 9176 8880 9176

As you can see from the table output, we now need to decide how to proceed with the data. Even though the name of the data set suggests that everything comes from Uppsala, there are some entries from other places and also a set of values that are interpolated from other data.
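One way to put numbers on that decision is to compute the share of rows recorded outside the main series and to check for missing values or impossible dates. The sketch below uses a small made-up data frame, `toy_t`, with the same column names (the values are synthetic, not SMHI data) and assumes, as the text does, that place code 1 denotes Uppsala.

```r
# Synthetic stand-in for uppsala_t (made-up values, for illustration only)
toy_t <- data.frame(
  year     = c(1722, 1722, 1722, 1723, 1723),
  month    = c(1, 1, 2, 1, 2),
  day      = c(12, 13, 1, 1, 1),
  temp     = c(1.9, 2.3, -0.5, 0.7, -3.1),
  tempcorr = c(1.8, 2.2, -0.6, 0.6, -3.2),
  place    = c(1, 1, 2, 1, 6)
)

# Counts and proportions per place code
place_counts <- table(toy_t$place)
round(prop.table(place_counts), 3)

# Share of rows not recorded at place 1
other_share <- mean(toy_t$place != 1)
other_share

# Missing values per column
colSums(is.na(toy_t))

# Impossible calendar dates (e.g. 30 February) become NA when parsed
dates <- as.Date(sprintf("%d-%02d-%02d", toy_t$year, toy_t$month, toy_t$day),
                 format = "%Y-%m-%d")
anyNA(dates)
```

Applied to the real data, the counts in the table above give 5159 of 108101 rows from other place codes, or about 4.8%.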

3 A first look at the data

Before we decide whether to filter or retain data, let's have a look at the distributions of both temperature and corrected temperature in a histogram. A histogram is a fast and efficient way to look at a data set. So let's start by loading the libraries we need and then creating a plotly histogram for both the measured and the corrected temperatures.

# data handling
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)

htemp <- plot_ly(uppsala_t, x = ~temp, type = "histogram")
hcorr <- plot_ly(uppsala_t, x = ~tempcorr, type = "histogram")
subplot(htemp, hcorr)

With the default settings it looks okay, but it can be improved by adding better names, changing the colors and removing the y-axis from the second plot. In addition we can filter out the non-Uppsala entries to see whether this has any impact on the general distribution.

# First panel: measured temperature, Uppsala entries only (place == 1)
htemp <- plot_ly(uppsala_t[uppsala_t$place == 1, ], x = ~temp, type = "histogram", color = I("darkblue"), name = "Temperature")
# Second panel: corrected temperature (tempcorr), with the y-axis labels hidden
hcorr <- plot_ly(uppsala_t[uppsala_t$place == 1, ], x = ~tempcorr, type = "histogram", color = I("tomato3"), name = "Corrected temperature") %>%
  layout(yaxis = list(title = "", zeroline = FALSE, showline = FALSE, showticklabels = FALSE, showgrid = TRUE))
subplot(htemp, hcorr)

The distribution looks roughly like what one would expect from this part of Sweden. As one might suspect, given that only 5,159 of the 108,101 entries come from localities other than Uppsala, there is in essence no reason to filter these out.
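The visual impression can be backed up with a simple numerical comparison of quantiles with and without the filter. This is only a sketch on randomly generated stand-in data (`toy_t` is hypothetical); on the real data you would substitute `uppsala_t`.

```r
# Synthetic stand-in for uppsala_t (random values, for illustration only)
set.seed(1)
toy_t <- data.frame(
  temp  = rnorm(1000, mean = 5, sd = 9),
  place = sample(c(1, 2), size = 1000, replace = TRUE, prob = c(0.95, 0.05))
)

probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# Quantiles for all rows versus rows from place 1 only
q_all     <- quantile(toy_t$temp, probs = probs)
q_place_1 <- quantile(toy_t$temp[toy_t$place == 1], probs = probs)

# Side-by-side comparison; small differences suggest filtering changes little
round(rbind(all = q_all, place_1 = q_place_1, difference = q_all - q_place_1), 2)
```

If the differences are small relative to the spread of the data, keeping all rows is a defensible choice.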

3.0.1 Boxplot

Another way to visualise data is to create a boxplot. Let's keep the same colors as for the histograms and add a nice title to the plot. Note that with long names we need to increase the left margin of the boxplot, or the names will be cut off. This is most easily done with the layout() function. Many of the arguments in layout(), such as the margin argument, accept lists of values, so make sure to pass a list to the margin argument.

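plot_ly(uppsala_t) %>%
  # same colors as in the histograms
  add_boxplot(x = ~temp, name = "Measured temperature", color = I("darkblue")) %>%
  add_boxplot(x = ~tempcorr, name = "Corrected temperature", color = I("tomato3")) %>%
  # margin takes a list; l = 150 widens the left margin so the names fit
  layout(title = "Daily temperature in Uppsala 1722-2017", margin = list(l = 150))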

3.0.2 Boxplot continued

A boxplot of the complete data is however not that informative, since it tells us nothing about patterns within a year or about differences between years. To look at the former we can create a boxplot for every month.

plot_ly(uppsala_t, y = ~temp, x = ~factor(month), type = "box")

3.0.3 Boxplot continued

The axis names in the previous plot could describe the actual data better. One could also consider changing the month numbers to names, but let's skip that for now (or, even better, try to make this modification on your own).

# Axis settings are passed to layout() as lists
xax <- list(type = "category", title = "Month")
yax <- list(title = "Temperature")

plot_ly(uppsala_t, y = ~temp, x = ~factor(month), type = "box") %>%
  layout(xaxis = xax, yaxis = yax)
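For the optional modification mentioned above, one possible approach (a sketch, not the official solution) is to map the month numbers to the built-in `month.abb` vector and fix the factor levels so that the months keep calendar order. The data frame `toy_t` and the column `month_name` are hypothetical stand-ins with synthetic values.

```r
library(plotly)

# Synthetic stand-in for uppsala_t (made-up values, for illustration only)
set.seed(1)
monthly_means <- c(-4, -4, 0, 5, 11, 15, 17, 16, 11, 6, 1, -2)
toy_t <- data.frame(
  month = rep(1:12, each = 20),
  temp  = rnorm(240, mean = rep(monthly_means, each = 20), sd = 3)
)

# month.abb is a built-in character vector: "Jan", "Feb", ..., "Dec".
# Setting the levels explicitly avoids alphabetical ordering of the months.
toy_t$month_name <- factor(month.abb[toy_t$month], levels = month.abb)

p_month <- plot_ly(toy_t, y = ~temp, x = ~month_name, type = "box") %>%
  layout(xaxis = list(title = "Month"), yaxis = list(title = "Temperature"))
p_month
```

Using `month.name` instead of `month.abb` would give full month names, at the cost of longer tick labels.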

3.0.4 Line plot with years highlighted

First we create a new data set retaining a single mean temperature for every month.

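# Mean temperature per year and month.
# The result stays grouped by year, which plotly uses to draw one line per year.
ut <- uppsala_t %>%
  group_by(year, month) %>%
  summarise(temp = mean(temp, na.rm = TRUE))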

Then we can plot the average per month and highlight a year by adding a line in a different color.

plot_ly(ut, x = ~month, y = ~temp) %>%
  # all years as faint grey background lines
  add_lines(alpha = 0.2, name = "1722-2017", hoverinfo = "none", color = I("grey")) %>%
  # highlighted years drawn on top
  add_lines(name = "1722", data = filter(ut, year == 1722)) %>%
  add_lines(name = "2017", data = filter(ut, year == 2017), color = I("tomato3"))

4 Animated graphics

For the last part, try to create a useful and nice-looking graphical animation of the temperature data. Make sure that the axis names reflect the data and add a nice title to the plot. To really put your signature on the animation, pick a striking color (or multiple colors).
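To get started, here is a minimal sketch of the animation mechanism only, not a solution to the challenge. It assumes the `frame` argument of plot_ly() together with animation_opts() and animation_slider(), and it runs on a tiny synthetic data set (`toy_m` is hypothetical) so that it works without the download.

```r
library(plotly)

# Synthetic monthly means for three made-up years (illustration only)
set.seed(1)
monthly_means <- c(-4, -4, 0, 5, 11, 15, 17, 16, 11, 6, 1, -2)
toy_m <- expand.grid(month = 1:12, year = 2001:2003)  # month varies fastest
toy_m$temp <- round(rep(monthly_means, times = 3) + rnorm(36, sd = 1.5), 1)

# frame = ~year turns each year into one animation frame
p_anim <- plot_ly(toy_m, x = ~month, y = ~temp, frame = ~year,
                  type = "scatter", mode = "lines+markers",
                  color = I("tomato3")) %>%
  layout(title = "Monthly mean temperature (synthetic data)",
         xaxis = list(title = "Month"),
         yaxis = list(title = "Temperature")) %>%
  # frame and transition durations are given in milliseconds
  animation_opts(frame = 800, transition = 400) %>%
  animation_slider(currentvalue = list(prefix = "Year: "))
p_anim
```

With the real data, the monthly summary `ut` could take the place of `toy_m`; which variable to animate over, and which plot type to use, is the part left for you to explore.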

5 Session Info

  • This document has been created in RStudio using R Markdown and related packages.
  • For R Markdown, see http://rmarkdown.rstudio.com
  • For details about the OS, packages and versions, see detailed information below:
sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] bindrcpp_0.2.2      crosstalk_1.0.0     leaflet_2.0.0      
##  [4] networkD3_0.4       dygraphs_1.1.1.4    ggiraph_0.4.2      
##  [7] plotly_4.7.1        highcharter_0.5.0   DT_0.4             
## [10] formattable_0.2.0.1 kableExtra_0.9.0    forcats_0.3.0      
## [13] stringr_1.3.1       dplyr_0.7.5         purrr_0.2.5        
## [16] readr_1.1.1         tidyr_0.8.1         tibble_1.4.2       
## [19] ggplot2_2.2.1.9000  tidyverse_1.2.1     captioner_2.2.3    
## [22] bookdown_0.7        knitr_1.20         
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-137      xts_0.10-2        lubridate_1.7.4  
##  [4] httr_1.3.1        rprojroot_1.3-2   tools_3.5.0      
##  [7] backports_1.1.2   R6_2.2.2          lazyeval_0.2.1   
## [10] colorspace_1.3-2  withr_2.1.2       tidyselect_0.2.4 
## [13] mnormt_1.5-5      curl_3.2          compiler_3.5.0   
## [16] cli_1.0.0         rvest_0.3.2       xml2_1.2.0       
## [19] officer_0.3.0     scales_0.5.0      psych_1.8.4      
## [22] digest_0.6.15     foreign_0.8-70    rmarkdown_1.9    
## [25] R.utils_2.6.0     base64enc_0.1-3   pkgconfig_2.0.1  
## [28] htmltools_0.3.6   rvg_0.1.8         htmlwidgets_1.2  
## [31] rlang_0.2.1       readxl_1.1.0      TTR_0.23-3       
## [34] rstudioapi_0.7    quantmod_0.4-13   shiny_1.1.0      
## [37] bindr_0.1.1       zoo_1.8-1         jsonlite_1.5     
## [40] zip_1.0.0         R.oo_1.22.0       magrittr_1.5     
## [43] rlist_0.4.6.1     Rcpp_0.12.17      munsell_0.4.3    
## [46] gdtools_0.1.7     R.methodsS3_1.7.1 stringi_1.2.2    
## [49] yaml_2.1.19       plyr_1.8.4        grid_3.5.0       
## [52] promises_1.0.1    parallel_3.5.0    crayon_1.3.4     
## [55] lattice_0.20-35   haven_1.1.1       hms_0.4.2        
## [58] pillar_1.2.3      igraph_1.2.1      uuid_0.1-2       
## [61] reshape2_1.4.3    glue_1.2.0        evaluate_0.10.1  
## [64] data.table_1.11.4 modelr_0.1.2      httpuv_1.4.3     
## [67] cellranger_1.1.0  gtable_0.2.0      assertthat_0.2.0 
## [70] xfun_0.1          mime_0.5          xtable_1.8-2     
## [73] broom_0.4.4       later_0.7.2       viridisLite_0.3.0

Page built on: 10-Jun-2018 at 07:56:36.


2018 | SciLifeLab > NBIS > RaukR