This is the accompanying exercise to the plotly lecture. Throughout the exercise we will use a very simple data set of daily temperature recordings in Uppsala covering the last 295 years. The exercise has two main parts. In the first part we make and modify simple plots. All code to generate these plots can be found on this page but, as with the other exercises, it is hidden by default. Click on the ‘Code’ button on the right side to show the code, but first try to solve the problems by looking at the lecture material or by visiting https://plot.ly/r/ for a more comprehensive overview.
The second part is more of a challenge, where you get a chance to think about how best to present the data using the plotly functionality. The main goal here is to generate a nice-looking animated plotly graphic. NB! No solution is offered for this part. Instead, we encourage you to check out the code examples at https://plotly-book.cpsievert.me, which include time series data and show how to generate nice graphics. We encourage you to alter parameters, try different kinds of plots, and make sure that colors, symbols, marker sizes, etc. are the way you prefer.
The plotly R package is open source and, once installed, can be used to generate any kind of plot. Some functionality, however, such as easy sharing of both plots and the underlying objects, is more easily available if you create an account at plotly. If you want to create an account, go to https://plot.ly/accounts/login/. Once you receive the confirmation email, make sure to validate your account and follow the instructions to pass your username and API key to R.
Once this is done you can, with the free tier, post graphics to plot.ly and share the plots and associated data with other users, who can view, modify and download the plots and data.
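Passing the credentials to R is typically done through environment variables. The following is a minimal sketch, assuming the placeholder values (which are made up here) are replaced with the username and API key from your own account. The upload call is left commented out because it needs valid credentials and network access, and the file name in it is hypothetical.

```r
# Placeholder credentials: replace with the values from your plotly account.
Sys.setenv("plotly_username" = "your_username")
Sys.setenv("plotly_api_key" = "your_api_key")

# Confirm that the variable is visible in the current R session.
Sys.getenv("plotly_username")

# With valid credentials, a plotly object p could then be uploaded using
# api_create() from the plotly package (hypothetical file name):
# api_create(p, filename = "uppsala-temperature")
```

Setting the variables this way only lasts for the current session. Storing them in your .Rprofile is one way to avoid retyping them.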
In this exercise a time series of daily temperature measurements from Uppsala will be used to try, in more depth, a few different ways of visualising data and to discover some of the modifications that are available when using the plotly library.
The data can be downloaded from the Swedish Meteorological and Hydrological Institute (SMHI) at http://www.smhi.se/klimatdata/meteorologi/temperatur/uppsalas-temperaturserie-1.2855 (the page is only in Swedish, but you just need to download the data to your computer). Unzip the file; the text file that accompanies the data contains a description (in English) of the data.
Import the data into R.
uppsala_t <- read.table("~/Downloads/uppsala_tm_1722-2017/uppsala_tm_1722-2017.dat")
head(uppsala_t)
Add column names and make sure the content is what you expect based on the information in the metadata.
colnames(uppsala_t) <- c("year", "month","day", "temp", "tempcorr", "place")
str(uppsala_t)
table(uppsala_t$place)
table(uppsala_t$month)
## 'data.frame': 108101 obs. of 6 variables:
## $ year : int 1722 1722 1722 1722 1722 1722 1722 1722 1722 1722 ...
## $ month : int 1 1 1 1 1 1 1 1 1 1 ...
## $ day : int 12 13 14 15 16 17 18 19 20 21 ...
## $ temp : num 1.9 2.3 1.8 0.9 -1.8 0.5 0.1 -1.8 0.5 1.8 ...
## $ tempcorr: num 1.8 2.2 1.7 0.8 -1.9 0.4 0 -1.9 0.4 1.6 ...
## $ place : int 1 1 1 1 1 1 1 1 1 1 ...
##
## 1 2 3 4 5 6
## 102942 2325 90 470 2122 152
##
## 1 2 3 4 5 6 7 8 9 10 11 12
## 9165 8360 9176 8880 9176 8880 9176 9176 8880 9176 8880 9176
As you can see from the table output, we now need to decide how to proceed with the data. Even though the file name suggests that all of it comes from Uppsala, there are some entries from other places and also a set of values that are interpolated from other data.
Before we decide whether to filter or retain data, let's have a look at the distributions of both temperature and corrected temperature in a histogram. A histogram is a fast and efficient way to look at a data set. So let's start by creating a plotly histogram for both the measured and corrected temperatures, after loading the libraries we need.
# data handling
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)
htemp <- plot_ly(uppsala_t, x = ~temp, type = "histogram")
hcorr <- plot_ly(uppsala_t, x = ~tempcorr, type = "histogram")
subplot(htemp, hcorr)
# Restrict both histograms to place == 1 and give each its own color and name
htemp <- plot_ly(uppsala_t[uppsala_t$place == 1, ], x = ~temp, type = "histogram", color = I("darkblue"), name = "Temperature")
# The second histogram shows the corrected temperature, so it maps tempcorr to x
hcorr <- plot_ly(uppsala_t[uppsala_t$place == 1, ], x = ~tempcorr, type = "histogram", color = I("tomato3"), name = "Corrected temperature") %>%
  layout(yaxis = list(title = "", zeroline = FALSE, showline = FALSE, showticklabels = FALSE, showgrid = TRUE))
subplot(htemp, hcorr)
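The bracket-based subsetting in the filtered histograms can also be written with dplyr, which some find easier to read. The sketch below uses a small made-up data frame with the same column names as uppsala_t, and it assumes, as the filtered histograms do, that place code 1 is the one to keep. Check the metadata file to confirm what each code means.

```r
library(dplyr)

# Toy data with the same columns as uppsala_t (all values are made up).
toy <- data.frame(
  year     = c(1722, 1722, 1722, 1722),
  month    = c(1, 1, 1, 1),
  day      = c(12, 13, 14, 15),
  temp     = c(1.9, 2.3, 1.8, 0.9),
  tempcorr = c(1.8, 2.2, 1.7, 0.8),
  place    = c(1, 1, 5, 6)
)

# Keep only the rows recorded at place 1 (assumed here to be the code to keep).
uppsala_only <- filter(toy, place == 1)

# Compare the number of rows before and after filtering.
nrow(toy)
nrow(uppsala_only)
```

Storing the filtered data in its own object means that later plots can reuse it instead of repeating the subsetting expression.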
Another way to visualise data is to create boxplots. Let's keep the same colors as for the histograms and add a nice title to the plot. Note that with long names we need to widen the left margin of the boxplot, or the names will be cut off. This is most easily done with the layout() function. Many of the arguments to layout(), such as margin, accept lists of values, so make sure to pass a list to the margin argument.
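plot_ly(uppsala_t) %>%
  # Same colors as in the histograms
  add_boxplot(x = ~temp, name = "Measured temperature", color = I("darkblue")) %>%
  add_boxplot(x = ~tempcorr, name = "Corrected temperature", color = I("tomato3")) %>%
  # margin takes a list; l is the left margin in pixels
  layout(title = "Daily temperature in Uppsala 1722-2017", margin = list(l = 150))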
A boxplot of the complete data is, however, not that informative, since it tells us nothing about patterns within a year or about differences between years. To look at the former we can create a boxplot for every month.
plot_ly(uppsala_t, y = ~temp, x = ~factor(month), type = "box")
The axis names in the former plot could describe the actual data better. One could also consider changing the month numbers to names, but let's skip that for now (or, even better, try to make this modification on your own).
xax <- list(type = "category",
title = "Month")
yax <- list(title = "Temperature")
plot_ly(uppsala_t, y = ~temp, x = ~factor(month), type = "box") %>%
  layout(xaxis = xax, yaxis = yax)
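For the optional month-name modification mentioned earlier, one possible approach is to give the factor explicit labels. This is only a sketch on a made-up vector of month numbers. It relies on month.abb, the built-in base R constant holding the abbreviated English month names.

```r
# Made-up month numbers standing in for uppsala_t$month.
month_num <- c(1, 2, 12, 7)

# Map the numbers 1..12 to abbreviated month names, keeping calendar order.
month_name <- factor(month_num, levels = 1:12, labels = month.abb)

# The levels are in calendar order, not alphabetical order.
levels(month_name)
as.character(month_name)
```

In the boxplot call, the same idea would amount to using x = ~factor(month, levels = 1:12, labels = month.abb) in place of x = ~factor(month).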
To look at differences between years, we first create a new data set that retains a single mean temperature for every month of every year.
# funs() is deprecated in recent dplyr versions; a named summarise() gives the same result
ut <- uppsala_t %>%
  group_by(year, month) %>%
  summarise(temp = mean(temp, na.rm = TRUE))
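It is worth noting that the summarised data is still grouped by year, because summarising removes only the last grouping variable. plotly respects dplyr groups when drawing lines, so this is what gives one line per year rather than a single line through all points. The sketch below illustrates the grouping on a small made-up data frame.

```r
library(dplyr)

# Made-up daily values for two years and two months.
toy <- data.frame(
  year  = c(1722, 1722, 1722, 1723, 1723, 1723),
  month = c(1, 1, 2, 1, 1, 2),
  temp  = c(1.0, 3.0, -2.0, 0.0, NA, 4.0)
)

# One mean per year and month; missing values are ignored.
toy_means <- toy %>%
  group_by(year, month) %>%
  summarise(temp = mean(temp, na.rm = TRUE))

# Only the last grouping variable (month) is dropped by summarise().
group_vars(toy_means)
toy_means$temp
```

If the grouping were removed with ungroup() before plotting, the yearly lines would be joined into one continuous trace.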
Then we can plot the average per month and highlight a year by adding a line in a different color.
p <- plot_ly(ut, x = ~month, y = ~temp)
p %>%
  # All years as faint grey background lines
  add_lines(alpha = 0.2, name = "1722-2017", hoverinfo = "none", color = I("grey")) %>%
  # Highlight the first and last year of the series
  add_lines(name = "1722", data = filter(ut, year == 1722)) %>%
  add_lines(name = "2017", data = filter(ut, year == 2017), color = I("tomato3"))
For the last part, try to create a useful and nice-looking animation of the temperature data. Make sure that the axis names reflect the data and add a nice title to the plot. To really put your signature on the animation, pick a striking color (or multiple colors).
sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] bindrcpp_0.2.2 crosstalk_1.0.0 leaflet_2.0.0
## [4] networkD3_0.4 dygraphs_1.1.1.4 ggiraph_0.4.2
## [7] plotly_4.7.1 highcharter_0.5.0 DT_0.4
## [10] formattable_0.2.0.1 kableExtra_0.9.0 forcats_0.3.0
## [13] stringr_1.3.1 dplyr_0.7.5 purrr_0.2.5
## [16] readr_1.1.1 tidyr_0.8.1 tibble_1.4.2
## [19] ggplot2_2.2.1.9000 tidyverse_1.2.1 captioner_2.2.3
## [22] bookdown_0.7 knitr_1.20
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-137 xts_0.10-2 lubridate_1.7.4
## [4] httr_1.3.1 rprojroot_1.3-2 tools_3.5.0
## [7] backports_1.1.2 R6_2.2.2 lazyeval_0.2.1
## [10] colorspace_1.3-2 withr_2.1.2 tidyselect_0.2.4
## [13] mnormt_1.5-5 curl_3.2 compiler_3.5.0
## [16] cli_1.0.0 rvest_0.3.2 xml2_1.2.0
## [19] officer_0.3.0 scales_0.5.0 psych_1.8.4
## [22] digest_0.6.15 foreign_0.8-70 rmarkdown_1.9
## [25] R.utils_2.6.0 base64enc_0.1-3 pkgconfig_2.0.1
## [28] htmltools_0.3.6 rvg_0.1.8 htmlwidgets_1.2
## [31] rlang_0.2.1 readxl_1.1.0 TTR_0.23-3
## [34] rstudioapi_0.7 quantmod_0.4-13 shiny_1.1.0
## [37] bindr_0.1.1 zoo_1.8-1 jsonlite_1.5
## [40] zip_1.0.0 R.oo_1.22.0 magrittr_1.5
## [43] rlist_0.4.6.1 Rcpp_0.12.17 munsell_0.4.3
## [46] gdtools_0.1.7 R.methodsS3_1.7.1 stringi_1.2.2
## [49] yaml_2.1.19 plyr_1.8.4 grid_3.5.0
## [52] promises_1.0.1 parallel_3.5.0 crayon_1.3.4
## [55] lattice_0.20-35 haven_1.1.1 hms_0.4.2
## [58] pillar_1.2.3 igraph_1.2.1 uuid_0.1-2
## [61] reshape2_1.4.3 glue_1.2.0 evaluate_0.10.1
## [64] data.table_1.11.4 modelr_0.1.2 httpuv_1.4.3
## [67] cellranger_1.1.0 gtable_0.2.0 assertthat_0.2.0
## [70] xfun_0.1 mime_0.5 xtable_1.8-2
## [73] broom_0.4.4 later_0.7.2 viridisLite_0.3.0
Page built on: 10-Jun-2018 at 07:56:36.
2018 | SciLifeLab > NBIS > RaukR