To do this exercise, we will use a separate dataset and it can be downloaded here.
If you would like to keep the data tree same as the exercises here, you can extract the zip file inside the data directory. So, the file tree would like this:
Before we go into ggmap
package and its options, let us take a quick look about some of the other options available for making maps in R.
Here we will see, how some of the traditional R packages maps
and mapdata
were used and how we could implement the information from these packages into ggplot
. Some pointers about these packages are:
Let us start with how to install these packages and look at their simple functions:
if (!requireNamespace("maps", quietly = TRUE)){
install.packages("maps", dependencies = TRUE)
}
if (!requireNamespace("mapdata", quietly = TRUE)){
install.packages("mapdata", dependencies = TRUE)
}
library(maps)
library(mapdata)
library(ggplot2)
library(dplyr)
library(reshape2)
library(tidyverse)
world <- map_data("world")
head(world)
output
## long lat group order region subregion
## 1 -69.89912 12.45200 1 1 Aruba <NA>
## 2 -69.89571 12.42300 1 2 Aruba <NA>
## 3 -69.94219 12.43853 1 3 Aruba <NA>
## 4 -70.00415 12.50049 1 4 Aruba <NA>
## 5 -70.06612 12.54697 1 5 Aruba <NA>
## 6 -70.05088 12.59707 1 6 Aruba <NA>
The structure of this data-set could be explained as follows:
long
is longitude. Things to the west of the prime meridian are negative.lat
is latitude.order
: This just shows in which order ggplot should “connect the dots”region
and subregion
tell what region or subregion a set of points surrounds.group
: This is very important! ggplot2’s functions can take a group argument which controls (amongst other things) whether adjacent points should be connected by lines. If they are in the same group, then they get connected, but if they are in different groups then they don’t.
To plot the map, in a pretty way:
gg1 <- ggplot(data = world, aes(x=long, y = lat)) +
geom_polygon(aes(group = group)) +
theme(legend.position = "none")
gg1
plot
Note New Zealand was not missed in this map ;)
geom_polygon()
drawn lines between points and “closes them up” (i.e. draws a line from the last point back to the first point)You can test it out how the data looks like without using the group aesthetic
gg1 <- ggplot(data = world, aes(x=long, y = lat)) +
geom_polygon() +
theme(legend.position = "none")
gg1
From the dataset we have, we can also draw the border lines representing the different countries based on the variable region
.
gg2 <- ggplot(data = world, aes(x=long, y = lat)) +
geom_polygon(aes(group = group, fill = region)) +
theme(legend.position = "none")
gg2
plot
Now, in this huge map, let us try to add some points like our favourite city Lund and few others.
labs <- data.frame(
long = c(13.1629767,11.7537556,16.3100205,80.0689272),
lat = c(55.7068448,57.7010982,48.220778,13.0478223),
names = c("LUND", "GBG", "Vienna", "Chennai"),
stringsAsFactors = FALSE
)
gg2 +
geom_point(data = labs, aes(x = long, y = lat), color = "black", size = 2)
plot
Note You can also import the points from a file and store in a data frame.
Unlike in normal ggplots, to zoom into a section of the map, if we try to use the xlim()
and ylim()
functions:
gg2 +
geom_point(data = labs, aes(x = long, y = lat), color = "black", size = 2) +
xlim(c(-30,40)) +
ylim(c(50,85))
plot
You can see that this subsection of the map is not very good, you can notice it especially in the area of Germany and Poland. This is because the latitude and longtitude does not really work like axes on a plot and the structure of the data-set we have here doesn’t support it that way.
We will go through a function in ggmap
to get one of the nicer ways to get a subsection in those measures. But, with the data we have here, it is possible get a subset of countries:
scandic <- subset(world, region %in% c("Sweden", "Denmark", "Norway", "Finland", "Iceland", "Faroe Islands"))
gg_scan <- ggplot(data = scandic, aes(x=long, y=lat)) +
geom_polygon(aes(group = group, fill = region))
gg_scan
plot
Let’s see how we can add a continous data like the World population statistics on to the above map. We can do this by importing the file with the population information and join it to the data set, we have with left_join()
pop_info <- read.table("data/ggmap_data/world_population.tsv", header = T, row.names = 1, sep = "\t")
head(pop_info)
output
## region PopMale PopFemale PopTotal PopDensity
## 1 Afghanistan 19976.26 18952.08 38928.34 59.627
## 2 Africa 669878.34 670719.77 1340598.11 45.216
## 3 African Group 668995.56 669831.03 1338826.59 45.571
## 4 African Union 669307.82 670116.10 1339423.92 45.182
## 5 African Union: Central Africa 79260.91 79358.73 158619.64 30.066
## 6 African Union: Eastern Africa 195662.62 196901.05 392563.67 61.404
Note Here the values are in the unit of thousands.
world_pop <- left_join(world, pop_info, by = "region")
gg_pop <- ggplot(data = world_pop, aes(x=long, y = lat, group = group, fill =PopTotal))
gg_pop + geom_polygon(color = "white") +
scale_fill_gradientn(colours = rev(rainbow(8)))
plot
ggmap
is an R package that makes it easy to retrieve raster map tiles from popular online mapping services like Google Maps and Stamen Maps and plot them using the ggplot2
framework:
You can find more information about the R package here
ggmap()
much as you would with ggplot()
The ggmap()
related tutorials here requires that you need to register for an API with google to be able to do this! Follow the instructions below to get an API:
Note To use Google maps, it is necessary to get an API key from google for your account and then register for an API key for example for a free trail. They might ask for your credit-card information! Then under credentials
you should be able to find your API key and copy it. Then after lpading ggmap
package in your R, you should able to register the key using the function register_google()
.
if (!requireNamespace("ggmap", quietly = TRUE)){
install.packages("ggmap", dependencies = TRUE)
}
library(ggmap)
register_google(<INSERT YOUR PERSONAL KEY HERE>)
sisquoc <- read.table("data/ggmap_data/sisquoc-points.txt", sep = "\t", header = TRUE)
head(sisquoc)
output
## name lon lat
## 1 a17 -119.7603 34.75474
## 2 a20-24 -119.7563 34.75380
## 3 a25-28 -119.7537 34.75371
## 4 a18,19 -119.7573 34.75409
## 5 a35,36 -119.7467 34.75144
## 6 a31 -119.7478 34.75234
Note ggmap
tends to use “lon” instead of “long” for longitude.
Now that we have the zoom region we are after, we can get the map from “google” using the get_map()
function. Then plot the points on the river from the data using geom_point()
get_map()
function requires:
location
: for which I prefer to give a centroid location meaning like one location point with latitude and longtitude. So, here one could take the mean of all the latitude and longtitude values.maptype
and source
: This is more upto your taste. You can find all the different options by ?get_map()
zoom
: This typically from the center point how much zoom do you need to acquire: In google maps, this is in the range of 3 (to a continent) to 21 (a house)ll_means <- sapply(sisquoc[2:3], mean)
sq_map2 <- get_map(location = as.array(ll_means), maptype = "satellite", source = "google", zoom = 15)
ggmap(sq_map2) + geom_point(data = sisquoc, mapping = aes(x = lon, y = lat), color = "red")
In here, we just managed to get an average of all the latitudes and the longtiudes in our data points that would give a centroid value and grabbed the map around the centroid.
Here we can also add some more information around these points
ggmap(sq_map2) +
geom_point(data = sisquoc, mapping = aes(x = lon, y = lat), color = "red", size = 3) +
geom_text(data = sisquoc, aes(x = lon, y = lat, label = paste(" ", as.character(name), sep="")), angle = 60, hjust = 0, color = "yellow")
How about if we use the “terrain” type of map:
sq_map3 <- get_map(location = ll_means, maptype = "terrain", source = "google", zoom = 15)
ggmap(sq_map3) +
geom_point(data = sisquoc, mapping = aes(x = lon, y = lat), color = "red", size = 3) +
geom_text(data = sisquoc, aes(x = lon, y = lat, label = paste(" ", as.character(name), sep="")), angle = 60, hjust = 0, color = "yellow")
Let us take the data points from a bike ride and see how we can plot it better:
run_df <- read.csv("data/ggmap_data/running.csv")
head(run_df)
output
## lat lon elevation time
## 1 55.72165 13.21811 74.2 2016-07-21T14:54:18Z
## 2 55.72163 13.21827 74.3 2016-07-21T14:54:27Z
## 3 55.72161 13.21841 74.4 2016-07-21T14:54:30Z
## 4 55.72159 13.21858 74.4 2016-07-21T14:54:33Z
## 5 55.72160 13.21877 74.5 2016-07-21T14:54:37Z
## 6 55.72165 13.21890 74.6 2016-07-21T14:54:40Z
Now let us try to plot the data:
run_means <- sapply(run_df[1:2], mean)
runmap1 <- get_map(location = run_means, maptype = "terrain", source = "google", zoom = 14)
ggmap(runmap1) +
geom_path(data = run_df, aes(x = lon, y = lat, color = elevation), size = 3, lineend = "round") +
scale_color_gradientn(colours = rev(rainbow(7)))
Note getting the right zoom and position for the map is sort of trial and error. You can go to google maps to figure out where the center should be (right click and choose “What’s here?” to get the lat-long of any point)
I found one of the simplest Swedish maps that could be used in R from Reinholdsson’s GitHub page Here, I will show you how to install it in your R and then how to use it! This package gives you the map-data for Sweden like the one we had with the world-map. The dtasets that are included are:
map_ln
: regions polygon datamap_kn
: municipalities polygon datacent_ln
: regions centroids datacent_kn
: municipalities centroids dataLet us take a quick look on how the map_ln
data looks like:
if (!requireNamespace("devtools", quietly = TRUE)){
install.packages("devtools", dependencies = TRUE)
}
if (!requireNamespace("swemaps", quietly = TRUE)){
devtools::install_github("reinholdsson/swemaps", dependencies = TRUE)
}
library(swemaps)
head(map_ln)
output
## lnkod order piece group ggplot_long ggplot_lat leaflet_long leaflet_lat
## 1 0001 1 1 0.1 674300.8 6580047 18.06328 59.32317
## 2 0001 2 1 0.1 674495.0 6581094 18.06754 59.33248
## 3 0001 3 1 0.1 673269.9 6580797 18.04579 59.33032
## 4 0001 4 1 0.1 672305.9 6580724 18.02882 59.33006
## 5 0001 5 1 0.1 671274.1 6581212 18.01110 59.33486
## 6 0001 6 1 0.1 670197.8 6582411 17.99315 59.34605
## lnnamn landareakm
## 1 Stockholms län 6487.9
## 2 Stockholms län 6487.9
## 3 Stockholms län 6487.9
## 4 Stockholms län 6487.9
## 5 Stockholms län 6487.9
## 6 Stockholms län 6487.9
And the map_kn
data:
head(map_kn)
output
## knkod order piece group ggplot_long ggplot_lat leaflet_long leaflet_lat lnkod
## 1 0114 1 1 0.1 665734.0 6599286 17.92782 59.49916 0001
## 2 0114 2 1 0.1 664503.5 6598352 17.90539 59.49127 0001
## 3 0114 3 1 0.1 663096.4 6598467 17.88067 59.49285 0001
## 4 0114 4 1 0.1 662865.2 6597495 17.87585 59.48422 0001
## 5 0114 5 1 0.1 660757.5 6596513 17.83794 59.47623 0001
## 6 0114 6 1 0.1 660952.3 6598920 17.84319 59.49774 0001
## knnamn lnnamn landareakm
## 1 Upplands Väsby Stockholms län 75.4
## 2 Upplands Väsby Stockholms län 75.4
## 3 Upplands Väsby Stockholms län 75.4
## 4 Upplands Väsby Stockholms län 75.4
## 5 Upplands Väsby Stockholms län 75.4
## 6 Upplands Väsby Stockholms län 75.4
It is quite clear from the beginning that map_kn
is more detailed than the map_ln
, as expected.
gg_swe <- ggplot(data = map_kn, aes(x=leaflet_long, y = leaflet_lat)) +
geom_polygon(aes(group = group, fill = lnnamn), color = "white") +
coord_fixed(1.5)
gg_swe
plot
Note Here we use coord_fixed()
for the first time, this is to control the dimentions of the axis to keep the map for in a proper shape than make it looked streched in one axis or the other.
Let us take a quick look on how and what actually is the cent_ln
data contain.
Here, let us take cent_ln
for example:
head(cent_ln)
output
## lnkod ggplot_long ggplot_lat leaflet_long leaflet_lat lnnamn
## 1 0001 688467.8 6624421 18.35107 59.71489 Stockholms län
## 2 0003 644137.1 6661613 17.58976 60.06625 Uppsala län
## 3 0004 596461.6 6548892 16.68244 59.06848 Södermanlands län
## 4 0005 541011.9 6469637 15.70100 58.36568 Östergötlands län
## 5 0006 464673.6 6375090 14.41027 57.51698 Jönköpings län
## 6 0007 473304.1 6295472 14.56286 56.80237 Kronobergs län
As you can see, it is basically the centroid points of each of the län. And similarly, cent_kn
contains the centroids of each kommun. We can for example use this datasets in this way:
gg_cen <- ggplot() +
geom_polygon(data = map_kn, aes(x=leaflet_long, y = leaflet_lat, group = group, fill = lnnamn)) +
coord_fixed(1.3) +
geom_text(data = cent_ln, aes(x=leaflet_long, y = leaflet_lat, label = lnnamn), color = "black", size = 3) +
theme(legend.position = "none") +
theme_nothing()
gg_cen
plot
Note It is important to remember that when you use multiple datsets, you can initate the plot by just ggplot()
and add the “data” and “aesthetics” in each of the geoms
you are plotting.
Now, similar to world-map let us try to extract a particular län, like our “Skåne”.
scania <- subset(map_kn, lnnamn %in% c("Skåne län"))
gg_scania <- ggplot(data = scania, aes(x=leaflet_long, y=leaflet_lat)) +
geom_polygon(aes(group = group, fill = knnamn)) +
coord_fixed(1.1) +
theme_void()
gg_scania
plot
Here I will show a quick demo of how to make interactive maps in R based on a package called leaflet
. This is the reason why swemaps
had the labels leaflet_long
and leaflet_lat
.
There is quite a lot of things you can do with this! You can follow the tutorial here. But here I will show you some of the very simple things.
if (!requireNamespace("leaflet", quietly = TRUE)){
devtools::install_github("rstudio/leaflet", dependencies = TRUE)
}
library(leaflet)
m <- leaflet() %>% setView(lng = 13.1914692, lat = 55.7040588, zoom = 12)
m %>% addTiles()
Now let us add some data to these maps. Let us first look into the discrete data that we have.
m <- leaflet(data = sisquoc) %>% setView(lng = -119.75132, lat = 34.75312, zoom = 15)
m %>% addTiles() %>%
addMarkers(~lon, ~lat, popup = ~as.character(name), label = ~as.character(name))
Now let us see if we can make it look prettier:
m <- leaflet(data = sisquoc) %>% setView(lng = -119.75132, lat = 34.75312, zoom = 15)
m %>% addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
addMarkers(~lon, ~lat, popup = ~as.character(name), label = ~as.character(name))
Note Here you can choose the different kinds of base maps available from the function addProviderTiles()
Let us see, if we can add the GPS recordings to in a similar interactive map.
m <- leaflet(data = run_df) %>% setView(lng = 13.21943, lat = 55.73118, zoom = 14)
m %>% addProviderTiles(providers$CartoDB.Voyager) %>%
addPolylines(~lon, ~lat)
Now, let us try to make something pretty with it.
m <- leaflet(data = run_df) %>% setView(lng = 13.21943, lat = 55.73118, zoom = 14)
m %>% addProviderTiles(providers$CartoDB.Voyager) %>%
addPolylines(~lon, ~lat, color = "blue") %>%
addCircleMarkers(data = run_df[1,], ~lon, ~lat, stroke = FALSE, fillOpacity = 0.5, label = "Start", color = "green") %>%
addCircleMarkers(data = run_df[667,], ~lon, ~lat, stroke = FALSE, fillOpacity = 0.5, label = "End", color = "red") %>%
addMarkers(data = run_df[seq(50,667, 50), ], ~lon, ~lat, popup = ~as.character(elevation), label = ~as.character(time))
Note Here I have used seq()
function to create a sequence of numbers to pick up the GPS positions from equal intervals. Then using that function I subset the dataframe run_df
in the last line of the command above. In the addCircleMarkers()
function I only used the first and the last line of the dataframe to specify the start and the end.
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] leaflet_2.0.4.1 swemaps_1.0 mapdata_2.3.0
## [4] maps_3.4.0 ggpubr_0.4.0 cowplot_1.1.1
## [7] ggthemes_4.2.4 scales_1.1.1 ggrepel_0.9.1
## [10] wesanderson_0.3.6 forcats_0.5.1 stringr_1.4.0
## [13] purrr_0.3.4 readr_2.1.1 tidyr_1.1.4
## [16] tibble_3.1.6 tidyverse_1.3.1 reshape2_1.4.4
## [19] ggplot2_3.3.5 formattable_0.2.1 kableExtra_1.3.4
## [22] dplyr_1.0.7 lubridate_1.8.0 yaml_2.2.1
## [25] fontawesome_0.2.2.9000 captioner_2.2.3 bookdown_0.24
## [28] knitr_1.37
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-153 fs_1.5.2 webshot_0.5.2
## [4] httr_1.4.2 tools_4.1.2 backports_1.4.1
## [7] bslib_0.3.1 utf8_1.2.2 R6_2.5.1
## [10] DBI_1.1.2 mgcv_1.8-38 colorspace_2.0-2
## [13] withr_2.4.3 gridExtra_2.3 tidyselect_1.1.1
## [16] compiler_4.1.2 cli_3.1.0 rvest_1.0.2
## [19] xml2_1.3.3 labeling_0.4.2 sass_0.4.0
## [22] systemfonts_1.0.3 digest_0.6.29 rmarkdown_2.11
## [25] svglite_2.0.0 pkgconfig_2.0.3 htmltools_0.5.2
## [28] dbplyr_2.1.1 fastmap_1.1.0 highr_0.9
## [31] htmlwidgets_1.5.4 rlang_0.4.12 readxl_1.3.1
## [34] rstudioapi_0.13 jquerylib_0.1.4 farver_2.1.0
## [37] generics_0.1.1 jsonlite_1.7.3 crosstalk_1.2.0
## [40] car_3.0-12 magrittr_2.0.1 Matrix_1.3-4
## [43] Rcpp_1.0.8 munsell_0.5.0 fansi_1.0.2
## [46] abind_1.4-5 lifecycle_1.0.1 stringi_1.7.6
## [49] carData_3.0-5 plyr_1.8.6 grid_4.1.2
## [52] crayon_1.4.2 lattice_0.20-45 haven_2.4.3
## [55] splines_4.1.2 hms_1.1.1 pillar_1.6.4
## [58] ggsignif_0.6.3 reprex_2.0.1 glue_1.6.0
## [61] evaluate_0.14 leaflet.providers_1.9.0 modelr_0.1.8
## [64] vctrs_0.3.8 tzdb_0.2.0 cellranger_1.1.0
## [67] gtable_0.3.0 assertthat_0.2.1 xfun_0.29
## [70] broom_0.7.11 rstatix_0.7.0 viridisLite_0.4.0
## [73] ellipsis_0.3.2