In this tutorial, we will exemplify some model testing approaches using R. We will use nested multiple logistic regression models on the previously used GWAS data; since this use case borders on feature selection, we will also touch upon both frequentist and Bayesian approaches to this.
We will re-use the data from the GWAS workshop, but for practice purposes analyze it using logistic regression.
We start by extracting the genotypes and phenotypes as data frames from the GenABEL data format. For simplicity, we will assume a linear coding of genotypes and an additive association model – notice that this may or may not be the optimal choice. We save them in a joint data frame, mydata.
Several model comparison tests in R (lrtest, anova) require that the compared models are estimated on exactly the same data set; this means that variables with missing values, which are excluded during model fitting, will cause problems. For simplicity, we therefore exclude all rows with missing values in any SNP.
We will make use of the 20 best SNPs from the results in the GWAS workshop. We save the names of these in the vector top_snps.
As this procedure may take some time, I have saved the resulting mydata and top_snps in the R file mydata.R, which can be loaded into R directly. This file should be available via a download link on the course programme page. Download it, load it into R, and look at the first entries of mydata and top_snps.
myRfile="data/mydata.R" # change this to fit the location where you download the R-file
if(!file.exists(myRfile)){
  # extract genotypes and phenotypes as data frames
  genot=as.data.frame(as.numeric(gtdata(genodata.qc0)))
  phenot=as.data.frame(genodata.qc0)
  # pick the 20 SNPs with the lowest single-SNP P-values (P1df)
  top_snps=an@results[,"P1df", drop=FALSE]
  top_snps=data.frame(rownames(top_snps),top_snps)[order(top_snps$P1df),"rownames.top_snps."][1:20]
  # join genotypes and phenotypes in one data frame
  mydata=merge(genot[,names(genot) %in% top_snps], phenot, by="row.names")
  row.names(mydata)=mydata$Row.names
  mydata[,c("id","Row.names")]=NULL
  # remove rows with missing values
  mydata=mydata[complete.cases(mydata),]
  save(mydata,top_snps, file=myRfile)
}else{
  load(myRfile)
}
head(mydata)
head(top_snps)
## [1] rs12629868 rs6444131 rs4541401 rs10871764 rs11919115 rs2665749
## 727777 Levels: AFFX-SNP_10105880__rs3737108 ... SNP_A-2280133
We want to investigate how the top SNPs identified in the GWAS workshop can be combined into a multivariate model. Do we need all SNPs or just a subset? How can we evaluate which subset is best?
Logistic regression is an appropriate model for an outcome \(Y\) that takes values that are either 0 or 1, and it is therefore typically used for case-control data, as in our case.
Formally, logistic regression is a generalized linear model (GLM) with a logit link function and a binomial dispersion function. Here we don’t model the outcome directly, but rather the log odds of the outcome:
\[\log \frac{Pr[y=1]}{1-Pr[y=1]}\sim\beta_0+ \beta_1 x_1 +\beta_2 x_2 + \ldots + \beta_n x_n,\] where \(\log \frac{Pr[y=1]}{1-Pr[y=1]}\) is the logit function of \(Pr[y=1]\). More generally we can write
\[logit(Pr[Y=1]) \sim X\boldsymbol{\beta}\] Even more formally, \(Y\) is assumed to follow a binomial distribution with success probability \(logit^{-1}(X\boldsymbol{\beta})\) (this is the dispersion function).
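To build some intuition for the link function, here is a small illustration of the logit and its inverse in R (not needed for fitting, since glm handles the link internally):
logit = function(p) log(p/(1-p))      # maps a probability in (0,1) to the whole real line
inv_logit = function(x) 1/(1+exp(-x)) # maps a linear predictor back to a probability
logit(0.5)      # 0: even odds
inv_logit(2.5)  # ~0.92: a linear predictor of 2.5 corresponds to Pr[y=1] of about 0.92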
R has a function glm, which is the GLM version of the lm function for linear models. To get glm to fit a logistic model, set the argument family=binomial(link=logit). Look up the help for glm and try to create a list of 20 GLMs that include an increasing number of SNPs from the top_snps list, with sex as a covariate. Print each GLM.
models = list()
# fit 20 nested models, M1..M20, adding one SNP at a time (sex is always included)
for(n in 1:20){
  models[[n]] = glm(as.formula(paste("pheno ~ sex + ", paste(top_snps[seq(1,n)], collapse=" + "))), data=mydata, family=binomial(link=logit))
  print(paste0("M", n))
  print(models[[n]]$formula)
}
## [1] "M1"
## pheno ~ sex + rs12629868
## [1] "M2"
## pheno ~ sex + rs12629868 + rs6444131
## [1] "M3"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401
## [1] "M4"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764
## [1] "M5"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115
## [1] "M6"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749
## [1] "M7"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834
## [1] "M8"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228
## [1] "M9"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513
## [1] "M10"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672
## [1] "M11"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686
## [1] "M12"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345
## [1] "M13"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133
## [1] "M14"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955
## [1] "M15"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955 +
## rs6909585
## [1] "M16"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955 +
## rs6909585 + rs2250795
## [1] "M17"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955 +
## rs6909585 + rs2250795 + rs9450749
## [1] "M18"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955 +
## rs6909585 + rs2250795 + rs9450749 + rs8031841
## [1] "M19"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955 +
## rs6909585 + rs2250795 + rs9450749 + rs8031841 + rs1279446
## [1] "M20"
## pheno ~ sex + rs12629868 + rs6444131 + rs4541401 + rs10871764 +
## rs11919115 + rs2665749 + rs2036834 + rs7912228 + rs10080513 +
## rs310672 + rs9450686 + rs6777345 + rs17768133 + rs13316955 +
## rs6909585 + rs2250795 + rs9450749 + rs8031841 + rs1279446 +
## rs7671868
We will start by comparing our models using a likelihood ratio test (LRT)
\[LR=\frac{mL[M_{i-1}|X,Y]}{mL[M_{i}|X,Y]},\]
where \(mL[M_i|X,Y]\) is the maximum likelihood of model \(M_i\). Recall that \(-2\log LR\sim \chi^2(\delta)\), where \(\delta\) is the difference in dimension (number of variables) between the models; this provides us with a significance test.
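To see what the test does in practice, the statistic and p-value can be computed by hand from two fitted models (a small illustration using the first two models fitted above; the lrtest function used below does exactly this):
ll1 = logLik(models[[1]])                # maximum log likelihood of the smaller model
ll2 = logLik(models[[2]])                # maximum log likelihood of the larger model
chi2 = as.numeric(2*(ll2 - ll1))         # -2 log LR
df = attr(ll2, "df") - attr(ll1, "df")   # difference in number of estimated parameters
pchisq(chi2, df=df, lower.tail=FALSE)    # the LRT p-value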
We will use the lrtest function in the lmtest package to perform likelihood ratio tests (LRTs) sequentially between models with an increasing number of variables. One way of doing this is to loop through the different model comparisons, store the results in a data.frame, and then display them in a table. Try to implement this.
library(lmtest)
lrt=data.frame(models="1 variable", added_variable=top_snps[1],llprev=NA, llnew=NA, lr=NA, P = NA, significant=NA,stringsAsFactors=FALSE) # data.frame to store results in
for(i in seq(2,20)){
  fit=lrtest(models[[i-1]], models[[i]]) # perform the LRT
  p=fit$`Pr(>Chisq)`[2] # alias for the p-value (for convenience of typing)
  lr=fit$LogLik[1]-fit$LogLik[2] # compute the log likelihood ratio
  sig=ifelse(p > 0.05,"-",ifelse(p > 0.01,"*",ifelse(p > 0.005, "**","***"))) # significance level; named sig to avoid shadowing the signif() function
  # store the result
  lrt[i,] = list(paste0(i," variables"), top_snps[i], signif(fit$LogLik[1], 5), signif(fit$LogLik[2], 5), format(lr, digits=4), format(p, digits=3, scientific=-1), sig)
}
lrt[order(as.numeric(lrt$P), na.last=T),]
However, we find that lrtest, in the spirit of R, can take more than two models as arguments, e.g., lrtest(models[[1]], models[[2]], models[[3]]). We can therefore perform exactly the sequential test we had in mind in one function call. How would you do that? (tip: look up the do.call function).
# Since we have fitted models in a list, we make use of the do.call function, which
# extracts the elements in the list and sends them as arguments to the lrtest call.
lrt2=do.call("lrtest", models)
lrt2
Moreover, there are other functions that can perform LRTs. The anova function, when presented with more than one model, performs an analysis of deviance test; since the deviance is equivalent to the LR, this test is equivalent to an LRT. Try to perform an analysis of deviance using the anova function (tip: to get p-values, you need to pass the argument test="Chisq" to anova).
# If we want p-values, we need to pass the named argument test="Chisq" to anova;
# we therefore add it to our list of arguments (i.e., the list of models)
do.call("anova", c(models, list("test"="Chisq")))
We will wait a while before evaluating and interpreting the results.
The LRT only makes use of the relative likelihood when comparing models. Often, however, the aim is also to find as small a model as possible, i.e., to balance the likelihood against having as few variables as possible. The Akaike Information Criterion (AIC) provides one such balance. The AIC for a model \(m\) is
\[AIC_m = 2\#(\boldsymbol{\beta}\neq 0)-2\log mL[\boldsymbol{\beta}|X,Y],\] where \(\#(\boldsymbol{\beta}\neq 0)\) is the number of non-zero effect sizes in the model. As we saw in the lecture, this is actually a penalized LRT. The aim is to select the model with minimum \(AIC\).
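As a sanity check, the AIC of any of our fitted models can be computed by hand from this formula (a small illustration; the value agrees with R's built-in AIC function):
k = attr(logLik(models[[1]]), "df")      # number of estimated parameters
2*k - 2*as.numeric(logLik(models[[1]]))  # same value as AIC(models[[1]])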
We can compare the AIC between our models using the AIC function in the stats package. Try to look up the help for AIC and perform such an analysis (tip: for some unknown reason, do.call on AIC returns a garbled-up return value, but, equally mysteriously, it suffices to set the row names of this return value to NULL!).
# wrapper around AIC that also cleans up the garbled row names
my.aic <- function(x) {
  x <- do.call(AIC, x)
  rownames(x) <- NULL
  return(x)
}
aic=my.aic(models)
aic
If we look at the output from the glm function for each model, we see that glm actually computes the AIC, so we could just list these:
for(i in seq(1,20)){
print(paste0("M",i,": aic = ",models[[i]]$aic))
}
## [1] "M1: aic = 191.413826142946"
## [1] "M2: aic = 191.347340504348"
## [1] "M3: aic = 193.134436330627"
## [1] "M4: aic = 176.762372070646"
## [1] "M5: aic = 178.663456001373"
## [1] "M6: aic = 170.538222942918"
## [1] "M7: aic = 169.411055539535"
## [1] "M8: aic = 157.4572711016"
## [1] "M9: aic = 151.317844935567"
## [1] "M10: aic = 145.029287082904"
## [1] "M11: aic = 143.961602761146"
## [1] "M12: aic = 145.956326173936"
## [1] "M13: aic = 144.684882688072"
## [1] "M14: aic = 146.682762084756"
## [1] "M15: aic = 148.142792468267"
## [1] "M16: aic = 149.23617930652"
## [1] "M17: aic = 151.230782385327"
## [1] "M18: aic = 130.240122525795"
## [1] "M19: aic = 121.55817408892"
## [1] "M20: aic = 119.609696365098"
If we plot the log likelihoods (significant changes marked by *) and the AICs, we see that the “best” model overall seems to be the one including all 20 SNPs, in that the addition of variable 20 improves the likelihood significantly and this model has the overall minimal AIC.
However, we also notice that the improvement is far from monotone; several added variables barely change the likelihood, and some even increase the AIC:
par(mfrow=c(1,2))
plot(lrt2$LogLik, pch=ifelse(lrt2$`Pr(>Chisq)`<0.05, 8,1)) # log likelihood per model; stars mark significant improvements
legend("bottomright",legend=c('signif', 'not'), pch=c(8,1))
plot(aic$AIC) # AIC per model
Try to think about why this is so and why this could be a problem (then click the code button).
# We add variables sequentially according to their significance level in the single
# SNP GWAS analysis. However, this appears not to be the optimal order in which to
# add the variables in a multivariate model.
#
# Variables 2 and 3 probably reflect the same signal as variable 1, perhaps because
# the three SNPs are in LD; hence these SNPs appear significant in the single SNP
# analysis, but only marginally improve the likelihood in the multivariate analysis,
# and give only marginally better, or even worse, AIC.
#
# Variable 4 is likely to represent a new independent signal (significantly
# better likelihood and lower AIC), while variable 5 does not contribute anything
# new, and so on.
#
# This means that our "best" model, with all 20 SNPs included, contains a lot of
# unnecessary variables, which is what we want to avoid. In fact, finding the
# smallest good model is the whole idea of AIC, but if we had stopped
# as soon as the AIC went up, we would have missed, e.g., variable 4, which seems
# clearly relevant to include in the model.
#
# Hence, balancing maximization of likelihood against minimization of model complexity
# (i.e., the number of variables) is not the whole solution. We obviously need to
# think about the order in which variables are added to the model.
Think about how we can solve this problem (then click the code button).
# There are several different approaches one could use:
#
# 1. BEST SUBSET REGRESSION
# The brute-force solution would be to fit all possible models that include subsets
# of our 20 variables in different combinations and test them against each other,
# e.g., using AIC. However, this "best subset" approach quickly becomes too cumbersome
# as the number of possible variables to include grows. For our case, there are
# 2^20=1048576 possible models to test against each other (549755289600 pairwise tests).
#
# 2. STEPWISE REGRESSION
# One could use our original approach, but alter it so that each time an added
# variable yields a model with an insignificantly better likelihood or a worse AIC,
# we simply discard it and continue with the next variable.
# This is very close to the:
# 2a. FORWARD SELECTION variant of stepwise regression.
# Here, at each step, all remaining variables are tested for addition
# to the current model and the one with the best significant improvement (if
# any) is selected for addition.
# 2b. BACKWARD ELIMINATION variant of stepwise regression.
# Here, we instead start with all variables included and then try to remove one
# of them: all variants are tried and the one with the least, insignificant,
# change in likelihood/AIC (if any) is chosen for removal.
# (A small sketch using R's step() function follows below.)
#
# 3. FEATURE SELECTION -- LASSO
# Combine regularization on the summed effect sizes with an efficient algorithm
# for variable addition. This is one of the situations that lasso was designed for.
#
# 4. BAYESIAN MODELS
# The Bayesian approach would be to include all variables but apply prior
# distributions on models and model parameters (i.e., effect sizes, betas).
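As a side note, the stepwise approaches (2a/2b) are implemented in R's built-in step function (in the stats package). The following is a minimal sketch, not part of the original analysis, assuming the models list and mydata from above; it searches in both directions, guided by AIC, between the sex-only model and the full 20-SNP model:
null = glm(pheno ~ sex, data=mydata, family=binomial(link=logit)) # smallest model considered
full = models[[20]] # largest model considered
stepwise = step(null, scope=list(lower=formula(null), upper=formula(full)), direction="both", trace=0) # trace=0 suppresses per-step output
formula(stepwise) # the selected model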
A note on random seeds: the functions below are probabilistic and use random starting points, so the exact values of some output may vary a little between runs. By providing a seed to R’s random number generator, we force the analyses to be performed the same way each time. To obtain exactly the same results as in this document, set the seed to 85.
set.seed(85)
Lasso searches the model space and uses a regularization approach to select the best (nested) model. The optimization criterion can be written, e.g.,
\[\min_{\boldsymbol{\beta}}\left\{\frac{1}{N} ||Y-X\boldsymbol{\beta}||_2^2\right\} \textrm{ subject to } ||\boldsymbol{\beta}||_1\leq \lambda,\] i.e., least squares with a limit on the \(L_1\) norm of \(\boldsymbol{\beta}\) as an auxiliary criterion. (For logistic regression, as here, the least-squares term is replaced by the negative log likelihood of the binomial model, but the principle is the same.)
Try to implement a lasso analysis of our data. There are (at least) two packages that implement the lasso, lars+covTest and glmnet. However, the lars and covTest implementation for GLMs appears to be unstable, so use glmnet instead. glmnet uses a cross-validation approach to select the optimal \(\lambda\) to use (\(\lambda_{min}\)). Look up the help page for glmnet and try to identify \(\lambda_{min}\) and the effect sizes \(\beta_i\) of the different variables \(i\) at \(\lambda_{min}\). Also plot the coefficient profile for \(\boldsymbol{\beta}\) as a function of the lasso path of \(\lambda\), and the cross-validation curve used for estimating \(\lambda_{min}\) (tip: preferably use deviance as the type.measure when estimating \(\lambda_{min}\), as some of the alternatives – e.g., class – give unstable results).
library(glmnet)
# glmnet wants the input with predictors x and outcome y separated,
# so let's extract them from mydata
x=as.matrix(mydata[,-which(names(mydata) == "pheno")])
y=mydata$pheno
# run the lasso analysis over the full regularization path
glasso=glmnet(x,y,family="binomial")
# plot trace of effect sizes as variables are added with decreasing lambda
plot(glasso, xvar="norm",label=T)
# use cross validation to estimate optimal lambda
glasso_fit <- cv.glmnet(x, y, family="binomial", nfolds=50, type.measure="deviance")
plot(glasso_fit)
# extract the estimated coefficient under optimal lambda
coef=coef(glasso_fit, s = "lambda.min")
paste0("Optimal lambda = ", signif(glasso_fit$lambda.min, 5), " which selects a model with ",length(coef@i)," variables")
# tabulate the coefficients in decreasing order -- tricky extraction
# from the cv.glmnet class object!
effects=data.frame(snp=coef@Dimnames[[1]][coef@i+1], beta=coef@x)
effects[order(-abs(effects$beta)),]
## [1] "Optimal lambda = 0.012919 which selects a model with 16 variables"
Let’s also try a Bayesian approach to feature selection. The bas.glm function in the BAS package uses a Bayesian Adaptive Sampling algorithm (or MCMC) for variable selection in GLMs. It explores the landscape of possible models and assigns a posterior probability to each variable being included in the model, given the data.
The posterior probability distribution over models \(M\) and their associated variable effect sizes \(\boldsymbol{\beta}\) is
\[Pr[M, \boldsymbol{\beta}|Y, X] = \frac{Pr[Y|M, \boldsymbol{\beta}, X]Pr[\boldsymbol{\beta}|M]Pr[M]}{Pr[Y|X]}\] where \(Pr[\boldsymbol{\beta}|M]\) and \(Pr[M]\) are prior distributions over effect sizes and models, respectively.
The posterior inclusion probability for variable \(i\) is the probability that its effect size \(\beta_i \neq 0\) (remember that our nested models can be described in terms of a more general model by constraining some parameters, e.g., \(\beta_i\), to \(0\)). This probability can be computed by averaging the event \(\beta_i \neq 0\) over the posterior distribution, \[Pr[\beta_i \neq 0|Y,X] = \sum_{M}\int_{\boldsymbol{\beta}} I(\beta_i\neq 0)Pr[M, \boldsymbol{\beta}|Y, X] d\boldsymbol{\beta} dM,\] where the indicator function \(I(\beta_i\neq 0) = 1\) if \(\beta_i\neq 0\) and \(0\) otherwise.
bas.glm provides a few choices of \(Pr[\boldsymbol{\beta}|M]\) (including AIC and BIC) and \(Pr[M]\). Read the help page for bas.glm and try to perform a Bayesian variable selection using bas.glm (tip: as bas.glm takes some time searching through the model/variable space, it might be a good idea to reduce the number of possible variables in the GLM by, say, only using the top 5 SNPs in top_snps while trying out your code – once your code is stable, you can apply it to the full model).
library(BAS)
nvars = 20 # change to lower values when testing
f = as.formula(paste("pheno ~ sex + ", paste(top_snps[seq(1,nvars)], collapse=" + ")))
#
bayes=bas.glm(f, data=mydata, family=binomial(link=logit),betaprior=aic.prior(), modelprior=uniform())
bayes
bayes$n.vars # the number of variables considered (intercept + sex + 20 SNPs)
## [1] "Number of models is BIG -this may take a while"
##
## Call:
## bas.glm(formula = f, family = binomial(link = logit), data = mydata,
## betaprior = aic.prior(), modelprior = uniform())
##
##
## Marginal Posterior Inclusion Probabilities:
## Intercept sex rs12629868 rs6444131 rs4541401 rs10871764
## 1.0000 0.2851 0.8308 0.3979 0.4242 0.6146
## rs11919115 rs2665749 rs2036834 rs7912228 rs10080513 rs310672
## 0.4539 0.5992 0.6355 0.9630 0.3464 0.6523
## rs9450686 rs6777345 rs17768133 rs13316955 rs6909585 rs2250795
## 0.5103 0.4058 0.6316 0.2848 0.4162 0.3025
## rs9450749 rs8031841 rs1279446 rs7671868
## 0.3297 0.9992 0.9933 0.8261
## [1] 22
Which variables would you consider important for the model?
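One way to answer this is to rank the variables by their marginal posterior inclusion probabilities, shown in the output above. A small sketch, assuming the probne0 and namesx components of the fitted bas object (which hold the inclusion probabilities and the variable names, respectively); a common rule of thumb is to focus on variables with probability above 0.5 (the median probability model):
incl = data.frame(variable=bayes$namesx, prob=bayes$probne0) # assumed components of the bas fit
incl[order(-incl$prob),] # rs8031841, rs1279446 and rs7912228 top the list in the run above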
This is the end of this file, but not the end of the lab/tutorial/workshop. For further applications of Lasso and other feature selection methods, please now go on to do items 9-12 in the General Statistics lab.
There is no course lab on integrative methods, as it is a quite specialized topic. However, those interested can try out the MixOmics tutorials for their MINT and DIABLO methods. Also look at the descriptions of MINT and DIABLO.
sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
##
## locale:
## [1] sv_SE.UTF-8/sv_SE.UTF-8/sv_SE.UTF-8/C/sv_SE.UTF-8/sv_SE.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] BAS_1.5.0 covTest_1.02 MASS_7.3-50 glmpath_0.98
## [5] survival_2.42-3 glmnet_2.0-16 foreach_1.4.4 Matrix_1.2-14
## [9] lars_1.2 lmtest_0.9-36 zoo_1.8-1 kableExtra_0.9.0
## [13] forcats_0.3.0 stringr_1.3.1 dplyr_0.7.5 purrr_0.2.5
## [17] readr_1.1.1 tidyr_0.8.1 tibble_1.4.2 ggplot2_2.2.1
## [21] tidyverse_1.2.1 captioner_2.2.3 bookdown_0.7 knitr_1.20
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.17 lubridate_1.7.4 lattice_0.20-35
## [4] assertthat_0.2.0 rprojroot_1.3-2 digest_0.6.15
## [7] psych_1.8.4 R6_2.2.2 cellranger_1.1.0
## [10] plyr_1.8.4 backports_1.1.2 evaluate_0.10.1
## [13] httr_1.3.1 pillar_1.2.3 rlang_0.2.1
## [16] lazyeval_0.2.1 readxl_1.1.0 rstudioapi_0.7
## [19] rmarkdown_1.10 splines_3.4.4 foreign_0.8-70
## [22] munsell_0.5.0 broom_0.4.4 compiler_3.4.4
## [25] modelr_0.1.2 xfun_0.1 pkgconfig_2.0.1
## [28] mnormt_1.5-5 htmltools_0.3.6 tidyselect_0.2.4
## [31] codetools_0.2-15 viridisLite_0.3.0 crayon_1.3.4
## [34] grid_3.4.4 nlme_3.1-137 jsonlite_1.5
## [37] gtable_0.2.0 magrittr_1.5 scales_0.5.0
## [40] cli_1.0.0 stringi_1.2.3 reshape2_1.4.3
## [43] bindrcpp_0.2.2 xml2_1.2.0 iterators_1.0.9
## [46] tools_3.4.4 Cairo_1.5-9 glue_1.2.0
## [49] hms_0.4.2 parallel_3.4.4 yaml_2.1.19
## [52] colorspace_1.3-2 rvest_0.3.2 bindr_0.1.1
## [55] haven_1.1.1