Installation for this session can be achieved by following two alternatives.
Navigate to the “session_meta/” directory of the workshop and run following code chunk.
## make sure that the renv.lock file for the session exists and do following.
install.packages("renv")
::restore(lockfile = "./Meta_tutorial/renv_r_4_1_0.lock", prompt = TRUE) renv
########### Rstudio image for meta analyses ###########
# Your user is 'omics' (without the quotes)
# Replace <yourpassword> with your desired password
docker run -d --rm -p 8787:8787 -e PASSWORD=<yourpassword> ash706/omicsint_r
Meta analyses are commonly used in clinical studies to assess the effect of a treatment or a genetic locus on a phenotype. With the advent of GWAS (Genome wide association studies), various studies report summary statistics on associated effects of genetic loci on phenotype. Meta analyses can be useful in determining whether the effect size is consistent across the body of data
The goal of a synthesis is to understand the results of any study in the context of all the other studies. First, we need to know whether or not the effect size is consistent across the body of data. If it is consistent, then we want to estimate the effect size as accurately as possible and to report that it is robust across the kinds of studies included in the synthesis. On the other hand, if it varies substantially from study to study, we want to quantify the extent of the variance and consider the implications. Meta-analysis is able to address these issues whereas the narrative review is not. We start with an example to show how meta-analysis and narrative review would approach the same question, and then use this example to highlight the key differences between the two.
Note: The assumption of uniformly distributed P-values under the null hypothesis or can be done non-parametrically by permutation-based analysis
In addition to statistical methods mentioned above, a number of factors need to be considered for the choice of method. Different test statistics maybe used depending on the type of outcome variable (e.g. t-statistic or moderated t-statistic for binary outcome, F-statistic for multi-class outcome, regression or correlation coefficient for continuous outcome and log-rank statistic for survival outcome).
Here we will be using MetaDE package that has implemented above mentioned methods.
A summary of the methods and their implementations for respective outcome variables are listed in the table below as presented in the original publication.
For a detailed review of the applied methods for meta analyses, related benchmarks and
library(magrittr)
library(plyr)
library(preproc)
library(MetaQC)
library(MetaDE)
library(MetaPath)
library(MetaDCN)
I this session we will be working with the “prostate8.rda” data set that is located in the data directory of the “session_meta” directory of the github repository for the course.
Some of the R packages packages we are using here are not available through conda repositories and therefore need to be install from the github repositories for the respective packages
The session comprises of following main steps
We will using prostate dataset
Load the protaste data set
load(file = "./data/prostate8.rda")
The prostate data is comprised of 8 microarray studies from different microarray platforms. Here it is provided in a list format where data matrices and corresponding labels are provided.
Let us take a quick look at the data.
names(prostate8)
## [1] "data" "dataLabel"
We can have a look at the structure of each data sets and evaluate what sort of filtering steps we need to take.
library(magrittr)
lapply(prostate8$data, dim) %>% as.data.frame(row.names = c("genes", "samples"))
## Welsh Yu Lapointe Varambally Singh Wallace Nanni Tomlins
## genes 8798 8799 13579 19738 8799 12689 12688 9703
## samples 34 146 103 13 102 89 30 57
As we can see, the studies have different number of genes measured and contain different number of samples. In a meta analyses, studies can have different samples sizes but genes should match between different studies. To be able to merge data sets, all data matrices should have the same annotations
Let us look at the row names which are already set to gene names.
lapply(prostate8$data, row.names) %>% lapply(.,head, 5) %>% as.data.frame()
## Welsh Yu Lapointe Varambally Singh Wallace Nanni Tomlins
## 1 ACTB MAPK3 ITGB2 DDR1 MAPK3 DDR1 DDR1 ZFX
## 2 GLRA1 TIE1 HMBS PAX8 TIE1 RFC2 RFC2 PIGS
## 3 KCNB2 CXCR5 GATA6 THRA CXCR5 PAX8 HSPA6 ZPBP
## 4 MGAT5 DUSP1 ICAM5 CCL5 DUSP1 ESRRA PAX8 NOTCH3
## 5 BMP3 MMP10 RIN2 ESRRA MMP10 GAS6 PTPN21 IGSF11
It is clear that the gene names are set to gene symbols for all data sets and we can merge theses data set.
The second element of the lists corresponds to data labels where each value corresponds to disease status of the individual where the sample came from.
$dataLabel %>% lapply(. ,table) %>% as.data.frame() %>% t prostate8
## [,1] [,2]
## Welsh.Var1 "0" "1"
## Welsh.Freq " 9" "25"
## Yu.Var1 "0" "1"
## Yu.Freq "81" "65"
## Lapointe.Var1 "0" "1"
## Lapointe.Freq "41" "62"
## Varambally.Var1 "0" "1"
## Varambally.Freq "6" "7"
## Singh.Var1 "0" "1"
## Singh.Freq "50" "52"
## Wallace.Var1 "0" "1"
## Wallace.Freq "20" "69"
## Nanni.Var1 "0" "1"
## Nanni.Freq " 7" "23"
## Tomlins.Var1 "0" "1"
## Tomlins.Freq "27" "30"
Data labels indicate “0” as control and “1” as disease groups. We can set these values to character format for convenience.
If everything looks good, we are good to go to the next step.
Take a moment to explore the data in your own ways and fimilarise yourself wit it.
Usually different microarray platforms use their own probe IDs or experiments from different omics platforms can have annotations for transcripts of proteins. To perform meta-analysis, one needs to match probe/transcript IDs from different platforms to the unique official gene ID, such as ENTREZ ID or gene symbol.
Options for situations for microarray data include
We do not cover the pre-processing steps for different omics technologies at the moment but the workshop participants are encouraged to apply the domain specific knowledge when setting up a meta analyses study.
::metaOverlap(prostate8$data) %>%
MetaQClapply(dim) %>% as.data.frame(row.names = c("genes", "samples"))
## Welsh Yu Lapointe Varambally Singh Wallace Nanni Tomlins
## genes 4241 4241 4241 4241 4241 4241 4241 4241
## samples 34 146 103 13 102 89 30 57
As you can see here, only 4241 genes are present in all studies and if we are to filter out this way, we lose a lot of data.
Biologically, it is likely that most genes are either un-expressed or un-informative. In gene expression analysis to find DE genes, these genes contribute to the false discoveries, so it is desirable to filter out these genes prior to analysis. After genes are matched across studies, the unique gene symbols are available across all studies. Two sequential steps of gene filtering can be performed. In the first step, we filter out genes with very low gene expression that are identified with small average expression values across majority of studies. And then we can remove genes that are not variable in your data sets using variance estimates as they are not useful in comparisons.
You can take a look at preproc package for some of the functions
available for filtering the data and the intuition behind the methods.
Here Annotate(), Impute(), Filter() and Merge()
maybe useful for
pre-processing steps of the data analyses.
<- prostate8$data
data2 <- preproc::Merge(data2)
data2 = rep("microarray", length(data2)) # a character vector for each study type
data.type
<- preproc::Filter(data2, del.perc = c(0.1, 0.1),data.type = data.type ) # Here we specify the percentage of genes to be filtere, and specify the type of data for each study
data2_filt
<- rbind(as.data.frame(lapply(prostate8$data, dim))[1,],
summary_preproc as.data.frame(lapply(data2, dim))[1,],
as.data.frame(lapply(data2_filt, dim))
)
rownames(summary_preproc) <- c("orginal", "merged", "mergeFiltered", "samples")
summary_preproc
## Welsh Yu Lapointe Varambally Singh Wallace Nanni Tomlins
## orginal 8798 8799 13579 19738 8799 12689 12688 9703
## merged 4241 4241 4241 4241 4241 4241 4241 4241
## mergeFiltered 3434 3434 3434 3434 3434 3434 3434 3434
## samples 34 146 103 13 102 89 30 57
Note that the multiple gene expression data sets may not be very well aligned by genes, and the number of genes in each study maybe different. When we combine a large number of studies, the number of common genes may be very small, so we need to allow for genes appearing in most studies and missing in few studies etc.
Let us perform quality control of the data for this meta analyses. We are using the “MetaQC” that identifies ways to objectively perform quality control for the microarray studies.
load(file = "./data/pathways.rda")
=prostate8$data
DList=prostate8$dataLabel
colLabel#GList=pathway[[1]]
#GList=pathwayDatabase
=TRUE
filterGenes=0.3 #
cutRatioByMean=0.3
cutRatioByVar#tic() not run
=MetaQC(DList, colLabel, GList=c(Hallmark.genesets, KEGG.genesets, Immunologic.genesets),filterGenes,cutRatioByMean,cutRatioByVar ) ## This will take some time depending on the number of studies and the type of data
QCresult#toc()
If you are unable to run the above steps, please load the “QC.rda” from the results directory of the session.
$scoreTable QCresult
## IQC EQC AQCg AQCp CQCg CQCp
## Welsh 4.6297212 5.0993305 28.269084964 0.000000 1.671669e+02 1.434671e+02
## Yu 9.4947631 9.4876742 21.725636675 0.000000 1.594930e+02 1.383082e+02
## Lapointe 3.5445936 3.9678407 24.326120224 0.000000 9.159887e+01 2.045790e+01
## Varambally 4.2482870 3.7655707 4.272765763 0.000000 1.858428e+01 3.795110e+01
## Singh 0.8946880 2.0302734 14.740646685 6.945533 4.447641e+01 5.839619e+01
## Wallace 8.1666052 8.9729647 0.003464049 0.000000 4.453816e-05 3.183405e-04
## Nanni 0.8134973 0.6480432 0.000000000 0.000000 3.324084e-01 2.562951e-04
## Tomlins 0.9366496 0.4514895 0.984994108 0.000000 7.892800e+00 1.767292e+01
IQC, EQC, AQCg, AQCp, CQCg, CQCp
Internal quality control index:small IQC indicated that the study had heterogeneous coexpression structure with other studies and was considered a candidate problematic study that should be excluded from meta-analysis
the external quality control (EQC): small EQC indicated that the study had low association with pathway in terms of gene pairwise correlation structure and maybe considered a candidate problematic study.
accuracy quality control (AQC) and a consistency quality control (CQC).
Large AQCg measure for a given study indicate that DE genes produced by study were reproducible compared to DE genes detected by meta-analysis excluding study
Having a large CQCg measure for a given study indicated that DE evidence produced by study was consistent with DE evidence generated by meta-analysis excluding study. Please read the original MetQC publication for a better understanding of the measures.
We can now look at the two dimensional biplot based on PCA analyses and get an idea of any studies that may not be of great quality.
::plotMetaQC(QCresult$scoreTable) MetaQC
Although the plotMetaQC()
function is a useful tool for visualization,
it does not necessarily allow automatic inclusion/exclusion criteria.
Note that our visualization and summarization tools were not meant for an automated recommendation for inclusion/exclusion decision. In the examples we explored, there were roughly three categories in the QC results: definite exclusion cases with poor quality, definite inclusion cases with good quality and borderline cases.
Here Nanni and Tomlins are the two studies that maybe of low quality and should perhaps be excluded from the analyses.
=TRUE
filterGenes=0.3 #
cutRatioByMean=0.3
cutRatioByVar<- c("Nanni", "Tomlins")
to_remove <- list(data = within(prostate8$data, rm(Nanni, Tomlins)), dataLabel=within(prostate8$dataLabel, rm(Nanni, Tomlins)))
prostate6 <- list(data = MetaQC::metaOverlap(prostate6$data), dataLabel=prostate6$data)
prostate_fil $data <-MetaQC::metaFilterData(prostate_fil$data,cutRatioByVar = cutRatioByVar, cutRatioByMean = cutRatioByMean)
prostate6lapply(prostate6, names)
as.data.frame(lapply(prostate6$data, dim))
## $data
## [1] "Welsh" "Yu" "Lapointe" "Varambally" "Singh"
## [6] "Wallace"
##
## $dataLabel
## [1] "Welsh" "Yu" "Lapointe" "Varambally" "Singh"
## [6] "Wallace"
##
## Welsh Yu Lapointe Varambally Singh Wallace
## 1 3399 3399 3399 3399 3399 3399
## 2 34 146 103 13 102 89
=prostate6$data
DList=prostate6$dataLabel
colLabel#GList=pathway[[1]]
#GList=pathwayDatabase
=TRUE
filterGenes=0.3 #
cutRatioByMean=0.3
cutRatioByVar=MetaQC(DList, colLabel, GList=c(Hallmark.genesets, KEGG.genesets, Immunologic.genesets), filterGenes, cutRatioByMean, cutRatioByVar) QCresult2
AGAIN, if you are unable to run the code, please load
load("results/QC_step2.rda")
$scoreTable QCresult2
## IQC EQC AQCg AQCp CQCg CQCp
## Welsh 4.3956311 3.9486143 38.42833030 0 1.913178e+02 4.028780e-02
## Yu 7.0164906 6.8265364 42.25733812 0 1.964917e+02 1.920686e-14
## Lapointe 1.5414898 1.7423903 16.70625993 0 9.884958e+01 0.000000e+00
## Varambally 3.5800781 2.9649972 7.18472473 0 2.391464e+01 0.000000e+00
## Singh 0.4912204 0.3180309 12.36863835 0 4.555982e+01 0.000000e+00
## Wallace 6.4044351 6.7765190 0.02200338 0 2.785632e-03 0.000000e+00
plotMetaQC(scoreTable = QCresult2$scoreTable)
We have listed the methods for differential expression meta analyses in the introduction section that are implemented in the MetaDE package. Here we will try the best performing methods including AW Fischer, REM and rOP. A detailed comparison of relative performances of each of the statistical methods is descibed Lun-Ching etal.
<- prostate6$data # Extract the expression matrices
data <- prostate6$dataLabel # extract lables for each sample
clin.data
<- length(data)
K
<- lapply(clin.data, function(x) {data.frame(x)} )
clin.data
for (k in 1:length(clin.data)){
colnames(clin.data[[k]]) <- "label"
<- (ifelse(clin.data[[k]]==0, "Control", "Cancer"))
clin.data[[k]]
}
#clin.data <- lapply(clin.data, function(x) {data.frame(x)} )
<- c("Control", "Cancer")
select.group <- "Control"
ref.level <- "continuous"
data.type <- rep('limma',length(data))
ind.method <- "twoclass"
resp.type <- rep(FALSE,length(data))
paired
<- "Fisher"
meta.method
<- MetaDE(data=data,clin.data = clin.data,
meta.res data.type=data.type,resp.type = resp.type,
response='label',
ind.method=ind.method, meta.method=meta.method,
select.group = select.group, ref.level=ref.level,
paired=paired,tail='abs',parametric=TRUE)
<- MetaDE::summary.meta(meta.res, resp.type = resp.type, meta.method = meta.method)
meta.res.summary head(meta.res.summary)
# save(meta.res, meta.res.summary, file = "./results/resMetaDE.rda")
## Please make sure the following is correct:
## *You input 6 studies
## *You selected limma limma limma limma limma limma for your 6 studies respectively
## * Fisher was chosen to combine the 6 studies,respectively
## dataset 1 is done
## dataset 2 is done
## dataset 3 is done
## dataset 4 is done
## dataset 5 is done
## dataset 6 is done
## Parametric method was used instead of permutation
## ind.stat.Welsh ind.stat.Yu ind.stat.Lapointe ind.stat.Varambally
## GPR12 -0.01447604 0.37943628 0.28513849 0.1086048
## RPS19 0.20673665 0.23162454 0.77668332 0.1880854
## GALNT2 -0.11916266 0.11168920 0.36585563 0.3870752
## MSI1 0.29758865 0.02367612 0.37455130 0.6647996
## FCGRT -0.36240927 -0.10877216 -0.53210661 -0.3114009
## CD163 0.15648617 0.08638936 0.08651916 1.2875034
## ind.stat.Singh ind.stat.Wallace ind.p.Welsh ind.p.Yu ind.p.Lapointe
## GPR12 -0.106957489 -0.3301548 0.978695463 0.006504890 2.762384e-01
## RPS19 1.026749249 -0.4766147 0.016116661 0.002484948 3.768341e-10
## GALNT2 0.005348824 0.2825572 0.428375201 0.437773064 8.001044e-02
## MSI1 -0.180755089 -0.2662394 0.669863110 0.860595368 2.907234e-03
## FCGRT 0.022247198 0.1003269 0.000154019 0.155941050 4.359881e-06
## CD163 -0.211345323 -0.2461299 0.464836081 0.624141614 6.057858e-01
## ind.p.Varambally ind.p.Singh ind.p.Wallace stat pval
## GPR12 0.81888709 0.0354446198 5.153568e-02 25.69660 1.184580e-02
## RPS19 0.14991966 0.0005284280 6.684494e-06 106.36720 3.121382e-17
## GALNT2 0.05198725 0.9191733351 2.091679e-02 22.21530 3.517649e-02
## MSI1 0.26369912 0.0002454798 3.765074e-01 34.02685 6.679961e-04
## FCGRT 0.13151895 0.8052767733 4.739980e-01 51.94301 6.345524e-07
## CD163 0.04866036 0.0040981743 2.567059e-01 23.23721 2.577828e-02
## FDR
## GPR12 1.563039e-02
## RPS19 5.358373e-16
## GALNT2 4.258009e-02
## MSI1 1.116283e-03
## FCGRT 1.846613e-06
## CD163 3.195602e-02
If you were unable to run the MetaDE analyses, you can load the results of the above commands to review the output by usingh
load("./results/resMetaDE.rda")
<- 1e-7
fdr.cut pdf('./results/heatmap.DE.pdf')
heatmap.sig.genes(meta.res, meta.method=meta.method,
fdr.cut=fdr.cut,color="GR")
dev.off()
At this stage, one can do pathway analyses directly on the genes that appear to be differentially regulated based on the meta analyses performed above.
<- meta.res$meta.analysis$pval
meta.p <- PathAnalysis(meta.p = meta.p, enrichment = "KS")
ks.result <- PathAnalysis(meta.p = meta.p, enrichment = "Fisher's exact")
fisher.result
## One can customize pathway datbase to update the results like following
load(file = "./data/pathwayDatabase.rda")
## Let us look into the database
<- pathwayDatabase %>% names() %>% stringr::str_split(pattern = "_", simplify = TRUE) %>% {.[,1]} %>% unique()
pathway_names print(pathway_names)
<- MetaDE::PathAnalysis(meta.p ,pathway = pathwayDatabase, p.cut = 0.05,
path.res enrichment = "Fisher's exact", DEgene.number = 400, size.min = 10, size.max = 500)
The pathway analyses performed here is based on the p-values obtained after the joint meta analyses of all the studies in your data. Some time individual study level pathway analyses is reported and we are interested in combining pathway level summary statistics. We will explore that in our next session where we will perform the analyses using “MetaPath” package.
Perform DE meta analyses using any two methods among “AW-Fisher”,
“FEM,”REM and “minMCC” and compare the number of differentially
expressed genes detected at FDR <0.05. Take a look at the
?MetaDE::MetaD
function to change the parameters.
Perform pathway analyses based one of the methods you tried and see whether any differences at pathway level were detected.
<- "AW"
meta.method <- MetaDE(data=data,clin.data = clin.data,
meta.res data.type=data.type,resp.type = resp.type,
response='label',covariate = NULL,
ind.method=ind.method, meta.method=meta.method,
select.group = select.group, ref.level=ref.level,
paired=paired, rth=NULL,
REM.type=NULL,tail='abs',parametric=TRUE)
## Please make sure the following is correct:
## *You input 6 studies
## *You selected limma limma limma limma limma limma for your 6 studies respectively
## * AW was chosen to combine the 6 studies,respectively
## dataset 1 is done
## dataset 2 is done
## dataset 3 is done
## dataset 4 is done
## dataset 5 is done
## dataset 6 is done
## Parametric method was used instead of permutation
<- "FEM"
meta.method <- MetaDE(data=data,clin.data = clin.data,
meta.res data.type=data.type,resp.type = resp.type,
response='label',
ind.method=ind.method, meta.method=meta.method,
select.group = select.group, ref.level=ref.level,
paired=paired, tail='abs')
## Please make sure the following is correct:
## *You input 6 studies
## * FEM was chosen to combine the 6 studies,respectively
<- "REM"
meta.method <- "HO"
REM.type <- MetaDE(data=data,clin.data = clin.data,
meta.res data.type=data.type,resp.type = resp.type,
response='label',
ind.method=ind.method, meta.method=meta.method,
select.group = select.group, ref.level=ref.level,
paired=paired,
REM.type=REM.type,tail='abs')
## Please make sure the following is correct:
## *You input 6 studies
## * REM was chosen to combine the 6 studies,respectively
<- 'minMCC'
meta.method <- MetaDE(data=data,clin.data = clin.data,
meta.res data.type=data.type,resp.type = resp.type,
response='label',
ind.method=ind.method, meta.method=meta.method,
select.group = select.group, ref.level=ref.level,
paired=paired,tail='abs',parametric=FALSE,nperm=100)
## Please make sure the following is correct:
## *You input 6 studies
## * minMCC was chosen to combine the 6 studies,respectively
Pathway analysis (a.k.a. gene set analysis) is a statistical tool to infer correlation of differential expression evidence in the data with pathway knowledge from established databases. The idea behind pathway analysis is to determine if there is enrichment in the detected DE genes based on an a priori defined biological category. Such a category might come from one or multiple databases such as Gene Ontology (GO; www.geneontology.org), the Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.genome.jp/kegg/), Biocarta Pathways (http://www.biocarta.com/) and the comprehensive Molecular Signatures Database (MSigDB; http://www.broadinstitute.org/gsea/msigdb/). For the majority of recent microarray meta-analysis applications, pathway analysis has been a standard follow-up to identify pathways associated with detected DE genes e.g. and many others]. The result provides more insightful biological interpretation and it has been reported that pathway analysis results are usually more consistent and reproducible across studies than DE gene detection. Shen and Tseng developed a systematic framework of Meta-Analysis for Pathway Enrichment (MAPE) by combining information at gene level, at pathway level and a hybrid of the two. We will use this package for systematic analyses of microarray data from different studies.
This is the first major function in the MetaPath2.0 package which combines the Meta-analysis for Pathway Enrichment (MAPE) methods introduced by Shen and Tseng (2010) and the Comparative Pathway Integrator (CPI) method introduced by Fang and Tseng (2016). The default function is CPI which performs MAPE_P (integrating multiple studies at pathway level) with Adaptively Weighted Fisher’s method as Meta-analysis statistics.
#load("./data/pathways.rda")
= MAPE2.0(arraydata = data , clinical.data = clin.data,
CPI_result label = "label",pmtx = NULL,pathway = c(Biocarta.genesets, GOBP.genesets,
data.type ="discrete",
GOCC.genesets, GOMF.genesets, KEGG.genesets, Reactome.genesets), resp.type = "twoclass",method = "CPI", ind.method = rep("limma",length(data)),
paired =rep(FALSE,length(data)),select.group=select.group, ref.level=ref.level ,
tail="abs", enrichment = "Fisher's exact", DEgene.number = 400,stat = "AW Fisher")
save(CPI_result, file = "./results/MetaPEResults/MAPE_p_pathways.rds")
set.seed(15213)
= MAPE.Kappa(summary = CPI_result$summary,pathway = CPI_result$pathway,
CPI.kappa_result max_k = 10, q_cutoff = 0.05,software = CPI_result$method, output_dir = "./results/MetaPEResults/")
data(hashtb)
#data("./data/hashtb.rda")
#xx <- data(hashtb)
set.seed(1)
<- MAPE.Clustering(summary=CPI_result$summary,Num_Clusters = 6,
MAPE.Clustering.results kappa.result = CPI.kappa_result$kappa,sil_cut=0.01,
Num_of_gene_lists=CPI_result$Num_of_gene_lists,genelist =CPI_result$genelist,
pathway=CPI_result$pathway, enrichment=CPI_result$enrichment,
method=CPI.kappa_result$method,software=CPI_result$method, n.text.permute = 1000, output_dir = "./results/MetaPEResults/" )
You can take a look at the “Clustering_Summary.csv” to gen an idea of reproducibility of pathways and the amount of evidence provided by each study.
Of the heat
Co-expression analysis and network analysis of microarray data are used to investigate potential transcriptional co-regulation and gene interactions. Network analyses typically work with the gene–gene co-expression matrix, which represents the correlation between each pair of genes in the study. A crucial assumption is that the magnitude of the co-expression between any pair of genes is associated with a greater likelihood that the two genes interact. Thus, networks of interactions between genes are inferred from the co-expression matrix.
Here we will use MetaDCN to extract network modules in disease vs control conditions.
#data("./data/pathwayDatabase.rda")
#data(example)
# Generate correction and adjacency matrices for data and permutation
# GeneNet returns a list of information which will be used for SearchBM and MetaDCN function, and several RData files stored in folder path.
# AdjacencyMatrices.RData is a list of adjacency matrices for case and control in each study in the order of case studies and control studies.
#
# CorrelationMatrices.RData is a list of correlation matrices for case and control in each study.
#
# AdjacencyMatricesPermutationP.RData is a list of correlation matrices for case and control in each study in permutation P.
.2 <- GeneNet(data , clin.data, caseName="Cancer", controlName="Control", meanFilter=0.8, SDFilter=0.8, edgeCutoff=0.1, permutationTimes=4, CPUNumbers=4, pathwayDatabase=c(Biocarta.genesets,GOBP.genesets,GOCC.genesets,GOMF.genesets,
GeneNetRessilent=FALSE, folder = "./results/MetaDCNResults/") KEGG.genesets,Reactome.genesets),
# This function will search for basic modules differentially co-expressed between case and control
# SearchBM will return a list and several Rdata, csv and png files saved in the folder path specified in GeneNet inputs.List of basic module information:
# w1
# w1 weight with the most basic modules detected
#
# BMInCase
# data matrix listing the information of basic modules higher correlated in case
#
# BMInControl
# data matrix listing the information of basic modules higher correlated in control
#
# permutation_energy_direction_p.Rdata is a list of energies of basic modules from permutation p.
#
# basic_modules_summary_direction_weight_w.csv is a summary of basic modules detected using weight w in forward/backward search.
#
# threshold_direction.csv is a table listing number of basic modules detected under different FDRs in forward/backward search.
#
# figure_basic_module_c_repeat_r_direction_weight_w.png is a plot of basic module from component c repeat r using weight w in forward/backward search.
<- SearchBM(GeneNetRes.2, MCSteps=500, jaccardCutoff=0.7, repeatTimes=5, outputFigure=TRUE, silent=FALSE ) SearchBMRes
#This function will assemble basic modules detected from SearchBM into supermodules.
# w1:w1 used
#
# BMInCaseSig: Summary of basic modules higher correlated in Case controling FDR
#
# BMInControlSig
# Summary of basic modules higher correlated in Control controling FDR
#
# Supermodule: Summary of supermodules
#A number chosen from (100, 200, ..., 700) to specify the weight1 used in objective function (optional). If not specified, w1 from SearchBM function will be used (recommended).
<- MetaDCN(GeneNetRes.2, SearchBMRes, FDRCutoff=0.05, w1=NULL, silent=FALSE) MetaDNCRes
You get the modules, their pathway annotations and cytoscape files in the results directory.
For the final assembled modules have a look at the “module_assembly_summary_weight_500.csv” and find out ho many “supe modules you are able to identity on q value cut off of 0.05”
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid parallel stats4 stats graphics grDevices datasets
## [8] utils methods base
##
## other attached packages:
## [1] MetaDCN_0.0-1 igraph_1.2.6
## [3] snowfall_1.84-6.1 snow_0.4-3
## [5] MetaPath_2.0 shape_1.4.6
## [7] Rgraphviz_2.36.0 ggplot2_3.3.5
## [9] gplots_3.1.1 cluster_2.1.2
## [11] irr_0.84.1 lpSolve_5.6.15
## [13] ConsensusClusterPlus_1.56.0 impute_1.66.0
## [15] genefilter_1.74.0 GSEABase_1.54.0
## [17] graph_1.70.0 annotate_1.70.0
## [19] XML_3.99-0.6 MetaDE_2.2.3
## [21] MetaQC_1.0 multtest_2.48.0
## [23] GSA_1.03.1 preproc_1.2-4
## [25] DMwR_0.4.1 lattice_0.20-44
## [27] AnnotationDbi_1.54.1 IRanges_2.26.0
## [29] S4Vectors_0.30.0 Biobase_2.52.0
## [31] BiocGenerics_0.38.0 plyr_1.8.6
## [33] magrittr_2.0.1 captioner_2.2.3
## [35] bookdown_0.22 knitr_1.33
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-2 ellipsis_0.3.2
## [3] class_7.3-19 XVector_0.32.0
## [5] GenomicRanges_1.44.0 fs_1.5.0
## [7] bit64_4.0.5 fansi_0.5.0
## [9] splines_4.1.0 cachem_1.0.5
## [11] geneplotter_1.70.0 jsonlite_1.7.2
## [13] png_0.1-7 shiny_1.6.0
## [15] compiler_4.1.0 httr_1.4.2
## [17] Matrix_1.3-4 fastmap_1.1.0
## [19] limma_3.48.1 later_1.2.0
## [21] htmltools_0.5.1.1 tools_4.1.0
## [23] gtable_0.3.0 glue_1.4.2
## [25] GenomeInfoDbData_1.2.6 Rcpp_1.0.7
## [27] jquerylib_0.1.4 vctrs_0.3.8
## [29] Biostrings_2.60.1 xfun_0.24
## [31] stringr_1.4.0 openxlsx_4.2.4
## [33] mime_0.11 lifecycle_1.0.0
## [35] renv_0.13.2 gtools_3.9.2
## [37] edgeR_3.34.0 zlibbioc_1.38.0
## [39] MASS_7.3-54 zoo_1.8-9
## [41] scales_1.1.1 promises_1.2.0.1
## [43] MatrixGenerics_1.4.0 SummarizedExperiment_1.22.0
## [45] RColorBrewer_1.1-2 yaml_2.2.1
## [47] quantmod_0.4.18 curl_4.3.2
## [49] memoise_2.0.0 sass_0.4.0
## [51] rpart_4.1-15 stringi_1.7.3
## [53] RSQLite_2.2.7 highr_0.9
## [55] TTR_0.24.2 caTools_1.18.2
## [57] zip_2.2.0 BiocParallel_1.26.1
## [59] GenomeInfoDb_1.28.1 rlang_0.4.11
## [61] pkgconfig_2.0.3 matrixStats_0.59.0
## [63] bitops_1.0-7 evaluate_0.14
## [65] ROCR_1.0-11 bit_4.0.4
## [67] samr_3.0 DESeq2_1.32.0
## [69] R6_2.5.0 combinat_0.0-8
## [71] DelayedArray_0.18.0 DBI_1.1.1
## [73] withr_2.4.2 pillar_1.6.1
## [75] xts_0.12.1 survival_3.2-11
## [77] KEGGREST_1.32.0 abind_1.4-5
## [79] RCurl_1.98-1.3 tibble_3.1.2
## [81] crayon_1.4.1 shinyFiles_0.9.0
## [83] KernSmooth_2.23-20 utf8_1.2.1
## [85] rmarkdown_2.9 locfit_1.5-9.4
## [87] blob_1.2.1 digest_0.6.27
## [89] xtable_1.8-4 httpuv_1.6.1
## [91] munsell_0.5.0 bslib_0.2.5.1
End of document