SNF lab

Task (45 minutes)

Important

Data: MOFA's CLL dataset

  1. Load and prepare data, then compute the affinity matrices and perform SNF.
  2. Assume that there are two cancer subtypes and cluster them :)
  3. Plot the fused network and the clusters using networkx, Gephi, igraph, or whatever else you prefer.

Optional:

Code:

After loading the files and applying the required transformations to the data (check the NMF lab), I have each dataset in its own matrix: X1 (mRNA) and X2 (methylation). The next step is to compute the affinity matrices.
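As a minimal sketch of what an affinity matrix computation can look like, the block below builds a Gaussian-style kernel whose bandwidth adapts to each sample's local neighbourhood. The function name `affinity_matrix`, the parameters `K` and `mu`, the samples-in-rows layout, and the random toy matrices standing in for X1 and X2 are all assumptions made for illustration; the exact kernel used in the lab may differ.

```python
import numpy as np

def affinity_matrix(X, K=20, mu=0.5):
    """Locally scaled Gaussian similarity between the rows (samples) of X."""
    # Squared Euclidean distances between all pairs of samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, 0.0)
    dist = np.sqrt(d2)
    # Mean distance from each sample to its K nearest neighbours (itself excluded).
    K = min(K, X.shape[0] - 1)
    knn_mean = np.sort(dist, axis=1)[:, 1:K + 1].mean(axis=1)
    # Local scale for a pair: average of both neighbourhood means and the pair distance.
    eps = (knn_mean[:, None] + knn_mean[None, :] + dist) / 3.0
    eps = np.maximum(eps, np.finfo(float).eps)
    # Gaussian kernel with bandwidth mu * eps (assumed form, see the note above).
    return np.exp(-d2 / (2.0 * (mu * eps) ** 2))

# Toy stand-ins for the real matrices (samples x features); not the CLL data.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(30, 50))
X2 = rng.normal(size=(30, 40))
W1 = affinity_matrix(X1, K=20)
W2 = affinity_matrix(X2, K=20)
```

Each view gets its own matrix, and both share the same sample order, which is what allows them to be fused in the next step.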

Next, I apply the SNF method with a k parameter of 20 (check the course slides for what this means). Then I extract the estimated number of clusters using spectral clustering.
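To make the fusion step more concrete, here is a simplified two-view sketch of the cross-diffusion idea behind SNF, followed by an eigengap heuristic for guessing the number of clusters. It is not the implementation used in the lab: the helper names (`full_kernel`, `local_kernel`, `fuse`, `estimate_n_clusters`), the choice to exclude each sample from its own neighbour list, the iteration count, and the synthetic block-structured affinities are all assumptions for illustration.

```python
import numpy as np

def full_kernel(W):
    """Row-normalise W with the diagonal fixed at 1/2, then symmetrise."""
    A = np.array(W, dtype=float)
    np.fill_diagonal(A, 0.0)
    P = A / (2.0 * A.sum(axis=1, keepdims=True))
    np.fill_diagonal(P, 0.5)
    return (P + P.T) / 2.0

def local_kernel(W, k):
    """Keep each sample's k strongest neighbours and row-normalise."""
    A = np.array(W, dtype=float)
    np.fill_diagonal(A, 0.0)
    S = np.zeros_like(A)
    idx = np.argsort(-A, axis=1)[:, :k]
    rows = np.arange(A.shape[0])[:, None]
    S[rows, idx] = A[rows, idx]
    return S / S.sum(axis=1, keepdims=True)

def fuse(W1, W2, k=20, iterations=20):
    """Simplified SNF for two views: each view diffuses the other's status."""
    k = min(k, W1.shape[0] - 1)
    P1, P2 = full_kernel(W1), full_kernel(W2)
    S1, S2 = local_kernel(W1, k), local_kernel(W2, k)
    for _ in range(iterations):
        # Simultaneous update: both right-hand sides use the previous P1 and P2.
        P1, P2 = S1 @ P2 @ S1.T, S2 @ P1 @ S2.T
        P1, P2 = full_kernel(P1), full_kernel(P2)
    return (P1 + P2) / 2.0

def estimate_n_clusters(W, max_clusters=5):
    """Eigengap heuristic on the normalised Laplacian (at least 2 clusters)."""
    W = (W + W.T) / 2.0
    d = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(W.shape[0]) - d[:, None] * W * d[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(L))[:max_clusters + 1]
    gaps = np.diff(eigvals)
    # gaps[i] separates i + 1 clusters from i + 2; skip gaps[0] to force >= 2.
    return int(np.argmax(gaps[1:]) + 2)

# Synthetic affinities with two groups of 20 samples (not the CLL data).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
same = labels[:, None] == labels[None, :]

def toy_affinity(noise):
    W = np.where(same, 0.8, 0.2) + noise * rng.random((40, 40))
    return (W + W.T) / 2.0

W1, W2 = toy_affinity(0.2), toy_affinity(0.4)
fused = fuse(W1, W2, k=10)
n_clusters = estimate_n_clusters(fused)
```

The fused matrix can then be passed to any spectral clustering routine that accepts a precomputed affinity, with the number of clusters either fixed at two as the task suggests or taken from the estimate.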

You can now compare the clusters estimated from the fused matrix with the clustering done on the initial similarity matrices, and compute other indicators such as NMI and silhouette scores.
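For reference, NMI between two label assignments can be computed from their joint counts alone. The sketch below uses the arithmetic mean of the two entropies as the normaliser, which is one of several conventions; the function names and the example label vectors `fused_labels` and `mrna_labels` are hypothetical.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (n_ab / n) * log(n_ab * n / (count_a[x] * count_b[y]))
        for (x, y), n_ab in joint.items()
    )

def nmi(a, b):
    """NMI with arithmetic-mean normalisation: 2 * I(a; b) / (H(a) + H(b))."""
    if len(a) != len(b):
        raise ValueError("label vectors must have the same length")
    h_sum = entropy(a) + entropy(b)
    if h_sum == 0.0:
        # Both assignments put every sample in a single cluster.
        return 1.0
    return 2.0 * mutual_information(a, b) / h_sum

# Hypothetical cluster labels for eight samples.
fused_labels = [0, 0, 0, 0, 1, 1, 1, 1]
mrna_labels = [1, 1, 1, 0, 0, 0, 0, 0]
score = nmi(fused_labels, mrna_labels)
```

NMI ignores how clusters are numbered, so two clusterings that agree up to a relabelling score 1, while statistically independent assignments score 0.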

Downstream analysis

Since we are dealing with network models, we cannot rely on weights and scores here; instead, we need to compute network centralities to find the samples driving the signal in one of the clusters.
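One way to sketch this is to restrict the fused network to the samples of a single cluster and rank them by a centrality measure. The example below computes weighted degree (strength) and eigenvector centrality by power iteration; the function names, the tiny hand-made `fused` matrix, and the `labels` vector are illustrative assumptions, and other centralities (betweenness, closeness) may suit the lab better.

```python
import numpy as np

def strength(W):
    """Weighted degree: sum of edge weights per node, ignoring self-loops."""
    A = np.array(W, dtype=float)
    np.fill_diagonal(A, 0.0)
    return A.sum(axis=1)

def eigenvector_centrality(W, iterations=500, tol=1e-12):
    """Leading eigenvector of the weighted adjacency matrix via power iteration."""
    A = np.array(W, dtype=float)
    np.fill_diagonal(A, 0.0)
    x = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(iterations):
        x_new = A @ x
        x_new /= np.linalg.norm(x_new)
        converged = np.abs(x_new - x).max() < tol
        x = x_new
        if converged:
            break
    return x

# Hand-made fused network for five samples; sample 0 is a hub by construction.
fused = np.array([
    [0.0, 0.9, 0.8, 0.7, 0.1],
    [0.9, 0.0, 0.2, 0.1, 0.1],
    [0.8, 0.2, 0.0, 0.1, 0.1],
    [0.7, 0.1, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.0],
])
labels = np.array([0, 0, 0, 0, 1])  # hypothetical cluster assignment

# Restrict the network to cluster 0 and rank its members.
members = np.flatnonzero(labels == 0)
sub = fused[np.ix_(members, members)]
central = eigenvector_centrality(sub)
ranking = members[np.argsort(-central)]  # most central sample first
top_by_strength = int(members[np.argmax(strength(sub))])
```

Samples at the top of `ranking` are the ones most strongly connected to other well-connected members of the cluster, which makes them natural candidates for the samples driving the signal.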

Further task: