Single-cell Visualizations

Workshop on Advanced Data Visualization

Author

Lokesh Mano

Published

05-May-2025

The visualization tutorial here is inspired by the NBIS workshop on Single-cell RNA-seq Analysis, and the visualizations shown here are part of that single-cell workshop as well.

1 Get data

In this tutorial, we will use a set of 8 PBMC 10x datasets from 4 COVID-19 patients and 4 healthy controls. The samples have been subsampled to 1500 cells per sample. We can start by defining our paths.

path_covid <- "./data/covid/"
if (!dir.exists(path_covid)) dir.create(path_covid, recursive = TRUE)

download.file("https://lu.box.com/shared/static/205zwfegtinm49yuelbvxawln6r3wyjd", destfile = file.path(path_covid, "seurat_covid_raw.rds"))
download.file("https://lu.box.com/shared/static/nf5eps2uus9qcjx0ml0n313p9a8r1vb7", destfile = file.path(path_covid, "seurat_covid_qc_dr_clst.rds"))

# If downloading from the R console fails, you can download the files directly using the following links:
# seurat_covid_raw.rds: https://lu.box.com/s/205zwfegtinm49yuelbvxawln6r3wyjd
# seurat_covid_qc_dr_clst.rds: https://lu.box.com/s/nf5eps2uus9qcjx0ml0n313p9a8r1vb7
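Re-running the download lines above fetches both files again every time. As a minimal sketch of one way to avoid that, the helper below only downloads a file when it is not already on disk. The name download_if_missing is hypothetical and not part of the workshop material; mode = "wb" is set on the assumption that the binary .rds files may be downloaded on Windows, where the default text mode can corrupt them.

```r
# Hypothetical helper (not part of the workshop material):
# download a file only if it is not already present at `destfile`.
download_if_missing <- function(url, destfile) {
    if (file.exists(destfile)) {
        message("Already present, skipping: ", destfile)
        return(invisible(FALSE))
    }
    # mode = "wb" writes in binary mode, which .rds files need on Windows
    download.file(url, destfile = destfile, mode = "wb")
    invisible(TRUE)
}
```

It could then be called once per file, for example download_if_missing("https://lu.box.com/shared/static/205zwfegtinm49yuelbvxawln6r3wyjd", file.path(path_covid, "seurat_covid_raw.rds")).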

With the data in place, we can now load the libraries used in this tutorial.

suppressPackageStartupMessages({
    library(Seurat)
    library(Matrix)
    library(ggplot2)
    library(patchwork)
})
# load the raw Seurat object; it is referred to as covid_raw in the rest of the tutorial
covid_raw <- readRDS("./data/covid/seurat_covid_raw.rds")

All eight datasets mentioned above have already been merged into a single Seurat object. The two files you have downloaded are explained below:

  • seurat_covid_raw.rds: Contains all the data in raw format. No pre-processing has been done on this data.
  • seurat_covid_qc_dr_clst.rds: In this file, the cells and the genes have gone through QC filtering, followed by dimensionality reduction and clustering.

2 Raw data

Let us take a look at the raw data to start with. Here is how to view the count matrix and the metadata for every cell.

# RNA counts matrix: first 10 genes (rows) and first 4 cells (columns)
covid_raw[["RNA"]]$counts[1:10, 1:4]

# metadata of the first 10 cells
head(covid_raw@meta.data, 10)

When you look at the metadata, you can see that the row names are the IDs of the cells. Each column is explained below.

  • orig.ident: the patient ID
  • nCount_RNA: the total number of molecules detected within a cell
  • nFeature_RNA: the number of genes detected in each cell
  • type: whether the sample is covid or control
  • percent_mito: mitochondrial gene content in percent
  • percent_ribo: ribosomal gene content in percent
  • percent_hb: percentage of counts from hemoglobin genes
  • percent_plat: percentage of counts from some platelet markers

These values are pre-calculated to assist with the visualization.
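To make the column definitions above concrete, here is a minimal sketch of how such per-cell metrics can be derived from a counts matrix. The toy matrix, its gene names and its values are made up for illustration, and the "^MT-" pattern assumes human gene symbols. This is not necessarily how the pre-calculated columns in the workshop object were produced; in Seurat, PercentageFeatureSet() computes this kind of percentage.

```r
# Toy counts matrix (genes x cells); names and values are made up
counts <- matrix(
    c(5, 0, 2,
      0, 3, 1,
      4, 1, 0,
      1, 0, 7),
    nrow = 4, byrow = TRUE,
    dimnames = list(
        c("MT-CO1", "RPS3", "HBB", "CD3E"),
        c("cell1", "cell2", "cell3")
    )
)

# total number of molecules per cell
nCount_RNA <- colSums(counts)

# number of genes with at least one count per cell
nFeature_RNA <- colSums(counts > 0)

# share of counts from mitochondrial genes, in percent
# (assumption: mitochondrial gene symbols start with "MT-")
mito_genes <- grepl("^MT-", rownames(counts))
percent_mito <- colSums(counts[mito_genes, , drop = FALSE]) / nCount_RNA * 100

data.frame(nCount_RNA, nFeature_RNA, percent_mito)
```

The other percent_* columns follow the same idea with a different set of genes in the numerator.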

2.1 Plot QC

To plot the raw QC metrics of the dataset, we will use the function VlnPlot(). This is a function from the Seurat package that generates a ggplot2-based plot from the Seurat object. A violin plot is an alternative to a boxplot that also shows the shape of the distribution.

feats <- c("nFeature_RNA", "nCount_RNA", "percent_mito", "percent_ribo", "percent_hb", "percent_plat")
VlnPlot(covid_raw, group.by = "orig.ident", split.by = "type", features = feats, pt.size = 0.1, ncol = 3)
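To illustrate how a violin plot relates to a boxplot, the sketch below draws both on top of each other with plain ggplot2. The data are simulated and the sample names are made up; the second sample is deliberately bimodal, a pattern that the violin makes visible while the box alone hides it.

```r
library(ggplot2)

set.seed(1)
# Simulated values (not from the workshop data): one unimodal, one bimodal sample
toy <- data.frame(
    sample = rep(c("ctrl_1", "covid_1"), each = 200),
    nFeature_RNA = c(
        rnorm(200, mean = 1000, sd = 150),
        rnorm(100, mean = 600, sd = 80),
        rnorm(100, mean = 1400, sd = 80)
    )
)

p <- ggplot(toy, aes(x = sample, y = nFeature_RNA, fill = sample)) +
    geom_violin(trim = FALSE) +                                       # density shape
    geom_boxplot(width = 0.1, fill = "white", outlier.shape = NA) +  # median and quartiles
    theme_classic()
p
```

The narrow white box inside each violin carries the same summary a boxplot would, so the two views can be compared directly.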