Corticogenesis is the dynamic process that results in the formation of the cerebral cortex, and is characterized by the generation of excitatory glutamatergic neurons from cortical progenitors, and the differentiation of astrocytes and oligodendrocytes. Dynamic changes in the activity of cis-regulatory DNA elements underlie the complex phenotypic transformations that occur during development.
Here we will analyze human fetal brain cortex data from Trevino et al. 2021 (source) (OA preprint) to study the interplay between chromatin accessibility and gene expression in early corticogenesis.
The main goal will be to identify non-coding genomic regions where chromatin accessibility is associated with expression of genes involved in excitatory neuron development.
With AWS:
TBA
clone the course code repository locally
git clone https://github.com/NBISweden/single-cell_sib_scilifelab_2021.git
Without AWS: if you prefer you can set up your own working environment locally
Download the preprocessed data from GDrive
clone the course code repository locally
git clone https://github.com/NBISweden/single-cell_sib_scilifelab_2021.git
Create the conda environment sc2021-multiomics (if needed, install miniconda first):
cd single-cell_sib_scilifelab_2021/project_omics
conda env create --file multiomics-environment.yml
Activate the environment:
conda activate sc2021-multiomics
You may have to repeat the activation after a new login or after deactivating the environment with conda deactivate.
Launch the notebook (you should see (sc2021-multiomics) at the beginning of the command line prompt):
jupyter notebook ./multiomics_unmatched.ipynb
We will have a warm-up session to start exploring our datasets before the project starts.
You can start familiarizing yourself with some of the tools we will be using, trying out the examples in vignettes:
You will find the preprocessed datasets in the /data/multiomics/ directory on the AWS server (or alternatively, on Google Drive):
gr1_unmatched_diagonal contains data from unmatched scRNA-seq (19373 cells x 33197 genes) and scATAC-seq (6423 cells x 657930 peaks) assays.
gr2_matched_vertical contains data from matched scRNA-seq (8981 cells x 34104 genes) and scATAC-seq (8981 cells x 467315 peaks) assays.
You will find the same datasets saved both in anndata format for use in Python (*.h5ad) and in SingleCellExperiment format for use in R (*.RDS).
For each scATAC dataset we also provide precomputed "gene activities", which count ATAC fragments over gene bodies and promoters, as implemented by the Signac function GeneActivity (the authors of this dataset did not share raw data because of patient privacy).
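To make the idea of a gene activity score concrete, here is a minimal pure-Python sketch of counting fragments that overlap a gene body plus an upstream promoter window. It is not the Signac implementation: the toy gene coordinates, fragments, cell names, and the 2 kb upstream extension are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy annotation: gene -> (chrom, start, end, strand)
GENES = {
    "GENE_A": ("chr1", 10_000, 12_000, "+"),
    "GENE_B": ("chr1", 20_000, 21_000, "-"),
}

# Hypothetical toy fragments: (chrom, start, end, cell_barcode)
FRAGMENTS = [
    ("chr1", 8_500, 8_700, "cell1"),    # upstream of GENE_A (promoter window)
    ("chr1", 11_000, 11_200, "cell1"),  # inside the GENE_A body
    ("chr1", 22_500, 22_600, "cell2"),  # upstream of GENE_B (minus strand)
    ("chr1", 15_000, 15_100, "cell2"),  # intergenic, not counted
]


def gene_window(start, end, strand, upstream=2_000):
    """Extend the gene body by `upstream` bp on the promoter side (assumed size)."""
    if strand == "+":
        return max(0, start - upstream), end
    return start, end + upstream


def gene_activity(fragments, genes, upstream=2_000):
    """Count fragments per (cell, gene) that overlap gene body + promoter."""
    counts = Counter()
    for chrom, f_start, f_end, cell in fragments:
        for gene, (g_chrom, g_start, g_end, strand) in genes.items():
            w_start, w_end = gene_window(g_start, g_end, strand, upstream)
            # Half-open interval overlap test on the same chromosome
            if chrom == g_chrom and f_start < w_end and f_end > w_start:
                counts[(cell, gene)] += 1
    return counts


activity = gene_activity(FRAGMENTS, GENES)
for (cell, gene), n in sorted(activity.items()):
    print(cell, gene, n)
```

The resulting cell-by-gene counts share their feature space with the scRNA-seq matrix, which is what makes gene activities useful as a bridge between the two modalities.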
In the template notebooks we demonstrate how to preprocess the single-modality objects and merge them into MuData objects from the Python package muon.
Your main goal will be to identify non-coding genomic regions where chromatin accessibility is associated with expression of genes involved in excitatory neuron development.
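One simple way to quantify such an association, when accessibility and expression are measured in the same cells, is to correlate each peak's accessibility with a gene's expression across cells. The sketch below uses NumPy on simulated data; the function name, the simulated matrices, and the choice of Pearson correlation are assumptions for illustration, not necessarily the method used in the template notebooks.

```python
import numpy as np


def peak_gene_correlation(peak_acc, gene_expr):
    """Pearson correlation of each peak (column of peak_acc) with one gene.

    peak_acc:  array of shape (n_cells, n_peaks)
    gene_expr: array of shape (n_cells,)
    Peaks with zero variance get a correlation of 0.
    """
    peak_centered = peak_acc - peak_acc.mean(axis=0)
    gene_centered = gene_expr - gene_expr.mean()
    cov = peak_centered.T @ gene_centered
    denom = np.linalg.norm(peak_centered, axis=0) * np.linalg.norm(gene_centered)
    return np.divide(cov, denom, out=np.zeros_like(cov, dtype=float), where=denom > 0)


# Simulated example: 200 cells, 3 peaks (all values are made up)
rng = np.random.default_rng(0)
n_cells = 200
gene_expr = rng.normal(size=n_cells)
peak_acc = np.column_stack([
    gene_expr + 0.1 * rng.normal(size=n_cells),  # peak linked to the gene
    rng.normal(size=n_cells),                    # unrelated peak
    np.zeros(n_cells),                           # peak never accessible
])

corr = peak_gene_correlation(peak_acc, gene_expr)
print(np.round(corr, 2))
```

In practice one would typically restrict the candidate peaks to those within some genomic distance of the gene and correct for multiple testing; both choices are left out of this sketch.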
In the project folder, you will find template Jupyter notebooks guiding you through the steps for the integration project:
multiomics_unmatched.ipynb
multiomics_matched.ipynb
Because we will need to use tools in both R and Python, we provide an additional notebook illustrating how to use R code in a Jupyter environment using the rpy2 framework (rpy2_interoperability_examples.ipynb).