Preparation for the tutorial


This workshop comprises both lectures and hands-on exercises. While you will be able to follow all exercises from the HTML files, we recommend that you prepare by (1) familiarizing yourself with basic R and Python, (2) installing conda, and (3) going through the following 3 notebooks. If you want to run them on your own system, please see the labs page for setup instructions.

If you are interested, you can also go through the additional reading materials.

Programming with R and Python


The course will be taught using both R and Python, depending on the tools available. While you will be able to follow all lectures and exercises conceptually, it is helpful if you are familiar with the basic usage of both programming languages.

You should also be familiar with basic command line input (mkdir, cd, ls, cp, mv).
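As a quick self-check, the commands listed above can be tried in a throwaway directory. This is only an illustrative sketch; the folder and file names (practice, notes.txt, renamed.txt) are made up for the example and are not part of the course material.

```shell
# Work in a temporary directory so nothing else on your system is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir practice                 # create a folder
cd practice                    # move into it
echo "hello" > notes.txt       # create a small text file
cp notes.txt copy.txt          # copy the file
mv copy.txt renamed.txt        # rename (move) the copy
ls                             # list the folder contents
```

If `ls` shows notes.txt and renamed.txt, you have used all five commands.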

Conda Instructions


During this workshop, you will use conda environments to run the exercises. Conda environments give all users the same computing environment, i.e. the same package versions. This makes the material reproducible without re-installing or changing the package versions already on your system. See the graphical example below:

A conda environment is a self-contained directory that holds a specific set of packages, which you can use to reproduce all your results.

Briefly, you need to:

1. Install Conda and Mamba
2. Set up conda
3. Install git and clone the repository
4. Create and activate the environment
5. Launch RStudio or Jupyter
6. Deactivate the environment after running your analyses
7. Environment list

You can read more about Conda environments and other important concepts to help you make your research reproducible.


1. Download and install Conda and Mamba

Start by installing Conda. We suggest installing Miniconda3 and NOT Anaconda. After installing Conda, install Mamba as described below for your operating system.

On Mac OS X

First, make sure you have Xcode and the Command Line Tools installed and updated to the latest version (in the App Store). If you have not already installed the Command Line Tools, open a terminal window and run:

  xcode-select --install

Next, download the latest version of Miniconda3 and run the installer:

  curl -o Miniconda3-latest-MacOSX-x86_64.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
  sh Miniconda3-latest-MacOSX-x86_64.sh

Follow the instructions on screen, scrolling down, pressing ENTER and replying yes when necessary. Install it in the default directory. Restart your terminal window to apply the modifications. After restarting, type the commands below to install Mamba:

  conda init
  conda install -n base -c conda-forge mamba

On Ubuntu

First, download the latest version of Miniconda3 and run the installer:

  wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
  sh Miniconda3-latest-Linux-x86_64.sh

Follow the instructions on screen, replying yes when necessary. Restart your terminal window to apply the modifications. After restarting, type the commands below to install Mamba:

  conda init
  conda install -n base -c conda-forge mamba

On Windows 10

Unfortunately, not all packages available on conda are compatible with Windows machines. The good news is that Windows 10 offers native Linux support via the Windows Subsystem for Linux (WSL2). This allows you to run Linux/bash commands from within Windows without a virtual machine or a dual-boot setup (i.e. having 2 operating systems). However, WSL does not offer complete support for graphical interfaces (such as RStudio in our case), so additional steps are needed to make that work.

  1. On Windows 10, install WSL if you don’t have it. Follow the instructions here: https://docs.microsoft.com/en-us/windows/wsl/install-win10

  2. Once you have that installed, download and install MobaXterm (an enhanced terminal with graphical capabilities): https://mobaxterm.mobatek.net
    It is recommended that you INSTALL the program and not use the portable version.

  3. Inside MobaXterm, you will probably see that your WSL is already listed on the left panel as an available connection. Just double-click it and you will be accessing it via MobaXterm. If you don’t see it there, close MobaXterm and go to the WSL terminal, because WSL is probably not allowing SSH connections. You can follow this link for instructions on how to enable them. You need to complete the steps up to “Start or restart the SSH service”; the further steps are optional, but might be useful.

  4. Inside MobaXterm, download Conda into your Downloads folder with the command:

  wget -P ~/Downloads https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

  5. Inside MobaXterm, type the commands below to install Conda. Follow the on-screen installation instructions.

  cd ~/Downloads
  sh Miniconda3-latest-Linux-x86_64.sh

  6. Follow the instructions on screen, replying yes when necessary. Restart your terminal window to apply the modifications. After restarting, type the commands below to install Mamba:

  conda init
  conda install -n base -c conda-forge mamba

  7. Inside MobaXterm, type the commands below to install the X-server graphical packages that will be used to launch RStudio (see https://docs.anaconda.com/anaconda/install/linux/):

  sudo apt-get update
  sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6

  8. Close and reopen all applications. Inside MobaXterm, you will probably see that your WSL is already listed on the left panel as an available connection. Just double-click it and you will be accessing it via MobaXterm.
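If you are unsure whether a MobaXterm session is really running inside WSL, a small check can help. The sketch below assumes that WSL kernels mention "microsoft" in /proc/version, which is the usual case but not guaranteed; the helper name is_wsl is hypothetical.

```shell
# Return success if the given kernel version file mentions Microsoft,
# which WSL kernels typically do (assumption, see the text above).
# The file argument defaults to /proc/version.
is_wsl() {
    grep -qi microsoft "${1:-/proc/version}" 2>/dev/null
}

if is_wsl; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL"
fi
```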

On VirtualBox

If the installations are not working as they should on your computer, you can create a virtual machine running UBUNTU and install everything there. Please keep this alternative as a temporary last resort; we recommend first troubleshooting the installation with the methods above.

  1. Download and install VIRTUALBOX on your machine: https://www.virtualbox.org

  2. Download the ISO disk image of UBUNTU: https://ubuntu.com/download/desktop

  3. On VIRTUALBOX, click on Settings (yellow gear icon) > General > Advanced and make sure that both settings Shared Clipboard and Drag’n’Drop are set to Bidirectional.

  4. Completely close VIRTUALBOX and start it again to apply changes.

  5. On VIRTUALBOX, create a machine called Ubuntu and add the image above
    • set the memory to the maximum allowed in the GREEN bar
    • set the hard disk to be dynamically allocated
    • all other settings can be left at their defaults

  6. Proceed with the Ubuntu installation as recommended. You can choose “Minimal Installation” and disable downloading updates during installation.

  7. Inside UBUNTU, open the TERMINAL and type the commands below to install the X-server graphical packages that will be used to launch RStudio (see https://docs.anaconda.com/anaconda/install/linux/):

  sudo apt-get update
  sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6

  8. Inside UBUNTU, download Conda into your Downloads folder:

  wget -P ~/Downloads https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

  9. Inside UBUNTU, open the TERMINAL and type the commands below. Follow the on-screen installation instructions.

  cd ~/Downloads
  sh Miniconda3-latest-Linux-x86_64.sh

  10. Close and reopen the TERMINAL to apply the CONDA updates, then install Mamba as described in the Ubuntu section above.
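Whichever installation route you used, it can be worth confirming that the tools are visible from a fresh terminal before moving on. A minimal sketch, using a hypothetical helper named check_tool that only tests whether a command is on your PATH (it does not verify versions):

```shell
# Report whether a command is available on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found"
    fi
}

check_tool conda
check_tool mamba
```

If either tool is reported as not found, restart the terminal and repeat the installation steps for your operating system.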


2. Set up conda

Set the channel priority:

conda config --set channel_priority false

and remove unused packages and caches:

conda clean -a -y

3. Install git and clone the repository

Throughout the course we’ll be using scripts and environments found in our GitHub repository. After installing mamba, install git and clone the repository:

mamba install -c anaconda git #install
mkdir ~/Desktop/course #create the folder
cd ~/Desktop/course #change directory
git clone https://github.com/NBISweden/workshop_omicsint_ISMBECCB.git . #clone the repository

All environment files are contained in the folder /environments/ of the repository.
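To confirm that the clone worked, you can list the environment files from inside the course folder. The sketch below uses a hypothetical helper, list_env_files, and assumes the files end in .yaml, as in the commands of the next step.

```shell
# List the environment definition files (*.yaml) found in a folder.
# The folder argument defaults to "environments" in the current directory.
list_env_files() {
    dir="${1:-environments}"
    for f in "$dir"/*.yaml; do
        # Skip the unexpanded pattern when the folder has no .yaml files.
        [ -e "$f" ] && basename "$f"
    done
    return 0
}

list_env_files environments
```

An empty result suggests that you are not in the cloned folder or that the clone did not complete.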

4. Create and activate the environments

You will need to create and use different conda environments. Note that the environment files differ between MacOS and Linux:

MacOS

#data pre-processing
mamba env create -f environments/env-preprocessing.yaml -n ismb_prep

#supervised integration and feature selection
mamba env create -f environments/env-ml.yaml -n ismb_si_fs   

#dimensionality reduction, unsupervised integration, network analysis
mamba env create -f environments/env-ml_nets.yaml -n ismb_dr_ui_na 

##################################################################
# The network meta-analysis environment is created from within R #
##################################################################


Linux

#data pre-processing
mamba env create -f environments/env-preprocessing_linux.yaml -n ismb_prep

#supervised integration and feature selection
mamba env create -f environments/env-ml_linux.yaml -n ismb_si_fs   

#dimensionality reduction, unsupervised integration, network analysis
mamba env create -f environments/env-ml_nets_linux.yaml -n ismb_dr_ui_na 

##################################################################
# The network meta-analysis environment is created from within R #
##################################################################
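The only difference between the two sets of commands above is the _linux suffix in the file name. As a sketch, the right file could be picked automatically from the output of uname -s; the helper name env_file_for is hypothetical, and the example only prints the command rather than running it.

```shell
# Build the path of an environment file for a given name and operating system.
# The OS argument defaults to `uname -s` ("Darwin" on MacOS, "Linux" on Linux).
env_file_for() {
    name="$1"
    os="${2:-$(uname -s)}"
    case "$os" in
        Linux)  echo "environments/env-${name}_linux.yaml" ;;
        Darwin) echo "environments/env-${name}.yaml" ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Print the command for the pre-processing environment without running it.
echo "mamba env create -f $(env_file_for preprocessing) -n ismb_prep"
```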

To activate an environment, use conda activate [environment name]. For instance:

conda activate ismb_si_fs

5. Launch RStudio or Jupyter

Depending on the exercise, you’ll have to run scripts in either RStudio or Jupyter. You can launch these with

rstudio &

or

jupyter-notebook &
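Which of the two you need follows from the notebook type: we assume here that .Rmd files are opened in RStudio and .ipynb files in Jupyter. A small sketch of that rule, with a hypothetical helper named launcher_for:

```shell
# Suggest a launcher for a notebook based on its file extension.
launcher_for() {
    case "$1" in
        *.Rmd)   echo "rstudio" ;;
        *.ipynb) echo "jupyter-notebook" ;;
        *)       echo "unknown"; return 1 ;;
    esac
}

launcher_for session_topology/lab.ipynb
```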

6. Deactivate the environment after running your analyses

After you’ve run all your analyses, you can deactivate the environment by typing:

conda deactivate
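If you lose track of which environment is active, the CONDA_DEFAULT_ENV variable can be inspected; conda sets it on activation and clears it once no environment is active. A minimal sketch, with a hypothetical helper named active_env:

```shell
# Print the name of the active conda environment,
# or "none" if CONDA_DEFAULT_ENV is unset or empty.
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

active_env
```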

7. Environment list

If you cannot set up and run these notebooks, each directory also contains the respective HTML files to assist you.
Please refer to the following list of notebooks and environments:

Topic | Notebook type | Path to notebook | HTML file | Environment name
Data pre-processing | jupyter | /session_preparation/data_preparation/preprocessing.ipynb | html | ismb_prep
Dimensionality reduction | Rmd | /session_preparation/dimreduction/OmicsIntegration_DimensionReduction.Rmd | html | ismb_dr_ui_na
Feature selection | Rmd | /session_preparation/feature_selection/OmicsIntegration_FeatureSelection.Rmd | html | ismb_si_fs
Supervised Integration | Rmd | /session_ml/SupervisedOMICsIntegration/supervised_omics_integr_CLL.Rmd | html | ismb_si_fs
Unsupervised Integration | Rmd | /session_ml/UnsupervisedOMICsIntegration/UnsupervisedOMICsIntegration.Rmd | html | ismb_dr_ui_na
Network analysis | jupyter | /session_topology/lab.ipynb | html | ismb_dr_ui_na
Network meta analysis | Rmd | /session_meta/lab_meta-analayses-v2.Rmd | html | created from within R