conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
Compute environment
UPPMAX
To run exercises on UPPMAX you need an account. You can apply for an account here.
We will primarily be using Uppsala’s high-performance computing (HPC) center UPPMAX to run exercises. Course material will be hosted in a dedicated course project directory, /proj/naiss2023-22-1084.
We recommend you set up a working directory based on your username in /proj/naiss2023-22-1084/users in which to run your exercises:
mkdir -p /proj/naiss2023-22-1084/users/YOURUSERNAME
cd /proj/naiss2023-22-1084/users/YOURUSERNAME
All computations should be run on a compute node. You can request an interactive session with the interactive command. For example, to request an eight-hour job on 4 cores, run:
interactive -A naiss2023-22-1084 -n 4 \
--time 08:00:00 \
--reservation=naiss2023-22-1084_#
where # is a number that corresponds to the day of the week (Monday=1, Tuesday=2, and so on).
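The reservation suffix can be derived from the current date instead of being typed by hand. The following is a minimal sketch, assuming that your system's date command supports the %u format (ISO weekday, Monday=1) and that a reservation exists for the current day; the variable names are illustrative. It only prints the command so that you can inspect it before running it.

```shell
# ISO weekday number: 1 = Monday ... 7 = Sunday
day=$(date +%u)
project=naiss2023-22-1084

# Assumption: the reservation for today follows the <project>_<weekday> pattern
reservation="${project}_${day}"

# Print the command for inspection; copy and run it on a login node
echo "interactive -A ${project} -n 4 --time 08:00:00 --reservation=${reservation}"
```

On a Tuesday, for example, the printed reservation name ends in _2.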
We have privileged access to a limited number of nodes. Please do not book more than 4 cores, or your fellow students will experience long waiting times.
Tutorials
UPPMAX hosts tutorials and user guides at https://www.uppmax.uu.se/support/user-guides/. In particular, https://www.uppmax.uu.se/support/user-guides/guide--first-login-to-uppmax/ has information on how to connect to and work on UPPMAX.
Jupyter Notebooks
Jupyter Notebook exercises will be run in local compute environments on your laptop. See the section below on setting up a pgip conda environment, which by default installs jupyter and its dependencies.
JupyterLite
Some Jupyter Notebook exercises are hosted online and run using JupyterLite, a JupyterLab distribution that runs entirely in the browser. Apart from having a browser, no preparations are necessary. Note that some users have reported issues with Firefox; Google Chrome may work better.
Conda
Exercises that require local software installation will make use of the conda package manager to install necessary requirements from the package repositories bioconda and conda-forge. This is also the fallback solution in case there are issues with the HPC.
1. Install conda
To start using conda, follow the quick command-line install instructions to install Miniconda, the minimal conda installer.
2. Configure conda
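Configure conda to access the package repositories (see also bioconda usage). This will modify your ~/.condarc file:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict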
Please note that the order of these commands is important! When conda config --add is run, it adds the channel to the top of the list in your configuration, so your ~/.condarc will end up looking like this:
cat ~/.condarc
channels:
- conda-forge
- bioconda
- defaults
channel_priority: strict
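To check that the channel order came out as intended, you can read back the first entry of the channel list. The sketch below assumes the simple file layout shown above (a channels: key followed by "- name" items); first_channel and CONDARC_FILE are hypothetical names introduced here for illustration, not part of conda.

```shell
# Print the highest-priority (first) channel listed in a condarc-style file.
first_channel() {
    awk '/^channels:/ {in_list = 1; next}
         in_list && /^[[:space:]]*-/ {sub(/^[[:space:]]*-[[:space:]]*/, ""); print; exit}
         in_list && /^[^[:space:]-]/ {exit}' "$1"
}

# CONDARC_FILE is a hypothetical override; by default the file in $HOME is read
condarc="${CONDARC_FILE:-$HOME/.condarc}"
if [ -f "$condarc" ]; then
    top=$(first_channel "$condarc")
    if [ "$top" = "conda-forge" ]; then
        echo "OK: conda-forge has highest priority"
    else
        echo "Warning: top channel is '${top}', expected conda-forge" >&2
    fi
fi
```

If conda is already on your PATH, conda config --show channels reports the same list as conda itself sees it.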
3. Create an isolated course environment
We suggest you create and switch to an isolated environment, pgip, dedicated to the course. The command below will create an environment named pgip and install python version 3.10, an R base installation (r-base), the jupyter package that provides support for Jupyter Notebooks, an R kernel for Jupyter, and the mamba package manager.
conda create --name pgip python=3.10 r-base jupyter r-irkernel mamba
conda activate pgip
The activate command is required to access the isolated environment named pgip. Once you have activated the environment, you gain access to the programs installed in it. To deactivate an environment, issue the command conda deactivate.
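A common mistake is to run commands without the environment being active. The following is a minimal sketch of a guard, relying on the CONDA_DEFAULT_ENV variable that conda exports on activation; require_env is a hypothetical helper name used here for illustration.

```shell
# Succeed only if the named conda environment is the active one.
# conda exports CONDA_DEFAULT_ENV with the name of the active environment.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        echo "Environment '$1' is active"
    else
        echo "Environment '$1' is not active; run: conda activate $1" >&2
        return 1
    fi
}

# Report the status without aborting the shell session
require_env pgip || true
```

Placing such a check at the top of a script makes it fail early instead of picking up programs from the wrong environment.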
4. Install packages
Packages are installed in an environment with the install command, but we recommend you use the mamba package manager as it is faster (mamba is a rewrite of conda in C++). An example of how to install the packages bcftools, angsd, and mosdepth follows (remember to activate pgip!):
Some users have reported errors where bcftools and angsd cannot be found, despite the channels being set correctly. We are looking into the issue, but unless there are issues with UPPMAX, we will not need to install any packages beyond those that went into the creation of the pgip environment above. You can therefore treat the code below as examples only.
conda activate pgip
mamba install bcftools angsd mosdepth
or, if you have packages listed in an environment file:
mamba env update -f environment.yml
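After an installation it can be useful to confirm which programs are actually visible on your PATH, particularly given the reports of packages not being found. The following is a minimal sketch; check_tools is a hypothetical helper name, and the tool names are those from the example above.

```shell
# Report which of the given programs are on PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

check_tools bcftools angsd mosdepth || echo "Some tools are missing; is pgip active?"
```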
Tools
Computer exercise requirements are listed in Tools callout blocks in each exercise. The Tools callout block contains listings of programs, along with package dependencies and specifications for UPPMAX and conda, whenever relevant. An example block is shown below.
Example Tools block.
Provides commands and instructions to load relevant UPPMAX modules.
Example:
module load uppmax bioinfo-tools bwa/0.7.17 \
FastQC/0.11.9
Provides a conda environment file that lists dependencies and where to retrieve them.
To install, copy the contents of the code block to a file named environment.yml and install the packages with mamba env update -f environment.yml.
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bwa=0.7.17
- fastqc=0.12.1
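Instead of copying by hand, the file can be written from the shell with a here-document. The following is a minimal sketch, assuming you run it in your working directory with the pgip environment active; note that it overwrites any existing environment.yml in the current directory. The update step is skipped if mamba is not on your PATH.

```shell
# Write the example environment file (overwrites an existing environment.yml)
cat > environment.yml <<'EOF'
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - bwa=0.7.17
  - fastqc=0.12.1
EOF

# Install the listed packages only if mamba is available
if command -v mamba >/dev/null 2>&1; then
    mamba env update -f environment.yml
else
    echo "mamba not found; activate pgip first" >&2
fi
```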