Population Genomics in Practice 2025

Compute environment

Description of compute resources
Author

Per Unneberg

Published

18-Sep-2025

This page briefly describes the compute environments that may be used for the exercises. Icons indicate the type of environment (HPC resource; local compute environment; online browser-based resource). Make sure to read these instructions before proceeding with the exercises.

Dardel @ PDC

Prerequisite: SUPR account

If you want to run the exercises on the Dardel HPC system, you need an account. Follow the instructions on the precourse page.

We will primarily use Dardel, the high-performance computing (HPC) system at KTH's PDC center, to run exercises. Course material will be hosted in a dedicated course project directory, /cfs/klemming/projects/supr/pgip_2025.

Working directory setup

We recommend that you set up a working directory named after your username in /cfs/klemming/projects/supr/pgip_2025/users, in which to run your exercises:

mkdir -p /cfs/klemming/projects/supr/pgip_2025/users/YOURUSERNAME
cd /cfs/klemming/projects/supr/pgip_2025/users/YOURUSERNAME
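
The two commands above can be wrapped in a small helper so that the username is filled in automatically. This is a minimal sketch, not part of the course tooling: the function name setup_workdir is hypothetical, and it assumes that your Dardel username is the value of $USER (or, failing that, what `id -un` reports).

```shell
# Hypothetical helper (not part of the course tooling): create a personal
# working directory below a base directory and print its path.
# The username defaults to $USER, falling back to `id -un`; pass it as a
# second argument to override.
setup_workdir() {
    base=$1
    user=${2:-${USER:-$(id -un)}}
    workdir="$base/$user"
    mkdir -p "$workdir" || return 1
    printf '%s\n' "$workdir"
}
```

On Dardel you could then run `cd "$(setup_workdir /cfs/klemming/projects/supr/pgip_2025/users)"` to create the directory and change into it in one step.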

pixi environments and pgip CLI

EXPERIMENTAL

This feature is experimental and may not work as intended.

To improve reproducibility and simplify package setup, we have grouped exercise tools in pixi virtual environments.¹ Exercise environments are named e-EXERCISE-NAME and can be activated with a custom command-line interface (CLI) tool called pgip. To activate the CLI, run the following commands:

source /cfs/klemming/projects/supr/pgip_2025/init.sh
pgip_activate

Now you should have access to pgip, which among other things lets you set up exercise data and launch notebooks. In addition, there are two commands, pgip_elist and pgip_shell: pgip_elist lists the available environments, and pgip_shell ENVIRONMENT_NAME starts a shell with that environment activated.

Run pgip_activate before pgip_shell

It is important that you run pgip_activate first; otherwise, exiting the shell started by pgip_shell would terminate your session. With pgip_activate run first, exiting that shell returns you to the activated default environment.

ThinLinc

KTH provides a remote desktop program called ThinLinc, which lets you connect to a remote server and access programs via a virtual desktop. To use it, you first need to download the ThinLinc client (tlclient). To connect, launch tlclient and authenticate with either Kerberos or SSH. See the PDC documentation for more information.

Launching graphical applications

There is a launcher located in the menu bar that should let you start graphical applications. If that doesn’t work, you can always

Interactive jobs

Please do not book more than 10 cores

We have privileged access to a limited number of nodes. Please do not book more than 10 cores, or your fellow students will experience long waiting times.

Make sure to log in to a compute node before running any compute-intensive commands

All computations should be run on a compute node. You can request an interactive session with the salloc command. For example, to request an eight-hour job on 10 cores, run

salloc -A naiss2025-22-825 -n 10 \
   --time 08:00:00 \
   --reservation=<name of reservation> \
   --no-shell

where <name of reservation> must be replaced by the name of the node reservation, which will typically be different each day.

With the --no-shell option, salloc exits immediately after the resources have been allocated, without starting a shell, but the Slurm job stays active. Make a note of the allocated node ID and use ssh to access the node.

ssh node_id
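
If you did not note the node ID when salloc returned, Slurm's squeue command can report it. The following is a minimal sketch: the function name my_nodes is hypothetical, while -u, -h, -t and -o (with the %N node-list field) are standard squeue options. It assumes that $USER holds your username on the cluster.

```shell
# Hypothetical helper: print the node list of each of your running Slurm jobs.
#   -u  restrict to one user       -h  suppress the header line
#   -t  restrict to a job state    -o  output format (%N = allocated nodes)
my_nodes() {
    squeue -u "${USER:-$(id -un)}" -h -t RUNNING -o '%N'
}
```

With a single running one-node job, `ssh "$(my_nodes)"` logs you in to the allocated node. If you have several jobs, my_nodes prints one line per job and you need to pick the relevant one.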

Accessing notebooks on compute nodes

See ThinLinc for a potentially easier way to access compute nodes and resources.

Notebooks are run in browser sessions, but if run on compute nodes they are not directly accessible from the client. The trick is to set up double port forwarding, in which a port is forwarded from the compute node to the login node and on to your client. If you use the pgip CLI to launch a notebook, it will look as follows:

pgip notebook jupyter --port 9999
INFO:pgip_cli.commands.notebook:notebook
INFO:pgip_cli.commands.notebook:Running jupyter lab  --no-browser --port=9999
INFO:pgip_cli.commands.notebook:Jupyter lab running at http://localhost:9999
INFO:pgip_cli.commands.notebook:To stop Jupyter Notebook, press Ctrl+C
INFO:pgip_cli.commands.notebook:
INFO:pgip_cli.commands.notebook:For port forwarding to login node, on login node run:
INFO:pgip_cli.commands.notebook:ssh -L 9999:localhost:9999 nid002581 -N -f
INFO:pgip_cli.commands.notebook:
INFO:pgip_cli.commands.notebook:For port forwarding to client localhost, on client run:
INFO:pgip_cli.commands.notebook:ssh -L 9999:localhost:9999 dardel.pdc.kth.se -N -f
INFO:pgip_cli.commands.notebook:
Do you want to continue? [Y/n]:

Here, nid002581 is the compute node ID. To enable port forwarding, you need to run the two ssh -L ... commands, and voilà, you should be able to access the notebook at localhost:9999.

The port number must be unique and in the recommended range 1024-49151.
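
The double port forwarding described above can be sketched as a pair of helpers that check the port range and print the two ssh commands. This is an illustration only: the function names valid_port and forward_cmds are hypothetical and not part of the pgip CLI, and the node name and login host are taken from the example output. The sketch does not check whether the port is already in use by someone else.

```shell
# Hypothetical helpers, not part of the pgip CLI.

# Succeed only if the argument is an integer in the range 1024-49151.
valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 49151 ]
}

# Print the two forwarding commands for a port, compute node, and login host.
forward_cmds() {
    port=$1 node=$2 login=$3
    valid_port "$port" || { echo "port must be in 1024-49151" >&2; return 1; }
    # Hop 1, to be run on the login node: login node -> compute node.
    echo "ssh -L $port:localhost:$port $node -N -f"
    # Hop 2, to be run on your own computer: client -> login node.
    echo "ssh -L $port:localhost:$port $login -N -f"
}

forward_cmds 9999 nid002581 dardel.pdc.kth.se
```

In both commands, -L forwards a local port to localhost on the remote side, -N tells ssh not to run a remote command, and -f sends ssh to the background once the connection is established.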

Tutorials

PDC hosts tutorials and user guides at https://support.pdc.kth.se/doc. In particular, https://support.pdc.kth.se/doc/basics/quickstart has information on how to connect to and work on Dardel.

Jupyter Notebooks

Jupyter Notebook exercises will be run in local compute environments on your laptop. See the section below on setting up a pgip environment with pixi, which by default installs jupyter and its dependencies.

JupyterLite

Some Jupyter Notebook exercises are hosted online and run using JupyterLite, a JupyterLab distribution that runs entirely in the browser. Apart from having a browser, no preparations are necessary. Note that some users have reported issues with Firefox; Google Chrome may work better.

Pixi

Exercises that require local software installation will use the pixi package manager to install the necessary conda packages from the package repositories bioconda and conda-forge. This is also the fallback solution in case there are issues with the HPC.

To start using pixi, follow the installation instructions. You can choose to run all exercises on a local computer if you have pixi set up. You would then need to download the relevant data, as detailed for each exercise. We have tried to keep the exercise data sets small enough that you can run the exercises locally.

Tools

Computer exercise requirements are listed in Tools callout blocks in each exercise. The Tools callout block contains listings of programs, along with package dependencies and specifications for Dardel and pixi, whenever relevant. An example block is shown below.

Tools - example

Example Tools block.

  • Listing
  • PDC
  • pixi

Provides a list of packages, each linked to its repository, with a citation when available.

  • fastqc
  • bwa (Li, 2013)

Choose either Modules or Virtual environment to access the relevant tools.

Modules

Execute the following command to load modules:

module load bwa/0.7.18 fastqc/0.12.1

Virtual environment

Run the pgip initialization script and activate the pgip default environment:

source /cfs/klemming/projects/supr/pgip_2025/init.sh
pgip_activate

Then activate the <exercise environment> environment:

# pgip_shell calls pixi shell -e <exercise environment> --as-is
pgip_shell <exercise environment>

Provides a pixi manifest file that lists dependencies and where to retrieve them.

Copy the contents to a file named pixi.toml in a directory exercise-name, cd into that directory, and activate the environment with pixi shell:

[workspace]
channels = ["conda-forge", "bioconda"]
name = "exercise-name"
platforms = ["linux-64"]

[dependencies]
bwa = ">=0.7.19,<0.8"
fastqc = ">=0.12.1,<0.13"
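
The copy-and-activate steps can be sketched in the shell as follows. The helper name write_manifest is hypothetical; the manifest contents are those of the example above, and the final activation step assumes that pixi is installed and on your PATH.

```shell
# Hypothetical helper: create a directory and write the example manifest
# shown above to <dir>/pixi.toml.
write_manifest() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/pixi.toml" <<'EOF'
[workspace]
channels = ["conda-forge", "bioconda"]
name = "exercise-name"
platforms = ["linux-64"]

[dependencies]
bwa = ">=0.7.19,<0.8"
fastqc = ">=0.12.1,<0.13"
EOF
}
```

After `write_manifest exercise-name`, run `cd exercise-name` followed by `pixi shell` to enter the environment.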

References

Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-Bio]. https://arxiv.org/abs/1303.3997

Footnotes

  1. pixi is a fast package management tool, similar to conda.

2025 NBIS | GPL-3 License

 
