---
config:
look: handDrawn
theme: neutral
---
flowchart LR
src("📄 hello.cpp<br/>(your code)")
cmd{{"g++ hello.cpp -o hello"}}
bin("⚙️ hello<br/>(runnable!)")
src --> cmd --> bin
Why compile your own software?
On a shared HPC system you typically don’t have root access, which means you can’t install software the usual way (e.g. apt install or yum install). Software that is needed for your research is usually installed centrally by support staff and made available via the module system (module load). This works well for widely used tools, but it can take time, and for more niche software it might never happen.
The good news is that you don’t need root access to compile and install software! You just need to tell the installer to put the files somewhere you own, like your home directory or a project directory.
Before compiling from source, check if the software is available via Docker, Conda/Pixi, or Apptainer. These containerized/prepackaged approaches are more portable and easier to deploy. This tutorial focuses on the manual compilation process, which is often more complex but sometimes necessary.
Background
To follow along in these steps, clone the Training-Tech-shorts repo to get the example files:
git clone https://github.com/NBISweden/Training-Tech-shorts
cd Training-Tech-shorts/posts/2026-04-23-configure-and-make/
You might not have programmed in languages that need compiling before, and instead used scripting languages like R or Python. Scripting languages require that whoever runs the program has the language interpreter installed, i.e. to run a Python script, you must have Python installed.
manual-compile/hello.py
#!/usr/bin/env python3
print("Hello, World!")
and to run it:
python hello.py
First we will cover a couple of basic steps for programming with compiled languages like C/C++. You are unlikely to do these steps yourself when installing software created by someone else, but they show why the later steps are needed.
How to compile from source code
We’ll start from scratch with a simple C++ hello world program:
manual-compile/hello.cpp
#include <iostream>
int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}
To be able to run this program we first have to compile it. Compiling means translating the human-readable C++ code into binary machine code that the computer can run.
# compile the code using the GNU C++ compiler and create the
# executable binary named `hello`
g++ hello.cpp -o hello
# run it from the terminal
./hello
Hello, World!
If only a single .cpp file needs compiling, this manual approach is fine. However, more complex programs have multiple files that need compiling, and some have to be compiled in the right order.
Compiling using a Makefile
When a project grows to many files, typing every command by hand becomes tedious and error-prone. With a Makefile containing all the commands, just type make and it handles the rest.
flowchart LR
cpps@{ shape: procs, label: "📄 .cpp files"}
makefile("📒 Makefile")
cmd{{"make"}}
bin("⚙️ hello<br/>(runnable!)")
cpps --> makefile --> cmd --> bin
Here is an example of how a simple Makefile can look:
makefile-compile/Makefile
# target_file : source_files
# <tab>commands

# Rule to build the final executable "my_program"
my_program: main.cpp engine.cpp utils.cpp   # (1)
# Compile all source files and output the executable with -o flag
	g++ main.cpp engine.cpp utils.cpp -o my_program

# Rule to remove the compiled executable, helping to force a fresh build
clean:   # (2)
# Remove "my_program" if it exists, the -f flag suppresses errors if it doesn't
	rm -f my_program

(1) The executable my_program (target) depends on these three source files.
  - make my_program (or make, since it’s the first rule) only runs the recipe if my_program doesn’t exist or the source files change.
  - The recipe is indented using a Tab character (leading spaces cause an error).
  - Each line runs in its own subshell (it doesn’t know what was run before, like a directory change).
(2) The file clean doesn’t exist, so the command make clean always runs the recipe.
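The rule that make only rebuilds a target when it is missing or older than its sources can be approximated in plain shell. This is a rough sketch of the idea, not how make is implemented; `needs_rebuild` is a hypothetical helper name, and the file names are taken from the Makefile above.

```shell
#!/usr/bin/env bash
# Sketch of the "is the target out of date?" check that make performs.
# needs_rebuild is a hypothetical helper, not part of make itself.
needs_rebuild() {
    local target=$1
    shift
    # rebuild if the target does not exist yet
    [ -e "$target" ] || return 0
    # rebuild if any source file is newer than the target
    local src
    for src in "$@"; do
        [ "$src" -nt "$target" ] && return 0
    done
    # target exists and no source is newer: nothing to do
    return 1
}

if needs_rebuild my_program main.cpp engine.cpp utils.cpp; then
    echo "my_program is out of date, make would run the recipe"
else
    echo "my_program is up to date, make would do nothing"
fi
```

The comparison is based on file modification times, which is why touching a source file is enough to trigger a rebuild.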
To run it, just type:
make
Compile using build system generators
Alright, now we know the background, that we need to compile .cpp files and we should use a Makefile to simplify the process.
Writing a Makefile by hand is itself complicated, especially if your project needs to work on different operating systems. Different operating systems mean different options to the compiler, and writing a Makefile to suit every combination of options is not feasible. Build system generators like Autotools and CMake let you write a simpler config file, and they generate the Makefile for you automatically, adjusting it to the computer you are compiling on. Once that Makefile has been generated you proceed to running it.
Which one you will use is decided by the creator of the software. A good start is to always read the README or INSTALL file in the git repo of the software you want to compile. These files contain instructions and suggestions for compiling that specific software.
---
config:
look: handDrawn
theme: neutral
---
flowchart LR
bsg("👷 Build system generator")
makefile("📒 Makefile")
cmd{{"make"}}
bin("⚙️ hello<br/>(runnable!)")
bsg --> makefile
makefile --> cmd --> bin
Prefix
Both Autotools and CMake can be tweaked by giving them options, and the option that controls where the software ends up when it is installed is called the prefix. It is the path to the directory that all the compiled files are copied to once the compilation is complete. This can be any folder you have write access to, such as your home directory or a project directory.
For the remainder of the tutorial $PREFIX will point to ~/sw to keep it short. Adjust it to suit your needs. If you install large programs you might fill up the quota in your home directory, so prefer a project directory. On NAISS systems in particular, make sure that continuations of your project use the same directory name as the previous project, so that all paths remain the same.
# more realistic example
# export PREFIX=/proj/naiss2026-99-999/software
# used in this tutorial
export PREFIX=~/sw
mkdir -p $PREFIX
Compiling using autotools
Autotools is an older but still widely used compilation tool collection. If you have a file named either configure or configure.ac in the root of your git repo (or downloaded release of source code) it is a sign that Autotools is used.
If you only have configure.ac and configure is missing, you have to run an additional command to generate configure from configure.ac.
Doing the compilation usually involves these steps:
# if you only have configure.ac
autoreconf -fi # 0. generates configure, only run if configure is missing
# once you have configure
./configure --prefix=$PREFIX # 1. detect your system, check dependencies, generate a Makefile
make # 2. compile the source code into binaries
make install # 3. copy the binaries to the target location
./configure is a shell script that probes your system: is gcc available? Where is zlib? Does your system support certain features? At the end it writes a Makefile tailored to your system. This is also where you can specify lots of options, like where it should install the software once compilation is done (--prefix). Run ./configure --help to see the list of options for the software you are installing.
make reads the Makefile and runs the compiler. This is where the actual work happens. It can take anywhere from a few seconds to a very long time depending on the software and speed of the computer.
make install copies the compiled binaries, libraries, and documentation to the location specified by --prefix. Without --prefix, this defaults to a system-wide location (typically /usr/local), which requires root. With it, you point the install at a directory you own.
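Since make install only succeeds if the prefix is writable, it can be worth checking the directory before starting a long build. A minimal sketch, assuming a hypothetical helper named `check_prefix` and the ~/sw default used in this tutorial:

```shell
#!/usr/bin/env bash
# Sketch: confirm that an install prefix is usable before building.
# check_prefix is a hypothetical helper name used for illustration.
check_prefix() {
    local prefix=$1
    # create the directory if it is missing; this fails without write access
    mkdir -p "$prefix" 2>/dev/null || { echo "cannot create $prefix"; return 1; }
    # make install needs to write into the prefix
    [ -w "$prefix" ] || { echo "no write access to $prefix"; return 1; }
    echo "ok: $prefix is writable"
}

# fall back to ~/sw when PREFIX is not set
check_prefix "${PREFIX:-$HOME/sw}" || echo "pick another prefix"
```

If the check fails, choose a different directory and pass that to --prefix instead.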
Example: Installing Samtools
samtools is one of the most widely used tools in genomics as it lets you manipulate SAM/BAM alignment files. It is a good example of the full configure + make workflow.
Download the source
# go somewhere sensible to do our building
# (we don't want to litter our home directory)
mkdir -p ~/build
cd ~/build
# download the samtools release tarball from github
wget https://github.com/samtools/samtools/releases/download/1.23.1/samtools-1.23.1.tar.bz2
# unpack it
tar -xjf samtools-1.23.1.tar.bz2
# go into the source directory
cd samtools-1.23.1
# have a look at what's in here
ls -l
You’ll see the configure script and the source files (.c, .h).
# it's always worth reading the README and INSTALL files before you start
# they often tell you about dependencies you need to have loaded
cat INSTALL
Configure
# run configure, pointing it at our prefix directory
./configure --prefix=$PREFIX
# you'll see a lot of output as it checks for things.
# at the end it prints a summary, look for any warnings.
If configure finishes without errors, a Makefile has been written for your system. You can have a look at it, but it’s usually thousands of lines of generated code, not meant to be read by humans.
Compile
# compile samtools
# the -j flag tells make to use multiple cores in parallel,
# which makes this significantly faster on multi-core nodes.
# 4 is a safe choice on the login node. on a compute node you can go higher.
make -j4
This is where the actual compilation happens. You’ll see lots of lines like gcc -O2 -Wall .... When it finishes without errors you have a samtools binary sitting in the current directory:
# try running it before installing
./samtools --version
Install
# copy samtools to the prefix directory
make install
# check what was installed
ls -l $PREFIX/bin/
The bin/ directory now contains the samtools executable, along with a number of helper programs and scripts that ship with samtools.
Compiling using CMake
Not all software uses autotools (configure + make). cmake is another very common build system, especially for C++ projects. If you see a CMakeLists.txt file in the source directory instead of a configure script, you’re looking at a cmake project.
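The file checks described so far (configure, configure.ac, CMakeLists.txt) can be collected into a small shell helper. This is only a sketch: `detect_build_system` is a hypothetical name, the "plain Makefile" case is an added assumption, and the README or INSTALL file remains the authoritative source, since a configure script is a sign of Autotools rather than proof.

```shell
#!/usr/bin/env bash
# Sketch: guess the build system from the files in a source directory.
# detect_build_system is a hypothetical helper name used for illustration.
detect_build_system() {
    local dir=${1:-.}
    if [ -f "$dir/configure" ]; then
        echo "autotools"
    elif [ -f "$dir/configure.ac" ]; then
        # configure still has to be generated from configure.ac
        echo "autotools (run autoreconf -fi first)"
    elif [ -f "$dir/CMakeLists.txt" ]; then
        echo "cmake"
    elif [ -f "$dir/Makefile" ]; then
        echo "plain Makefile"
    else
        echo "unknown"
    fi
}

# check the current directory
detect_build_system .
```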
The CMake build pattern is similar to the Autotools one, but slightly different:
# typical cmake build
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release
make -j4
make install
Example: Installing DIAMOND
DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data, and it is built using CMake.
cd ~/build
# download the source
wget https://github.com/bbuchfink/diamond/archive/refs/tags/v2.1.9.tar.gz
tar -xzf v2.1.9.tar.gz
cd diamond-2.1.9
# confirm it uses cmake
ls -l
# CMakeLists.txt <-- yep
# create build dir
mkdir build && cd build
# create Makefile
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release
# run Makefile
make -j4
# test that it works
./diamond --version
# install it
make install
Making the tools available in your shell
Right now the binaries are in $PREFIX/bin/, but your shell doesn’t know to look there. You need to add it to your PATH:
# add the bin directory to PATH for this session
export PREFIX=~/sw
export PATH=$PREFIX/bin:$PATH
# test that it works
which samtools
samtools --version
which diamond
diamond --version
To make this permanent, add it to your ~/.bashrc:
# Run these commands in your terminal to update your `~/.bashrc` file and make sure
# to update the PREFIX to your actual prefix path
echo 'export PREFIX=~/sw' >> ~/.bashrc
echo 'export PATH=$PREFIX/bin:$PATH' >> ~/.bashrc
# reload the file to apply the changes in the current session
source ~/.bashrc
If you also installed shared libraries, you may need to tell the dynamic linker where to find them:
# add to ~/.bashrc alongside the PATH line
echo 'export LD_LIBRARY_PATH=$PREFIX/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
Summary
Build system generators are used to create Makefiles that will handle compilation of many .cpp files.
Autotools: the source directory will contain a configure or a configure.ac file.
# typical autotools workflow
autoreconf -fi # only if configure is missing
./configure --prefix=$PREFIX
make -j4
make install
CMake: the source directory will contain a CMakeLists.txt file.
# typical cmake workflow
mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release
make -j4
make install
Extras
Here are a few things that are useful to know once you start doing this more regularly.
When libraries can’t be found
Sometimes ./configure or cmake will fail with something like:
configure: error: zlib development files not found
This means the development headers for the zlib library are not in the default search paths. On a cluster, you would look for and load a module:
module spider zlib
module load zlib/1.2.13
# then try configure again
./configure --prefix=$PREFIX
If the library is installed but in an unusual location, you can point configure to it using CPPFLAGS and LDFLAGS:
ZLIB_DIR=/some/non-standard/path
./configure --prefix=$PREFIX \
    CPPFLAGS="-I${ZLIB_DIR}/include" \
    LDFLAGS="-L${ZLIB_DIR}/lib"
Keeping things organized
When you start installing many tools this way, it is easy to lose track of what is installed and where. A few conventions that help:
# one option: a shared prefix directory for all tools
/proj/your-project-id/software/
# another option: one directory per tool and version
/proj/your-project-id/software/bwa/0.7.18/
/proj/your-project-id/software/samtools/1.21/
The second approach makes it easy to have multiple versions available and switch between them, but requires updating PATH for each tool.
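With one directory per tool and version, the PATH update can be wrapped in a small shell function. The sketch below assumes each tool was installed with its own prefix (for example --prefix=/proj/your-project-id/software/samtools/1.21), so that its executables sit in a bin/ directory below it. `SOFTWARE_ROOT` and `use_tool` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: put one tool/version directory first on PATH.
# SOFTWARE_ROOT and use_tool are hypothetical names used for illustration.
SOFTWARE_ROOT=${SOFTWARE_ROOT:-/proj/your-project-id/software}

use_tool() {
    local name=$1 version=$2
    local bindir="$SOFTWARE_ROOT/$name/$version/bin"
    # refuse to touch PATH if that version is not installed
    if [ ! -d "$bindir" ]; then
        echo "not installed: $bindir" >&2
        return 1
    fi
    # prepend so this version wins over any other copy on PATH
    PATH="$bindir:$PATH"
    export PATH
}

# example: use_tool samtools 1.21
```

Because the directory is prepended, calling the function again with another version switches to that version for the rest of the session.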