Introduction to Configure and Make

Why compile your own software?

On a shared HPC system you typically don’t have root access, which means you can’t install software the usual way (e.g. apt install or yum install). Software that is needed for your research is usually installed centrally by support staff and made available via the module system (module load). This works well for widely used tools, but it can take time, and for more niche software it might never happen.

The good news is that you don’t need root access to compile and install software! You just need to tell the installer to put the files somewhere you own, like your home directory or a project directory.

Tip

Before compiling from source, check if the software is available via Docker, Conda/Pixi, or Apptainer. These containerized/prepackaged approaches are more portable and easier to deploy. This tutorial focuses on the manual compilation process, which is often more complex but sometimes necessary.

Background

To follow along with these steps, clone the Training-Tech-shorts repo to get the example files:

git clone https://github.com/NBISweden/Training-Tech-shorts
cd Training-Tech-shorts/posts/2026-04-23-configure-and-make/

You might not have programmed in languages that need compiling before, and instead used scripting languages like R or Python. Scripting languages require that whoever runs the program has the language interpreter installed, i.e. to run a Python script, you must have Python installed.

manual-compile/hello.py

#!/usr/bin/env python3

print("Hello, World!")

and to run it:

python hello.py

First we will cover a couple of basic steps when it comes to programming with compiled languages like C/C++. These basic steps are likely not something you will do when installing software that is created by someone else, but they will show why we will need to do the later steps.

How to compile from source code

We’ll start from scratch with a simple C++ hello world program:

manual-compile/hello.cpp

#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

To be able to run this program we first have to compile it. Compiling means translating the human-readable C++ code into binary machine code that the computer can run.

---
config:
  look: handDrawn
  theme: neutral
---
flowchart LR
    src("📄 hello.cpp<br/>(your code)")
    cmd{{"g++ hello.cpp -o hello"}}
    bin("⚙️  hello<br/>(runnable!)")

    src --> cmd --> bin

# compile the code using the GNU C++ compiler and create the
# executable binary named `hello`
g++ hello.cpp -o hello

# run it from the terminal
./hello
Hello, World!

If it is only a single cpp file that needs compiling, this manual approach is fine. However, more complex programs have multiple files that need compiling and some have to be compiled in the right order.
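To make the multi-file case concrete, here is a minimal sketch of compiling two source files separately and then linking them. The file names greet.cpp, main.cpp and hello2 are made up for this illustration and are not part of the example repo. The block skips the build if g++ is not installed.

```shell
# Work in a throwaway directory so nothing is left behind in your home.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# greet.cpp: defines a function (hypothetical helper file)
cat > greet.cpp <<'EOF'
#include <iostream>
void greet() { std::cout << "Hello from greet()" << std::endl; }
EOF

# main.cpp: declares greet() and calls it
cat > main.cpp <<'EOF'
void greet();
int main() { greet(); return 0; }
EOF

if command -v g++ >/dev/null 2>&1; then
    # step 1: compile each source file into an object file (-c = compile only)
    # step 2: link the object files into one executable and run it
    g++ -c greet.cpp -o greet.o &&
        g++ -c main.cpp -o main.o &&
        g++ greet.o main.o -o hello2 &&
        ./hello2 ||
        echo "build failed, check the compiler output above"
else
    echo "g++ is not installed here, skipping"
fi
```

The object files can be compiled in any order, but the link step has to come last. With dozens of files, keeping track of this by hand is what gets tedious.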

Compiling using a Makefile

When a project grows to many files, typing every command by hand becomes tedious and error-prone. With a Makefile containing all the commands, just type make and it handles the rest.

flowchart LR
    cpps@{ shape: procs, label: "📄 .cpp files"}
    makefile("📒 Makefile")
    cmd{{"make"}}
    bin("⚙️  hello<br/>(runnable!)")

    cpps --> makefile --> cmd --> bin

Here is an example of how a simple Makefile can look:

makefile-compile/Makefile

# target_file : source_files 
# <tab>commands

# Rule to build the final executable "my_program"
my_program: main.cpp engine.cpp utils.cpp
	# Compile all source files and output the executable with -o flag
	g++ main.cpp engine.cpp utils.cpp -o my_program

# Rule to remove the compiled executable, helping to force a fresh build
clean:
	# Remove "my_program" if it exists, the -f flag suppresses errors if it doesn't
	rm -f my_program

The my_program rule: the executable my_program (target) depends on these three source files.
- make my_program (or just make, since it’s the first rule) only runs the recipe if my_program doesn’t exist or a source file has changed.
- The recipe is indented using a Tab character (leading spaces cause an error).
- Each recipe line runs in its own shell, so it doesn’t remember what ran on the line before (such as a directory change).
The clean rule: no file named clean exists, so the command make clean always runs the recipe.

and to run it you just type

make
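The "only runs if something changed" behaviour is easiest to see by trying it. The sketch below uses a toy Makefile where "compiling" is just concatenating two text files with cat, so only make is needed and no compiler. The file names a.txt, b.txt and out.txt are invented for the demo.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# two stand-in "source files"
echo "part one" > a.txt
echo "part two" > b.txt

# Write the Makefile with printf so that \t becomes the Tab a recipe requires.
printf 'out.txt: a.txt b.txt\n\tcat a.txt b.txt > out.txt\n' > Makefile

if command -v make >/dev/null 2>&1; then
    make            # 1st run: out.txt is missing, so the recipe runs
    make            # 2nd run: nothing changed, make reports the target is up to date
    sleep 1         # make sure the next timestamp is clearly newer
    touch a.txt     # pretend we edited a source file
    make            # 3rd run: a.txt is newer than out.txt, so the recipe runs again
else
    echo "make is not installed here, skipping"
fi
```

make decides what to rebuild by comparing file modification times, which is why touch alone is enough to trigger the third build.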

Compile using build system generators

Alright, now we know the background: source files need to be compiled, and a Makefile simplifies that process.

Writing a Makefile by hand is itself complicated, especially if your project needs to work on different operating systems. Different operating systems mean different options to the compiler, and writing a Makefile to suit every combination of options is not feasible. Build system generators like Autotools and CMake let you write a simpler config file, and they generate the Makefile for you automatically, adjusting it to the computer you are compiling on. Once that Makefile has been generated you proceed to running it.

Which one you will use is decided by the creator of the software. A good start is always to read the README or INSTALL file in the git repo of the software you want to compile. These files will contain instructions and suggestions for compiling that specific software.

---
config:
  look: handDrawn
  theme: neutral
---
flowchart LR
    bsg("👷 Build system generator")
    makefile("📒 Makefile")
    cmd{{"make"}}
    bin("⚙️  hello<br/>(runnable!)")

    bsg --> makefile
    makefile --> cmd --> bin

Prefix

Both Autotools and CMake can be tweaked by giving them options, and the option that controls where the software ends up when it is installed is called prefix. It is the path to the directory that all the compiled files will be copied to once the compilation is complete. This could be any folder you have write access to, such as your home directory or project directory.

For the remainder of the tutorial $PREFIX will point to ~/sw to keep it short. Adjust it to suit your needs. Large programs can fill up the quota in your home directory, so prefer installing them in a project directory. On NAISS systems, make sure that a continuation of your project uses the same directory name as the previous project, so that all paths remain the same.

# more realistic example
# export PREFIX=/proj/naiss2026-99-999/software

# used in this tutorial
export PREFIX=~/sw
mkdir -p $PREFIX
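Since the prefix has to be a directory you can write to, it can save time to check that before starting a long build. This is a small sketch; check_prefix and target are hypothetical names used only here, and the fallback to ~/sw assumes the tutorial's default.

```shell
# check_prefix DIR: succeed only if DIR exists and is writable by you
check_prefix() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "OK: $1 exists and is writable"
    else
        echo "Problem: $1 is missing or not writable" >&2
        return 1
    fi
}

# use $PREFIX if it is set, otherwise fall back to the tutorial default
target="${PREFIX:-$HOME/sw}"
if check_prefix "$target"; then
    df -h "$target"     # free space on the filesystem that holds the prefix
fi
```

Note that df reports free space on the filesystem, which is not necessarily the same as your personal or project quota. Clusters usually have their own command for checking quota.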

Compiling using autotools

Autotools is an older but still widely used compilation tool collection. If you have a file named either configure or configure.ac in the root of your git repo (or downloaded release of source code) it is a sign that Autotools is used.

If you only have configure.ac and configure is missing, you have to run an additional command to generate configure from configure.ac.

Doing the compilation usually involves these steps:

# if you only have configure.ac
autoreconf -fi                  # 0. generates configure, only run if configure is missing

# once you have configure
./configure --prefix=$PREFIX    # 1. detect your system, check dependencies, generate a Makefile
make                            # 2. compile the source code into binaries
make install                    # 3. copy the binaries to the target location

./configure is a shell script that probes your system: is gcc available? where is zlib? does your system support certain features? At the end it writes a Makefile tailored to your system. This is also where you can specify lots of options, like where it should install the software once compilation is done (--prefix). Run ./configure --help to see the list of options for the software you are installing.

make reads the Makefile and runs the compiler. This is where the actual work happens. It can take anywhere from a few seconds to a very long time depending on the software and speed of the computer.

make install copies the compiled binaries, libraries, and documentation to the location specified by --prefix. Without --prefix, Autotools projects default to a system-wide location (usually /usr/local), which requires root to write to. With it, you point the install at a directory you own.
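To demystify what a configure script does, here is a hand-written toy imitation. It is not what Autotools generates (real configure scripts run far more checks); toy_configure and the Makefile it writes are invented for this sketch. It reads --prefix, probes for a C++ compiler, and writes a small Makefile containing what it found.

```shell
toy_configure() {
    prefix=/usr/local                       # default when --prefix is not given
    for arg in "$@"; do
        case "$arg" in
            --prefix=*) prefix=${arg#--prefix=} ;;   # strip the "--prefix=" part
        esac
    done

    # Probe the system: honour $CXX if set, otherwise look for a compiler on PATH.
    cxx=${CXX:-}
    if [ -z "$cxx" ]; then
        for candidate in g++ clang++ c++; do
            if command -v "$candidate" >/dev/null 2>&1; then
                cxx=$candidate
                break
            fi
        done
    fi
    if [ -z "$cxx" ]; then
        echo "toy configure: error: no C++ compiler found" >&2
        return 1
    fi
    echo "checking for C++ compiler... $cxx"

    # Write a Makefile tailored to what was detected (\t is the required Tab).
    {
        printf 'CXX = %s\n' "$cxx"
        printf 'PREFIX = %s\n\n' "$prefix"
        printf 'hello: hello.cpp\n\t$(CXX) hello.cpp -o hello\n\n'
        printf 'install: hello\n\tmkdir -p $(PREFIX)/bin\n\tcp hello $(PREFIX)/bin/\n'
    } > Makefile
    echo "toy configure: wrote Makefile with PREFIX=$prefix"
}

# try it in a scratch directory
cd "$(mktemp -d)" && toy_configure --prefix="$HOME/sw" || echo "toy configure failed"
```

The generated Makefile has the prefix baked in, which is why --prefix is given to configure and not to make install.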

Example: Installing Samtools

samtools is one of the most widely used tools in genomics as it lets you manipulate SAM/BAM alignment files. It is a good example of the full configure + make workflow.

Download the source

# go somewhere sensible to do our building
# (we don't want to litter our home directory)
mkdir -p ~/build
cd ~/build

# download the samtools release tarball from github
wget https://github.com/samtools/samtools/releases/download/1.23.1/samtools-1.23.1.tar.bz2

# unpack it
tar -xjf samtools-1.23.1.tar.bz2

# go into the source directory
cd samtools-1.23.1

# have a look at what's in here
ls -l

You’ll see the configure script and the source files (.c, .h).

# it's always worth reading the README and INSTALL files before you start
# they often tell you about dependencies you need to have loaded
cat INSTALL

Configure

# run configure, pointing it at our prefix directory
./configure --prefix=$PREFIX

# you'll see a lot of output as it checks for things.
# at the end it prints a summary, look for any warnings.

If configure finishes without errors, a Makefile has been written for your system. You can have a look at it, but it’s usually thousands of lines of generated code, not meant to be read by humans.

Compile

# compile samtools
# the -j flag tells make to use multiple cores in parallel,
# which makes this significantly faster on multi-core nodes.
# 4 is a safe choice on the login node. on a compute node you can go higher.
make -j4

This is where the actual compilation happens. You’ll see lots of lines like gcc -O2 -Wall .... When it finishes without errors you have a samtools binary sitting in the current directory:

# try running it before installing
./samtools --version
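If you would rather not hard-code the number of parallel jobs, something like the following could pick it for you. This is only a sketch: the cap of 4 follows the login-node advice above, and cores, limit and njobs are names made up for the example.

```shell
# number of cores visible to this shell (falls back to 1 if nproc is unavailable)
cores=$(nproc 2>/dev/null || echo 1)

# never use more than this many jobs, to stay polite on a shared login node
limit=4

if [ "$cores" -lt "$limit" ]; then
    njobs=$cores
else
    njobs=$limit
fi

echo "would run: make -j$njobs"
```

On a compute node where you have been allocated more cores, raise limit to match your allocation.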

Install

# copy samtools to the prefix directory
make install

# check what was installed
ls -l $PREFIX/bin/

The bin/ directory now contains the samtools executable, and a bunch of other tools that samtools uses.

Compiling using CMake

Not all software uses autotools (configure + make). cmake is another very common build system, especially for C++ projects. If you see a CMakeLists.txt file in the source directory instead of a configure script, you’re looking at a cmake project.

The pattern is similar but slightly different:

# typical cmake build
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release
make -j4
make install
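The same three steps can also be driven entirely through the cmake command, which works no matter which build tool the project was generated for. The sketch below sets up a throwaway hello-world project (all file names are made up for the demo) and assumes CMake 3.15 or newer, since cmake --install does not exist in older versions. It skips the build if cmake or a C++ compiler is missing.

```shell
src=$(mktemp -d)            # throwaway source directory
demo_prefix=$(mktemp -d)    # throwaway install prefix

cat > "$src/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.15)
project(hello_demo LANGUAGES CXX)
add_executable(hello hello.cpp)
install(TARGETS hello DESTINATION bin)
EOF

cat > "$src/hello.cpp" <<'EOF'
#include <iostream>
int main() { std::cout << "Hello, World!" << std::endl; return 0; }
EOF

if command -v cmake >/dev/null 2>&1 && command -v c++ >/dev/null 2>&1; then
    # 1. configure: -S is the source dir, -B the build dir (created if missing)
    # 2. build with 4 parallel jobs
    # 3. install into $demo_prefix, then run the installed binary
    cmake -S "$src" -B "$src/build" \
          -DCMAKE_INSTALL_PREFIX="$demo_prefix" -DCMAKE_BUILD_TYPE=Release &&
        cmake --build "$src/build" -j 4 &&
        cmake --install "$src/build" &&
        "$demo_prefix/bin/hello" ||
        echo "build failed, check the cmake output above"
else
    echo "cmake or a C++ compiler is missing here, skipping"
fi
```

Keeping the build in its own directory means you can delete it and start over without touching the source files.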

Example: Installing DIAMOND

DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data, and it is built using CMake.

cd ~/build

# download the source
wget https://github.com/bbuchfink/diamond/archive/refs/tags/v2.1.9.tar.gz
tar -xzf v2.1.9.tar.gz
cd diamond-2.1.9

# confirm it uses cmake
ls -l
# CMakeLists.txt  <-- yep

# create build dir
mkdir build && cd build

# create Makefile
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release

# run Makefile
make -j4

# test that it works
./diamond --version

# install it
make install

Making the tools available in your shell

Right now the binaries are in $PREFIX/bin/, but your shell doesn’t know to look there. You need to add it to your PATH:

# add the bin directory to PATH for this session
export PREFIX=~/sw
export PATH=$PREFIX/bin:$PATH

# test that it works
which samtools
samtools --version

which diamond
diamond --version
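If it is unclear what PATH actually does, this small experiment shows it without needing any installed software. It creates a fake tool in a temporary prefix; mytool and demo_prefix are invented names for the demo.

```shell
# a temporary prefix with a bin/ directory, mimicking $PREFIX/bin
demo_prefix=$(mktemp -d)
mkdir -p "$demo_prefix/bin"

# a tiny executable script standing in for a compiled tool
printf '#!/bin/sh\necho "hello from mytool"\n' > "$demo_prefix/bin/mytool"
chmod +x "$demo_prefix/bin/mytool"

# before: the shell has no idea where mytool lives
command -v mytool || echo "mytool: not found yet"

# after: put the bin directory first in PATH and try again
PATH="$demo_prefix/bin:$PATH"
command -v mytool       # now prints the full path to the script
mytool
```

The shell searches the directories in PATH from left to right and runs the first match, so putting your own bin directory first also lets your version take priority over a system-wide one.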

To make this permanent, add it to your ~/.bashrc:

# Run these commands in your terminal to update your `~/.bashrc` file and make sure
# to update the PREFIX to your actual prefix path
echo 'export PREFIX=~/sw' >> ~/.bashrc
echo 'export PATH=$PREFIX/bin:$PATH' >> ~/.bashrc

# reload the file to apply the changes in the current session
source ~/.bashrc
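One side effect of the plain export line is that every time ~/.bashrc is sourced again, the same directory gets added to PATH once more. It is harmless but makes PATH long and hard to read. A possible alternative for ~/.bashrc is a small guard like the one below; path_prepend is a hypothetical helper name, not a standard command.

```shell
# path_prepend DIR: put DIR first in PATH unless it is already in there
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already in PATH, leave it alone
        *) PATH="$1:$PATH" ;;       # not there yet, put it first
    esac
    export PATH
}

# falls back to the tutorial default if PREFIX is not set
path_prepend "${PREFIX:-$HOME/sw}/bin"
```

Wrapping PATH in colons before matching makes sure only whole entries match, so /home/me/sw/bin is not mistaken for /home/me/sw/bin2.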

If you also installed shared libraries, you may need to tell the dynamic linker where to find them:

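# add to ~/.bashrc alongside the PATH line
# (the ${VAR:+...} part only adds the colon if LD_LIBRARY_PATH already has a value)
echo 'export LD_LIBRARY_PATH=$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}' >> ~/.bashrc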

Summary

Build system generators are used to create Makefiles that will handle compilation of many .cpp files.

Autotools: the source directory will contain a configure or a configure.ac file.

# typical autotools workflow

autoreconf -fi  # only if configure is missing
./configure --prefix=$PREFIX
make -j4
make install

CMake: the source directory will contain a CMakeLists.txt file.

# typical cmake workflow

mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX -DCMAKE_BUILD_TYPE=Release
make -j4
make install

Extras

Here are a few things that are useful to know once you start doing this more regularly.

When libraries can’t be found

Sometimes ./configure or cmake will fail with something like:

configure: error: zlib development files not found

This means the development headers for the zlib library are not in the default search paths. On a cluster, you would look for and load a module:

module spider zlib
module load zlib/1.2.13

# then try configure again
./configure --prefix=$PREFIX

If the library is installed but in an unusual location, you can point configure to it using CPPFLAGS and LDFLAGS:

ZLIB_DIR=/some/non-standard/path

./configure --prefix=$PREFIX \
    CPPFLAGS="-I${ZLIB_DIR}/include" \
    LDFLAGS="-L${ZLIB_DIR}/lib"
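Before passing a directory to CPPFLAGS and LDFLAGS it helps to confirm that the header really is there. The sketch below checks a few candidate directories for include/zlib.h. The helper find_header_dir and the candidate list are assumptions for illustration; replace them with paths that exist on your system.

```shell
# find_header_dir HEADER DIR...: print the first DIR that contains include/HEADER
find_header_dir() {
    header=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/include/$header" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# candidate locations are guesses, adjust them to your system
if ZLIB_DIR=$(find_header_dir zlib.h /usr /usr/local "$HOME/sw"); then
    echo "zlib.h found under $ZLIB_DIR"
    echo "would run: ./configure --prefix=\$PREFIX CPPFLAGS=\"-I$ZLIB_DIR/include\" LDFLAGS=\"-L$ZLIB_DIR/lib\""
else
    echo "zlib.h not found in the candidate directories"
fi
```

On some systems the libraries live in lib64 instead of lib, so check which of the two exists before setting LDFLAGS.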

Keeping things organized

When you start installing many tools this way, it is easy to lose track of what is installed and where. A few conventions that help:

# one option: a shared prefix directory for all tools
/proj/your-project-id/software/

# another option: one directory per tool and version
/proj/your-project-id/software/bwa/0.7.18/
/proj/your-project-id/software/samtools/1.21/

The second approach makes it easy to have multiple versions available and switch between them, but requires updating PATH for each tool.
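With the per-tool layout, the PATH update can be wrapped in a small shell function so that switching version is a single command. This is a sketch under assumptions: use_tool and SOFTWARE_ROOT are hypothetical names, and each tool is assumed to have been installed with --prefix (or CMAKE_INSTALL_PREFIX) set to its own tool/version directory.

```shell
# root of the per-tool installs; the default is the placeholder path from above
SOFTWARE_ROOT="${SOFTWARE_ROOT:-/proj/your-project-id/software}"

# use_tool NAME VERSION: put NAME/VERSION/bin first in PATH if it exists
use_tool() {
    tool_bin="$SOFTWARE_ROOT/$1/$2/bin"
    if [ -d "$tool_bin" ]; then
        PATH="$tool_bin:$PATH"
        export PATH
        echo "using $1 $2 from $tool_bin"
    else
        echo "no such install: $tool_bin" >&2
        return 1
    fi
}
```

With the function defined in ~/.bashrc, running use_tool samtools 1.21 would make that version the one your shell finds first. Calling it again with another version puts the new one in front.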