Efficiency using seff
There are multiple tools that can help you use your HPC resources efficiently.
This page is about using your HPC resources efficiently
using the seff tool.
Here is the general strategy to use your HPC resources effectively:
Want to see a video?

Watch the YouTube video 'Obtain the CPU and memory usage of a job using seff' to see how to do so.
Watch the YouTube video 'Efficient HPC resource use, using Slurm and seff' to see the reasoning behind this strategy.
```mermaid
flowchart TD
  obtain_data[Obtain CPU and memory usage of a job]
  lower_limit_based_on_memory(Book enough memory)
  limited_by_cpu(For that amount of cores, would runtime be limited by CPU?)
  lower_limit_based_on_cpu(Increase the number of cores, so that on average, the right amount of CPUs is booked)
  done(Use that amount of cores)
  add_one(Increase the number of cores by one for safety)
  obtain_data --> lower_limit_based_on_memory
  lower_limit_based_on_memory --> limited_by_cpu
  limited_by_cpu --> |no| add_one
  limited_by_cpu --> |yes| lower_limit_based_on_cpu
  lower_limit_based_on_cpu --> done
  add_one --> done
```
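The decision steps above can be sketched as a small calculation. This is a sketch only: the function name is hypothetical, and the round-up and plus-one safety margin follow the flowchart's rules, not anything seff itself outputs.

```python
import math

def recommend_cores(booked_cores, mem_efficiency_pct, cpu_efficiency_pct):
    """Sketch of the flowchart: recommend a core count from seff percentages."""
    # Book enough memory: scale the booked cores down to what memory needs,
    # rounding up to a whole core.
    mem_cores = math.ceil(booked_cores * mem_efficiency_pct / 100)
    # For that amount of cores, would runtime be limited by CPU?
    cpu_cores = math.ceil(booked_cores * cpu_efficiency_pct / 100)
    if cpu_cores > mem_cores:
        # Yes: book enough cores so that, on average, the right amount
        # of CPUs is booked.
        return cpu_cores
    # No: increase the number of cores by one for safety.
    return mem_cores + 1
```

For instance, with the numbers from the seff output shown further down (160 booked cores, 3.17% memory efficiency, 0.00% CPU efficiency), this sketch recommends 7 cores.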
Why not look at CPU usage first?
Because CPU use is more flexible than memory use.
For example, imagine a job with a short CPU spike that can be processed by 16 CPUs. If 1 core has enough memory, book only 1 core: the CPU spike is spread out into 100% CPU use (of that one core) over a longer duration.
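As a rough illustration of why this works (the numbers here are made up for the example): the total amount of CPU work stays the same, only its duration changes.

```python
# Hypothetical numbers: a spike needing 16 cores for 60 seconds
spike_cores = 16
spike_seconds = 60
core_seconds = spike_cores * spike_seconds  # 960 core-seconds of work

# On 1 core at 100% use, the same work takes 960 seconds (16 minutes)
seconds_on_one_core = core_seconds / 1
print(seconds_on_one_core)  # 960.0
```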
To obtain the CPU and memory usage of a job, run `seff` followed by the job ID:

`seff [job_id]`

For example:

`seff 12696175`

This will produce output such as this:
```
Job ID: 12696175
Cluster: dardel
User/Group: aletyner/aletyner
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 160
CPU Utilized: 00:00:03
CPU Efficiency: 0.00% of 1-23:22:40 core-walltime
Job Wall-clock time: 00:17:46
Memory Utilized: 4.35 GB
Memory Efficiency: 3.17% of 137.19 GB (878.00 MB/core)
The task which had the largest memory consumption differs by 102.24% from the average task max memory consumption
```
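To pull the relevant numbers out of such output programmatically, one could use a sketch like the following. It assumes the exact field labels shown above; seff's output format may differ between versions, so treat the patterns as assumptions.

```python
import re

# Example seff output (abridged), with the field labels shown above
seff_output = """Nodes: 1
Cores per node: 160
CPU Efficiency: 0.00% of 1-23:22:40 core-walltime
Memory Efficiency: 3.17% of 137.19 GB (878.00 MB/core)"""

def parse_seff(text):
    """Extract total core count and efficiency percentages from seff output."""
    nodes = int(re.search(r"Nodes: (\d+)", text).group(1))
    cores = int(re.search(r"Cores per node: (\d+)", text).group(1))
    cpu_eff = float(re.search(r"CPU Efficiency: ([\d.]+)%", text).group(1))
    mem_eff = float(re.search(r"Memory Efficiency: ([\d.]+)%", text).group(1))
    return nodes * cores, cpu_eff, mem_eff

print(parse_seff(seff_output))  # (160, 0.0, 3.17)
```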
Sometimes, however, using resources inefficiently is unavoidable.
Examples
Here are some examples of how inefficient jobs can look and what you can do to make them more efficient.
Inefficient job example 1: booking too many cores
```
Job ID: 12696175
Cluster: dardel
User/Group: aletyner/aletyner
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 160
CPU Utilized: 00:00:03
CPU Efficiency: 0.00% of 1-23:22:40 core-walltime
Job Wall-clock time: 00:17:46
Memory Utilized: 4.35 GB
Memory Efficiency: 3.17% of 137.19 GB (878.00 MB/core)
The task which had the largest memory consumption differs by 102.24% from the average task max memory consumption
```
For this job, booking 7 cores is considered okay. Following the strategy above:
Book enough memory

In this job, only 3.17% of the booked memory was used. 3.17% of the 160 scheduled cores is 5.072 cores; rounding up, this gives 6 cores.
For that amount of cores, would runtime be limited by CPU?

The answer is 'no': we see a CPU efficiency of 0.00% (i.e. 0.0049% or lower). Hence, reducing the number of cores to 6 will still provide enough CPU capacity.
Increase the number of cores by one for safety

This means booking 6 + 1 = 7 cores is recommended.
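The arithmetic of this example can be verified directly. This is a sketch: the round-up and the plus-one safety margin are the strategy's rules, not values reported by seff.

```python
import math

booked_cores = 160
mem_efficiency_pct = 3.17  # 'Memory Efficiency' from the seff output

# Book enough memory: 3.17% of 160 cores = 5.072, rounded up to 6 cores
mem_cores = math.ceil(booked_cores * mem_efficiency_pct / 100)

# CPU efficiency is 0.00%, so CPU does not limit runtime at 6 cores;
# increase the number of cores by one for safety
recommended = mem_cores + 1
print(recommended)  # 7
```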