Automated Benchmarking using JUBE

From HPC Wiki
Revision as of 19:55, 24 June 2025 by Marc-andre-hermanns-bc32@rwth-aachen.de (Adding output parsing and result table)

Tutorial
Title: Benchmarking & Scaling
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Online
Topic Area: Performance Analysis
License: CC-BY-SA
Syllabus

1. Introduction & Theory
2. Interactive Manual Benchmarking
3. Automated Benchmarking using a Job Script
4. Automated Benchmarking using JUBE
5. Plotting & Interpreting Results

Introduction

The Jülich Benchmarking Environment (JUBE) is an application that helps you automate your workflow for system and application benchmarking.

JUBE allows you to define different steps of your workflow with dependencies between them.

One key advantage of JUBE over manually running an application in different configurations from a job script is that each run configuration is automatically placed in its own workpackage with its own run directory, while common files and directories (input files, preprocessing results, etc.) can easily be integrated into the workflow.

Furthermore, application output (such as the runtime of the application) can easily be parsed and reported as CSV or as a human-readable table.

Writing a minimal configuration

JUBE executes each workpackage (a step with one concrete parameter configuration) in its own sandbox directory, so the benchmark configuration must specify a fileset that copies or links the required files into that run directory. Each parameter has a separator (',' by default) that is used to tokenize the parameter string, and each token becomes part of a separate configuration. In this example, the comma-separated list in the parameter tasks contains 15 values and therefore results in 15 workpackages, each with one specific value of tasks.

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="GROMACS" outpath="bench_run">
    <comment>A minimal JUBE config to run our GROMACS example</comment>

    <!-- Configuration -->
    <parameterset name="execute_pset">
      <parameter name="tasks">1,2,4,8,12,18,24,30,36,42,48,54,60,66,72</parameter>
      <!-- you could also compute the list using Python
         <parameter name="tasks" mode="python">
           ",".join([str(x) for x in range(1,73) if (x % 6 == 0 and x > 10) or (x % 4 == 0 and x < 10) or x == 2 or x==1 ])
         </parameter>
      -->
    </parameterset>

    <!-- Input files -->
    <fileset name="gromacs_files">
      <link>MD_5NM_WATER.deff</link> <!-- link input file -->
    </fileset>

    <!-- Operation -->
    <step name="run">
      <use>execute_pset</use>  <!-- use parameterset -->
      <use>gromacs_files</use> <!-- use fileset -->
      <do>srun -n $tasks gmx_mpi -quiet mdrun -deffnm MD_5NM_WATER -nsteps 10000 -ntomp 1 -pin on</do> <!-- start GROMACS -->
    </step>
  </benchmark>
</jube>

This configuration already creates a separate directory for each measurement, which ensures that temporary files written by the application do not interfere across measurements.
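The tokenization of the tasks parameter can be illustrated with a few lines of plain Python. This is only a sketch of the behaviour described above, not JUBE's implementation; the helper name `expand_parameter` is hypothetical. It also checks that the Python expression from the commented-out parameter produces the same list as the explicit one.

```python
# Sketch of how a separator-tokenized parameter string expands into
# one value per workpackage. This mimics the described behaviour and
# is NOT JUBE's actual implementation; expand_parameter is a made-up name.

def expand_parameter(value, separator=","):
    """Split a parameter string into its individual tokens."""
    return [token.strip() for token in value.split(separator)]

# Explicit list, as written in the configuration.
explicit = expand_parameter("1,2,4,8,12,18,24,30,36,42,48,54,60,66,72")

# Same list, generated by the expression from the mode="python" variant.
computed = expand_parameter(
    ",".join(
        str(x)
        for x in range(1, 73)
        if (x % 6 == 0 and x > 10) or (x % 4 == 0 and x < 10) or x == 2 or x == 1
    )
)

print(len(explicit))         # number of workpackages for the "run" step
print(explicit == computed)  # both variants describe the same task counts
```

Each of the resulting tokens corresponds to one workpackage of the step that uses the parameterset.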

Parsing output

GROMACS writes its performance numbers to `stderr`. Because extracting such values from program output is a core part of benchmarking, JUBE provides infrastructure to parse output files, store the matched values, and report them later in result tables.

    <patternset name="gromacs_output_patterns">
        <!-- process and thread counts are plain numbers without a unit -->
        <pattern name="gromacs_num_procs">Using ${jube_pat_int} MPI proc.*</pattern>
        <pattern name="gromacs_num_threads">Using ${jube_pat_int} OpenMP thread.*</pattern>
        <!-- "Time:" line: first number is core time, second is wall time -->
        <pattern name="gromacs_core_time" unit="s">Time:\s*${jube_pat_fp}</pattern>
        <pattern name="gromacs_wall_time" unit="s">Time:\s*${jube_pat_nfp}\s*${jube_pat_fp}</pattern>
        <!-- "Performance:" line: first number is ns/day, second is hours/ns -->
        <pattern name="gromacs_core_perf" unit="ns/day">Performance:\s*${jube_pat_fp}</pattern>
        <pattern name="gromacs_wall_perf" unit="hours/ns">Performance:\s*${jube_pat_nfp}\s*${jube_pat_fp}</pattern>
    </patternset>

Pattern matching is line-based and uses regular expressions. JUBE provides predefined variables such as ${jube_pat_int} and ${jube_pat_fp}, which contain the regular expression for capturing an integer or a floating-point number, respectively. The variant ${jube_pat_nfp} matches a floating-point number without capturing it, which makes it possible to skip the first number on a line and capture the second. The defined patterns are then used in a so-called analyser, which connects the patterns to the files they are applied to.

    <analyser name="gromacs_analyser">
        <analyse step="run">
            <file use="gromacs_output_patterns">stderr</file>
        </analyse>
    </analyser>
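The line-based matching can be sketched in plain Python with the `re` module. The three regular expressions below are simplified stand-ins for JUBE's predefined variables (an assumption; JUBE's real definitions may differ in detail), the sample text is an illustrative excerpt whose layout is assumed, and keeping the first match per pattern is a choice made for this sketch.

```python
import re

# Simplified stand-ins for JUBE's predefined pattern variables.
# Assumption: the real definitions shipped with JUBE may differ in detail.
JUBE_PAT_INT = r"([+-]?\d+)"                            # capturing integer
JUBE_PAT_FP = r"([+-]?\d*\.?\d+(?:[eE][-+]?\d+)?)"      # capturing float
JUBE_PAT_NFP = r"(?:[+-]?\d*\.?\d+(?:[eE][-+]?\d+)?)"   # non-capturing float

PATTERNS = {
    "gromacs_num_procs": rf"Using {JUBE_PAT_INT} MPI proc.*",
    "gromacs_core_time": rf"Time:\s*{JUBE_PAT_FP}",
    # Skip the first number (core time) and capture the second (wall time).
    "gromacs_wall_time": rf"Time:\s*{JUBE_PAT_NFP}\s*{JUBE_PAT_FP}",
}

# Illustrative stderr excerpt (layout assumed, not copied from a real run).
SAMPLE_STDERR = """\
Using 2 MPI processes
               Core t (s)   Wall t (s)        (%)
       Time:       46.942       23.471      200.0
"""

def analyse(text, patterns):
    """Search every line for every pattern; keep the first captured value."""
    results = {}
    for line in text.splitlines():
        for name, regex in patterns.items():
            match = re.search(regex, line)
            if match and name not in results:
                results[name] = match.group(1)
    return results

values = analyse(SAMPLE_STDERR, PATTERNS)
print(values)
```

The non-capturing variant is what lets two patterns share the same `Time:` prefix while extracting different columns of that line.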

Finally, result tables can be defined with columns referencing any defined parameter or pattern.

    <result>
        <use>gromacs_analyser</use>
        <table name="gromacs_run" style="pretty">
            <column title="wp">jube_wp_id</column>
            <column>tasks</column> <!-- parameter from execute_pset -->
            <column>gromacs_core_time</column>
            <column>gromacs_wall_time</column>
            <column>gromacs_core_perf</column>
            <column>gromacs_wall_perf</column>
        </table>
    </result>

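Running the analysis and printing the result table produces the following output. In this listing the two performance columns repeat the values of the time columns, which indicates that the performance patterns matched the Time: line when the run was recorded; patterns anchored on the Performance: line of the GROMACS output report ns/day and hours/ns instead. The empty row for 66 tasks means that none of the patterns matched in that run's stderr.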

$ jube result -a bench_run --id <jube_run_id>
gromacs_run:
| wp | tasks | gromacs_core_time[s] | gromacs_wall_time[s] | gromacs_core_perf[ns/day] | gromacs_wall_perf[hours/ns] |
|----|-------|----------------------|----------------------|---------------------------|-----------------------------|
|  0 |     1 |               44.366 |               44.366 |                    44.366 |                      44.366 |
|  1 |     2 |               46.942 |               23.471 |                    46.942 |                      23.471 |
|  2 |     4 |               49.548 |               12.387 |                    49.548 |                      12.387 |
|  3 |     8 |               52.969 |                6.621 |                    52.969 |                       6.621 |
|  4 |    12 |               59.370 |                4.948 |                    59.370 |                       4.948 |
|  5 |    18 |               66.097 |                3.672 |                    66.097 |                       3.672 |
|  6 |    24 |               76.391 |                3.183 |                    76.391 |                       3.183 |
|  7 |    30 |               89.233 |                2.975 |                    89.233 |                       2.975 |
|  8 |    36 |               91.187 |                2.533 |                    91.187 |                       2.533 |
|  9 |    42 |               99.743 |                2.375 |                    99.743 |                       2.375 |
| 10 |    48 |              183.114 |                3.815 |                   183.114 |                       3.815 |
| 11 |    54 |              121.728 |                2.255 |                   121.728 |                       2.255 |
| 12 |    60 |              199.882 |                3.332 |                   199.882 |                       3.332 |
| 13 |    66 |                      |                      |                           |                             |
| 14 |    72 |              116.555 |                1.619 |                   116.555 |                       1.619 |
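
As a small illustration of what can be done with these numbers, the following sketch computes speedup and parallel efficiency from the wall times in the table. It assumes the usual definitions (speedup is the wall time of the smallest run divided by the wall time of the current run, efficiency is speedup divided by the relative increase in tasks); the function name `scaling` is hypothetical, and the 66-task run is left out because its row is empty.

```python
# Wall times in seconds per task count, copied from the result table.
# The 66-task run is omitted because its row contains no values.
wall_time = {
    1: 44.366, 2: 23.471, 4: 12.387, 8: 6.621, 12: 4.948, 18: 3.672,
    24: 3.183, 30: 2.975, 36: 2.533, 42: 2.375, 48: 3.815, 54: 2.255,
    60: 3.332, 72: 1.619,
}

def scaling(times):
    """Return {tasks: (speedup, efficiency)} relative to the smallest task count."""
    base_tasks = min(times)
    base_time = times[base_tasks]
    result = {}
    for tasks, seconds in sorted(times.items()):
        speedup = base_time / seconds
        efficiency = speedup / (tasks / base_tasks)
        result[tasks] = (speedup, efficiency)
    return result

for tasks, (speedup, efficiency) in scaling(wall_time).items():
    print(f"{tasks:3d} tasks: speedup {speedup:6.2f}, efficiency {efficiency:6.1%}")
```

Runs whose wall time is higher than that of a smaller run, such as the 48-task and 60-task measurements, show up as a drop in speedup and are worth repeating before drawing conclusions.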


Further information


Next: Plotting and Interpreting Results

Previous: Automated Benchmarking using a Job Script