Version: AMD_ZEN_HPCG_2024-10-07
Description:
The High-Performance Conjugate Gradients (HPCG) Benchmark project was created as a new metric for ranking HPC systems. It is based on a preconditioned conjugate gradient method that solves Ax = b, where A is a sparse square matrix. The preconditioner is a multigrid V-cycle iteration, smoothed by a forward and backward Gauss-Seidel sweep.
This version is derived from The High-Performance Conjugate Gradients (HPCG) Benchmark (Revision: 3.1 - Date: March 28, 2019) and has been optimized to run on AMD EPYC CPUs.
Dependencies:
- OpenMPI 4/5: The binaries were built against OpenMPI-5.0.3 and should run without issue if OpenMPI 5 or 4 is in the environment.
- The binary was built on Red Hat® Enterprise Linux® 8.9 and tested on Red Hat® Enterprise Linux® 9 and Ubuntu Linux 22.04.
Recommended Settings:
- Boost: ON
- Transparent Hugepages: always
- SMT: OFF
- NPS: 4
- Determinism: Power
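The OS-visible settings above can be spot-checked from a shell before a run. The following is a minimal sketch, assuming a Linux kernel that exposes the standard sysfs files for transparent hugepages and SMT. The helper name `thp_mode` is hypothetical. Boost, NPS, and Determinism are platform firmware options that this sketch does not inspect directly; the NUMA node count reported by `lscpu` only hints at the NPS setting.

```shell
#!/bin/sh
# Extract the active mode from the transparent hugepage sysfs string,
# e.g. "[always] madvise never" -> "always".
# (thp_mode is a hypothetical helper, not part of the AMD package.)
thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

THP_FILE=/sys/kernel/mm/transparent_hugepage/enabled
SMT_FILE=/sys/devices/system/cpu/smt/active

# Report the current THP mode; the recommended value is "always".
if [ -r "$THP_FILE" ]; then
    echo "Transparent Hugepages: $(thp_mode "$(cat "$THP_FILE")")"
fi

# Report the SMT state; 0 means SMT is off, which is the recommended setting.
if [ -r "$SMT_FILE" ]; then
    echo "SMT active: $(cat "$SMT_FILE")"
fi

# With NPS=4 on a dual-socket system, 8 NUMA nodes would typically be expected.
if command -v lscpu >/dev/null 2>&1; then
    lscpu | grep -i 'NUMA node(s)' || true
fi
```

The script only reads state and changes nothing, so it can be run without elevated privileges.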
How to Run:
- Ensure OpenMPI is installed and loaded in your environment.
- Place the supplied hpcg.dat file in the same directory as the AMD Zen HPCG binaries. Modify hpcg.dat as per your requirement.
  - By default, hpcg.dat will define a very small problem, where the 2nd line represents values of nx, ny, and nz, respectively, and the 3rd line represents the runtime. To ensure valid benchmark runs, the problem size should be chosen such that the benchmark utilizes at least 1/4th of the total available main memory, and the runtime should be a minimum of 1800 seconds.
  - Alternatively, you may pass these arguments in the command line.
    Example:
    --nx=<value> --ny=<value> --nz=<value> --rt=<value>
    Note: These parameters will override the values set in hpcg.dat.
- Example Run Command for Single Node
  - For a short run on AMD 3rd Generation EPYC™ CPU, Dual Socket with 64 Cores/socket and 512 GB RAM
    mpirun -np 32 --bind-to core --map-by ppr:2:l3cache:pe=4 -x OMP_NUM_THREADS=4 -x OMP_PROC_BIND=true -x OMP_PLACES=cores ./amd_hpcg --nx=192 --ny=192 --nz=192 --rt=60
  - For a short run on AMD 4th Generation EPYC™ CPU, Dual Socket with 96 Cores/socket and 1.5 TB RAM
    mpirun -np 96 --bind-to core --map-by ppr:4:l3cache:pe=2 -x OMP_NUM_THREADS=2 -x OMP_PROC_BIND=true -x OMP_PLACES=cores ./amd_hpcg --nx=192 --ny=192 --nz=192 --rt=60
  - For a short run on AMD 5th Generation EPYC™ CPU, Dual Socket with 128 Cores/socket and 1.5 TB RAM
    mpirun -np 128 --bind-to core --map-by ppr:4:l3cache:pe=2 -x OMP_NUM_THREADS=2 -x OMP_PROC_BIND=true -x OMP_PLACES=cores ./amd_hpcg --nx=192 --ny=192 --nz=192 --rt=60
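In each example command, the rank count passed to -np equals the total core count divided by the cores assigned to each rank (pe), and OMP_NUM_THREADS matches pe. The sketch below reproduces that arithmetic and also computes the 1/4-of-memory target mentioned for valid runs. It is an illustration only: the helper names `ranks`, `min_mem_gib`, and `hpcg_cmd` are hypothetical and not part of the AMD package, MemTotal from /proc/meminfo is assumed to be an acceptable stand-in for "total available main memory", and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from the AMD package.

# MPI ranks when every core is used and each rank owns PE cores.
# Assumes the total core count is divisible by PE.
ranks() {  # usage: ranks SOCKETS CORES_PER_SOCKET PE
    echo $(( $1 * $2 / $3 ))
}

# Memory (in GiB) that a valid run should at least use: 1/4 of total memory.
# Takes MemTotal in kB, as reported by /proc/meminfo.
min_mem_gib() {  # usage: min_mem_gib MEMTOTAL_KB
    echo $(( $1 / 4 / 1024 / 1024 ))
}

# Print (not execute) an mpirun line following the pattern of the examples.
hpcg_cmd() {  # usage: hpcg_cmd SOCKETS CORES_PER_SOCKET RANKS_PER_L3 PE N RT
    np=$(ranks "$1" "$2" "$4")
    printf '%s %s %s\n' \
        "mpirun -np $np --bind-to core --map-by ppr:$3:l3cache:pe=$4" \
        "-x OMP_NUM_THREADS=$4 -x OMP_PROC_BIND=true -x OMP_PLACES=cores" \
        "./amd_hpcg --nx=$5 --ny=$5 --nz=$5 --rt=$6"
}

# Reproduce the 4th Generation example:
# 2 sockets x 96 cores, 4 ranks per L3 cache, 2 cores per rank.
hpcg_cmd 2 96 4 2 192 60

# Report the memory target on the current machine, if /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "Memory target for a valid run: at least $(min_mem_gib "$total_kb") GiB"
fi
```

For a valid run rather than a short one, the last argument would be 1800 or more, and the grid dimensions would be increased until the memory use of the benchmark reaches the printed target.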