HPCG
Introduction
The High-Performance Conjugate Gradients (HPCG) Benchmark project is an effort to create a new metric for ranking HPC systems. HPCG is intended as a complement to the High Performance LINPACK (HPL) benchmark, currently used to rank the TOP500 computing systems. The computational and data access patterns of HPL are still representative of some important scalable applications, but not all. HPCG is designed to exercise computational and data access patterns that more closely match a different and broad set of important applications, and to give incentive to computer system designers to invest in capabilities that will have impact on the collective performance of these applications.
Official Website: https://www.hpcg-benchmark.org/
For the best benchmark scores on AMD Zen architectures, we recommend using the AMD Zen HPCG binaries, which are optimized for EPYC platforms. For further details, refer to AMD Zen HPCG.
Build HPCG using Spack
Please refer to Getting Started with Spack using AMD Zen Software Studio for instructions on setting up Spack.
# Example for building HPCG with AOCC
$ spack install hpcg %aocc +openmp ^openmpi fabrics=cma,ucx
Explanation of the command options:
Symbol | Meaning |
---|---|
%aocc | Build HPCG with the AOCC compiler. |
+openmp | Enable OpenMP support in HPCG. |
^openmpi fabrics=cma,ucx | Use Open MPI as the MPI provider, with CMA for efficient intra-node communication and the UCX fabric as a fallback when required. Note: where possible, explicitly set the fabric appropriate to the host system; refer to Open MPI with AMD Zen Software Studio for more guidance. |
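As an optional check before installing, Spack can print the fully concretized spec so the selected compiler, MPI provider, and variants can be reviewed. This sketch uses the standard spack spec command (-I shows install status, -l shows dependency hashes):
# Optional: preview the concretized spec before installing
$ spack spec -I -l hpcg %aocc +openmp ^openmpi fabrics=cma,ucx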
Configuring HPCG Run Parameters
The HPCG run parameters can be configured through the hpcg.dat file or via command-line arguments.
To produce results that comply with the rules for valid official HPCG runs (e.g. for leaderboard submissions), HPCG runs must be configured to meet the following criteria:
- Problem size - (Line 3) This is the size of the local (per rank) 3D matrix, so for a fixed problem size the total memory usage still scales with the number of MPI ranks. A valid run must use a problem size large enough that the data arrays accessed in the CG iteration loop do not fit entirely in the CPU caches; if they did, DRAM latency and bandwidth would not influence the result. The HPCG guidelines state that the problem size should be large enough to occupy at least 25% of main memory (a quick way to estimate this target is sketched after this list).
- Run time - (Line 4) HPCG can be run in just a few minutes from start to finish. Our testing suggests that 60 seconds is enough to produce consistent results; however, official runs must report a run time of at least 1800 seconds (30 minutes) in the output file.
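As a quick aid for sizing against the 25% guideline above, the total system memory can be read from the operating system and a minimum working-set target derived from it. This is an illustrative sketch only (it is not part of HPCG) and assumes a Linux host with the standard free and awk utilities:
# Illustrative only: derive the minimum HPCG working-set target from the 25% guideline
TOTAL_GIB=$(free -g | awk '/^Mem:/ {print $2}')
echo "HPCG data arrays should occupy at least $(( TOTAL_GIB / 4 )) GiB across all ranks on this node"
The memory actually used for a given problem size is reported in the HPCG output file, so the chosen dimensions can be checked against this target after a short trial run.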
Sample hpcg.dat file for a dual-socket AMD 5th Gen EPYC™ 9755 processor system with 256 (2 x 128) cores and 512 GB of memory.
hpcg.dat
HPCG benchmark input file
Comment line - may contain any useful string
192 192 192 # dimensions of local (per rank) 3D matrix
1800 # Minimum runtime in seconds
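If editing hpcg.dat is inconvenient, the same parameters can also be passed on the command line, as noted above. The option names below are those used by the reference HPCG 3.1 sources and may differ in vendor-optimized binaries, so verify them against the documentation for your build:
# Equivalent command-line configuration: 192^3 local grid per rank, 1800 s minimum runtime
# (append these arguments to the xhpcg invocation shown in the run script below)
xhpcg --nx=192 --ny=192 --nz=192 --rt=1800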
Running HPCG
HPCG is run using mpirun in the standard way for an MPI or MPI+OpenMP application. If you have built HPCG with OpenMP support (recommended for best performance), launch it with a process mapping that keeps the OpenMP threads of each MPI rank within a single CCX (on an AMD EPYC™ CPU, a CCX is a group of cores that share an L3 cache), as threads spanning CCXs will degrade performance. Systems should also be configured with SMT (hardware multithreading) switched off.
With Open MPI this is achieved using the --map-by option. For example, to run HPCG with two OpenMP threads per rank on an AMD EPYC™ Gen 2 or later CPU:
Run Script for AMD EPYC™ Processors
#!/bin/bash
# Loading HPCG built with AOCC
spack load hpcg %aocc
# Number of cores sharing an L3 cache (a CCX): 8 for most EPYC Gen 2-5 CPUs, 4 for EPYC Gen 1
# For frequency optimised "F-parts", check documentation
CORES_PER_L3CACHE=8
NUM_CORES=$(nproc)
# OpenMP Settings
export OMP_PROC_BIND=true
export OMP_PLACES=cores
export OMP_NUM_THREADS=2
# MPI settings
MPI_RANKS=$(( $NUM_CORES / $OMP_NUM_THREADS ))
RANKS_PER_L3CACHE=$(( $CORES_PER_L3CACHE / $OMP_NUM_THREADS ))
MPI_OPTS="-np $MPI_RANKS --bind-to core --map-by ppr:$RANKS_PER_L3CACHE:l3cache:pe=$OMP_NUM_THREADS"
# Run HPCG
mpirun $MPI_OPTS xhpcg
Note: Users should update the value of CORES_PER_L3CACHE to match that of the CPU they are using, for example CORES_PER_L3CACHE=4 for AMD EPYC™ Gen 1 (Naples) CPUs. Users of Frequency Optimized AMD EPYC™ CPUs ("F parts") should refer to product documentation to find the appropriate value for their specific CPU SKU.
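Rather than looking the value up per SKU, CORES_PER_L3CACHE can also be confirmed directly on the host by asking the kernel which CPUs share cpu0's level-3 cache. This is a sketch that assumes the standard Linux sysfs layout (cache index3 is normally the L3); with SMT enabled the list contains hardware threads, so halve the count to get physical cores per CCX:
# Sketch: list the CPUs that share cpu0's L3 cache (index3 is typically the L3 on Linux)
# e.g. "0-7" means 8 cores per CCX; with SMT on, the SMT siblings also appear in the list
cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list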
Note: The above build and run steps were tested with HPCG 3.1, AOCC 5.0.0, and Open MPI 5.0.5 on Red Hat Enterprise Linux release 8.9 (Ootpa) using Spack v0.23.0.dev0 (commit id: 2da812cbad).
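After a successful run, HPCG writes its results to a text file in the working directory, and the official GFLOP/s rating appears in the Final Summary section of that file. A quick way to extract it (the file name pattern and exact summary wording may vary between HPCG versions and vendor builds):
# Print the reported rating from the HPCG output file (name pattern may vary by version)
grep "GFLOP/s rating" HPCG-Benchmark*.txt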
For technical support on the tools, benchmarks and applications that AMD offers on this page and related inquiries, reach out to us at toolchainsupport@amd.com.