Introduction

CloverLeaf® is a mini-app that solves the compressible Euler equations on a Cartesian grid, using an explicit, second-order accurate method. Each cell stores three values: energy, density, and pressure. A velocity vector is stored at each cell corner. This arrangement of data, with some quantities at cell centers and others at cell corners, is known as a staggered grid.

Official website for CloverLeaf: https://uk-mac.github.io/CloverLeaf/

The build and run instructions on this page refer to the CloverLeaf-ref version, which contains a hybrid OpenMP/MPI implementation.

Official website for CloverLeaf-ref: https://github.com/UK-MAC/CloverLeaf_ref

Build CloverLeaf-ref using Spack

Please refer to this link for getting started with Spack using AMD Zen Software Studio.
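
If Spack is not yet set up on the system, a minimal sketch of the bootstrap steps is shown below; it assumes a fresh clone of upstream Spack and that AOCC is already installed on the host so that Spack can detect it.

    # Clone Spack and activate it in the current shell
    git clone https://github.com/spack/spack.git
    . spack/share/spack/setup-env.sh

    # Detect compilers installed on the host (AOCC should appear in the list if installed)
    spack compiler find
    spack compiler list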

    # Example for building CloverLeaf-ref v1.3 (hybrid OpenMP + MPI implementation) using AOCC
    $ spack install cloverleaf-ref@1.3 %aocc ^openmpi fabrics=cma,ucx

Explanation of the command options

Symbol                      Meaning
------                      -------

%aocc                       Build CloverLeaf-ref with the AOCC compiler.

@1.3                        Build version 1.3 of CloverLeaf-ref. The Spack recipe
                            (package.py) provides two versions (1.1 and 1.3) as well
                            as an up-to-date master commit (#0fdb917).

^openmpi fabrics=cma,ucx    Use Open MPI as the MPI provider, with CMA (Cross Memory
                            Attach) for efficient intra-node communication, falling
                            back to the UCX fabric if required.

Note: Where possible, explicitly set the fabric appropriate to the host system. Refer to Open MPI with AMD Zen Software Studio for more guidance.
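
As an illustrative check (not required for the build), the transports that Open MPI was actually built with can be listed with ompi_info once the package is installed, and a single fabric can be selected explicitly in the Spack spec; the example below assumes only one Open MPI installation is present in the Spack environment.

    # Inspect the transport components available in the installed Open MPI
    spack load openmpi
    ompi_info | grep -E "btl|mtl|pml"

    # Example of selecting a single fabric explicitly (adjust to the host interconnect)
    spack install cloverleaf-ref@1.3 %aocc ^openmpi fabrics=ucx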


Running CloverLeaf-ref

CloverLeaf-ref expects a file called clover.in in the working directory. Sample input files are provided with the source distribution in the InputDecks directory. To run CloverLeaf-ref, copy one of the files from InputDecks into the directory where the clover_leaf executable will be invoked and rename it to clover.in.
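
For example, preparing a run directory might look like the sketch below; the deck clover_bm16.in is one of the samples shipped in InputDecks, and the source path is a placeholder for wherever the CloverLeaf-ref sources were downloaded.

    # Create a run directory and copy a sample input deck as clover.in
    mkdir -p ~/cloverleaf_run
    cd ~/cloverleaf_run
    cp /path/to/CloverLeaf_ref/InputDecks/clover_bm16.in ./clover.in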


Note: Different AMD EPYC™ processors have different numbers of cores per L3 cache (CCX) and may require a different MPI/OpenMP layout. The example below has been tested on a dual-socket AMD 5th Gen EPYC™ 9755 processor with 256 (2 x 128) cores.
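
One way to confirm the core-to-L3 mapping on a given node is to query the Linux sysfs cache topology or lscpu, as in the sketch below (output formats vary by distribution and kernel).

    # Show the level of each cache index of CPU 0 and the CPUs sharing it
    grep . /sys/devices/system/cpu/cpu0/cache/index*/level \
           /sys/devices/system/cpu/cpu0/cache/index*/shared_cpu_list

    # Summary of sockets, cores, and L3 cache reported by lscpu
    lscpu | grep -E "Socket\(s\)|Core\(s\) per socket|L3 cache"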

Run Script for AMD EPYC™ Processors

    #!/bin/bash
    # Load the CloverLeaf-ref v1.3 build
    spack load cloverleaf-ref@1.3 %aocc

    # Cores per L3 cache (or CCX) is 8 for most EPYC Gen 2-5 CPUs, 4 for EPYC Gen 1
    # For frequency-optimised "F-parts", check the product documentation
    CORES_PER_L3CACHE=8
    NUM_CORES=$(nproc)

    # OpenMP settings
    export OMP_NUM_THREADS=${CORES_PER_L3CACHE}  # 8 threads per MPI rank; recommended OMP_NUM_THREADS = cores per L3 cache
    export OMP_PROC_BIND=TRUE     # bind threads to specific resources
    export OMP_PLACES="cores"     # bind threads to cores

    # MPI settings suggested for a dual-socket AMD 5th Gen EPYC™ 9755 processor with 256 (2 x 128) cores
    MPI_RANKS=$(( NUM_CORES / OMP_NUM_THREADS ))
    RANKS_PER_L3CACHE=$(( CORES_PER_L3CACHE / OMP_NUM_THREADS ))
    MPI_OPTS="-np $MPI_RANKS --map-by ppr:$RANKS_PER_L3CACHE:l3cache:pe=$OMP_NUM_THREADS"

    # Run CloverLeaf-ref
    mpirun $MPI_OPTS clover_leaf
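
To verify that ranks and threads land where intended, Open MPI's --report-bindings option can be added to the mpirun line as a diagnostic; this is a sketch for checking placement, not part of the benchmark run itself.

    # Print the binding of each MPI rank before the application output
    mpirun $MPI_OPTS --report-bindings clover_leaf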

Note: The above build and run steps were tested with CloverLeaf-ref 1.3, AOCC 5.0.0, and Open MPI 5.0.5 on Red Hat Enterprise Linux release 8.9 (Ootpa) using Spack v0.23.0.dev0 (commit id: 2da812cbad).

For technical support on the tools, benchmarks, and applications that AMD offers on this page, and for related inquiries, reach out to us at toolchainsupport@amd.com.