Introduction
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD was the first application able to perform a full all-atom simulation of a virus in 2006, and in 2012 it performed a molecular dynamics flexible fitting simulation of the HIV virus capsid in its tubular form.
Official website for NAMD: http://www.ks.uiuc.edu/Research/namd/
NAMD versions 2.15a2 and later have AVXTiles support.
Getting NAMD Source Files
Spack does not currently support automatically downloading the NAMD source tar files. Please refer to the NAMD download page to download the source tar files manually. After downloading, store them in the Spack parent directory.
For NAMD version 2.15alpha*, rename the downloaded source tar file to match the naming convention used in the Spack recipe, as shown in the example below.
- Spack expects a file name like "NAMD_2.15a2_Source.tar.gz" instead of the default downloaded file name "NAMD_2.15alpha2_Source-AVX512.tar.gz"
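For example, a simple rename is sufficient (the file names below are the ones from the bullet above; adjust them to the version you actually downloaded):
# Rename the downloaded tar file to the name the Spack recipe expects
$ mv NAMD_2.15alpha2_Source-AVX512.tar.gz NAMD_2.15a2_Source.tar.gz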
Build NAMD using Spack
Please refer to this link for getting started with Spack using AMD Zen Software Studio.
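If Spack is not yet set up on the system, a minimal bootstrap looks like the following (a sketch only; the linked guide covers configuring AOCC, AOCL, and compiler preferences):
# Clone Spack and load it into the current shell
$ git clone https://github.com/spack/spack.git
$ . spack/share/spack/setup-env.sh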
# Example for building NAMD with avxtiles support using AOCC and AOCL
$ spack install namd %aocc +avxtiles fftw=amdfftw interface=tcl ^amdfftw ^charmpp backend=mpi build-target=charm++ ^openmpi fabrics=cma,ucx
Explanation of the command options:
Symbol | Meaning |
---|---|
%aocc | Build NAMD with the AOCC compiler. |
+avxtiles | Add support for the AVXTiles algorithm; valid for NAMD v2.15a1 and later on systems supporting the AVX-512 instruction set. |
fftw=amdfftw | Use amdfftw as the FFTW implementation. |
interface=tcl | Use Tcl as the interface. |
^charmpp backend=mpi build-target=charm++ | Build NAMD with Charm++ (charmpp), using the MPI backend and the charm++ build target. |
^openmpi fabrics=cma,ucx | Use Open MPI as the MPI provider, with the CMA network for efficient intra-node communication and the UCX network fabric as a fallback if required. Note: It is advised to explicitly set the appropriate fabric for the host system if possible. Refer to Open MPI with AMD Zen Software Studio for more guidance. |
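Once the install completes, you can optionally confirm which variants were actually built using standard Spack commands:
# Show the installed NAMD spec with its variants
$ spack find -v namd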
Running NAMD
The STMV benchmark used in this example can be found at: https://www.ks.uiuc.edu/Research/namd/utilities/
Obtaining Benchmarks
# Download STMV dataset
$ wget https://www.ks.uiuc.edu/Research/namd/utilities/stmv/par_all27_prot_na.inp
$ wget https://www.ks.uiuc.edu/Research/namd/utilities/stmv/stmv.namd
$ wget https://www.ks.uiuc.edu/Research/namd/utilities/stmv/stmv.pdb.gz
$ wget https://www.ks.uiuc.edu/Research/namd/utilities/stmv/stmv.psf.gz
# Uncompress and edit input files to use current directory for writing temporary files
$ gunzip stmv.psf.gz
$ gunzip stmv.pdb.gz
$ sed -i 's/\/usr/./g' stmv.namd
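As a quick optional check, you can verify that the paths in stmv.namd now point at the current directory rather than /usr (this assumes the stock input writes its temporary and output files under /usr/tmp, which is what the sed edit above targets):
# Paths such as /usr/tmp/... should now read ./tmp/...
$ grep -nF './tmp' stmv.namd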
Process layout for NAMD run:
NAMD needs one communication thread per set of worker threads. For example, when running on a dual-socket AMD 5th Gen EPYC™ 9755 processor system with 256 (128x2) cores, the layout is as follows (a worked example of the resulting flags is shown after the list):
- Run each MPI rank with one communication thread and 7 worker threads (+ppn is used to specify the number of worker threads per rank), so that each rank (a group of 8 threads) is assigned to a single L3 cache
- Place the communication thread on the first core of each MPI rank (+commap is used to specify the mapping of communication threads)
- Pin the worker threads on the next 7 cores of each MPI rank (+pemap is used to specify the mapping of worker threads)
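For the 256-core example above, those rules expand to roughly the following Charm++ arguments (an illustrative expansion of what the run script below computes, assuming cores are numbered 0-255):
# 32 MPI ranks, 8 threads per rank (1 communication thread + 7 worker threads)
#   communication threads on cores 0, 8, 16, ..., 248
#   worker threads on cores 1-7, 9-15, ..., 249-255
+ppn 7 +commap 0-255:8 +pemap 1-255:8.7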
Run Script for AMD EPYC™ Processors
#!/bin/bash
# Loading NAMD build with AOCC
spack load namd %aocc
# Capture the system settings and set MPI and OMP options
CORES_PER_L3CACHE=8
NUM_CORES=$(nproc)
export OMP_NUM_THREADS=${CORES_PER_L3CACHE} # 8 threads per MPI rank. Recommended: OMP_NUM_THREADS = number of cores per L3 cache
MPI_RANKS=$(( $NUM_CORES / $OMP_NUM_THREADS ))
MPI_OPTS="-np $MPI_RANKS --bind-to core"
# NAMD process layout: it is recommended to map each MPI rank to an L3 cache, where each rank will
# use 8 threads (1 communication thread and 7 worker threads). This will result in 32 MPI ranks on
# dual socket AMD 5th Gen EPYC™ 9755 Processor with 256 (128x2) cores
NAMD_layout=" +ppn `expr ${OMP_NUM_THREADS} - 1` \
+commap 0-${NUM_CORES}:${OMP_NUM_THREADS} \
+pemap 1-`expr ${NUM_CORES} - 1`:${OMP_NUM_THREADS}.`expr ${OMP_NUM_THREADS} - 1`"
# Run command for STMV dataset with NAMD
mkdir -p ./tmp
mpirun $MPI_OPTS namd3 $NAMD_layout stmv.namd
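Save the script on the target node (for example as run_stmv.sh, a name used here only for illustration), then make it executable and launch it from the directory containing the STMV input files:
$ chmod +x run_stmv.sh
$ ./run_stmv.sh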
Note: The above build and run steps were tested with NAMD-3, AOCC-5.0.0, AOCL-5.0.0, and OpenMPI-5.0.5 on Red Hat Enterprise Linux release 8.9 (Ootpa) using Spack v0.23.0.dev0 (commit id: 2da812cbad).
For technical support on the tools, benchmarks and applications that AMD offers on this page and related inquiries, reach out to us at toolchainsupport@amd.com.