Introduction
AOCL is a set of numerical libraries optimized for AMD processors based on the generations of the AMD “Zen” core architecture. Supported processor families are AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors. The tuned implementations of industry-standard math libraries enable rapid development of scientific and high-performance computing applications.
Official Website: https://www.amd.com/en/developer/aocl.html
The following AOCL libraries are supported with Spack:
- amdblis
- amdlibflame
- amdfftw
- amdscalapack
- amdlibm
- aocl-sparse
- aocl-utils
- aocl-crypto
- aocl-libmem
- aocl-compression
- aocl-da
Note: Users can install the above libraries individually, or as a bundle using the amd-aocl package.
Spack automatically resolves library dependencies when installing HPC applications, so it is not necessary to install AMD libraries explicitly ahead of time. Instead, to ensure your application is built with all supported AMD-optimized libraries, configure Spack to always prefer the AMD AOCL libraries.
Preferring AMD AOCL Packages
To make Spack select the AMD AOCL packages, at a specified library version, as the default providers for linear algebra and related functions, edit the packages.yaml file.
For example, if you are using version 5.0, packages.yaml should include the following directives:
packages:
  blas:
    require: amdblis@5.0
  flame:
    require: amdlibflame@5.0
  lapack:
    require: amdlibflame@5.0
  fftw-api:
    require: amdfftw@5.0
  scalapack:
    require: amdscalapack@5.0
To edit packages.yaml, use the command spack config edit packages. See the relevant section of the Spack documentation for further details.
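As an alternative to editing the user-level file, the same directives can be kept in a separate configuration scope and passed to Spack with its `-C` (`--config-scope`) option. The following is a minimal sketch under that assumption: the directory name `aocl-spack-scope` is hypothetical, and the Spack call is skipped when `spack` is not on the PATH.

```shell
# Hypothetical scope directory for illustration; any writable path works.
scope_dir="${TMPDIR:-/tmp}/aocl-spack-scope"
mkdir -p "$scope_dir"

# Write the AOCL 5.0 provider requirements into the scope's packages.yaml.
cat > "$scope_dir/packages.yaml" <<'EOF'
packages:
  blas:
    require: amdblis@5.0
  flame:
    require: amdlibflame@5.0
  lapack:
    require: amdlibflame@5.0
  fftw-api:
    require: amdfftw@5.0
  scalapack:
    require: amdscalapack@5.0
EOF

# If Spack is available, print the merged packages configuration
# with this extra scope applied.
if command -v spack >/dev/null 2>&1; then
  spack -C "$scope_dir" config get packages
fi
```

Keeping the requirements in their own scope leaves the user-level configuration untouched, which can be convenient when comparing builds with and without the AOCL providers.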
AMD-AOCL
AMD-AOCL is a bundle package that installs the following AOCL libraries in a single step: amdblis, amdlibflame, amdfftw, amdscalapack, amdlibm, aocl-sparse, aocl-libmem, aocl-crypto, aocl-compression, and aocl-da.
Building AMD-AOCL
$ spack install amd-aocl %aocc
The following is the list of variants available with AMD-AOCL:
Variant (Default) | Allowed Values | Description |
---|---|---|
openmp | on, off | Enable OpenMP support |
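In Spack's spec syntax, a boolean variant such as openmp is enabled with `+name` and disabled with `~name`. The sketch below assumes AOCC is already registered as a Spack compiler, and it only calls Spack when the `spack` command is on the PATH.

```shell
# Boolean variants: +name enables, ~name disables.
spec="amd-aocl +openmp %aocc"
echo "Spec to install: $spec"

if command -v spack >/dev/null 2>&1; then
  spack info amd-aocl    # list the package's variants and their defaults
  spack install $spec    # unquoted on purpose so the spec splits into words
fi
```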
AMD BLIS
AOCL-BLIS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS). BLAS was designed to provide the essential kernels of matrix and vector computation, which are the most commonly used and computationally intensive operations in dense numerical linear algebra. Select kernels have been optimized by AMD and others for AMD “Zen”-based processors, for example AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.
AMD offers the optimized version of BLIS (AOCL-BLIS) that supports C, FORTRAN, and C++ template interfaces for the BLAS functionalities.
Official Website: https://www.amd.com/en/developer/aocl/dense.html
Building AMD BLIS
$ spack install amdblis %aocc
The following is the list of variants available with AMD BLIS:
Variant (Default) | Allowed Values | Description |
---|---|---|
blas | on, off | BLAS Compatibility |
cblas | on, off | CBLAS Compatibility |
ilp64 | on, off | ILP64 Support |
libs | shared, static | Build shared libs, static libs, or both |
threads | pthreads, openmp, none | Multithreading support |
aocl_gemm | on, off | AOCL GEMM support |
suphandling | on, off | Small Unpacked Kernel handling |
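Variants that take values, such as threads and libs, are written as `name=value` in a spec and can be combined with boolean variants. The following sketch previews an OpenMP-threaded ILP64 build with `spack spec` before installing anything. The particular combination is only an illustration, AOCC is assumed to be a configured compiler, and the Spack call is skipped when `spack` is not on the PATH.

```shell
# Valued variants use name=value; a comma-separated list selects several values.
spec="amdblis threads=openmp +ilp64 libs=shared,static %aocc"
echo "Spec to preview: $spec"

if command -v spack >/dev/null 2>&1; then
  # Show how Spack would concretize the spec without building it.
  spack spec $spec
fi
```

Previewing with `spack spec` first makes it easy to confirm that the requested variants were accepted before a long build starts.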
AMD LibFLAME
AOCL-libFLAME is a high-performance implementation of the Linear Algebra PACKage (LAPACK). LAPACK provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. It is extensible, easy to use, and available under an open-source license. Applications relying on standard Netlib LAPACK interfaces can use libFLAME with virtually no changes to their source code.
Official Website: https://www.amd.com/en/developer/aocl/dense.html#libflame
Building AMD libFLAME
$ spack install amdlibflame %aocc
The following is the list of variants available with AMD libFLAME:
Variant (Default) | Allowed Values | Description |
---|---|---|
ilp64 | on, off | Build with ILP64 support |
lapack2flame | on, off | Map legacy LAPACK routine invocations to their corresponding native C implementations in libflame |
shared | on, off | Build shared library |
static | on, off | Build static library |
threads | pthreads, openmp, none | Multithreading support |
enable-aocl-blas | on, off | Enables tight coupling with AOCL-BLAS library to use AOCL-BLAS internal routines |
vectorization | none, auto, avx2, avx512 | Use hardware vectorization support |
AMD FFTW
FFTW is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof, copyrighted by MIT and distributed under the GNU General Public License. AMD provides an optimized FFTW, derived from community FFTW (fftw.org), that includes selected kernels and routines optimized for the AMD EPYC™, Ryzen™, and Ryzen™ Threadripper™ processor families.
Official Website: https://www.amd.com/en/developer/aocl/fftw.html
Building AMD FFTW
$ spack install amdfftw %aocc
The following is the list of variants available with AMD FFTW:
Variant (Default) | Allowed Values | Description |
---|---|---|
amd-top-n-planner | on, off | Build with amd-top-n-planner support |
amd-mpi-vader-limit | on, off | Build with amd-mpi-vader-limit support |
static | on, off | Build with static support |
amd-trans | on, off | Build with amd-trans support |
amd-app-opt | on, off | Build with amd-app-opt support |
amd-fast-planner | on, off | Reduce planning time without a significant performance tradeoff. Supported for float and double precision |
amd-dynamic-dispatcher | on, off | Single portable optimized library to execute on different x86 CPU architectures |
mpi | on, off | Activate MPI support |
openmp | on, off | Enable OpenMP Support |
precision | long_double, quad, float, double | Build the selected floating-point precision libraries |
shared | on, off | Builds a shared version of the library |
threads | on, off | Enable SMP threads support |
AMD ScaLAPACK
AOCL-ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It can be used to solve linear systems, least squares problems, eigenvalue problems, and singular value problems. AOCL-ScaLAPACK is optimized for AMD “Zen”-based processors. It depends on the external libraries BLAS and LAPACK; thus, the use of AOCL-BLIS and AOCL-libFLAME is recommended.
Official Website: https://www.amd.com/en/developer/aocl/scalapack.html
Building AMD ScaLAPACK
$ spack install amdscalapack %aocc
The following is the list of variants available with AMD ScaLAPACK:
Variant (Default) | Allowed Values | Description |
---|---|---|
ilp64 | on, off | Build with ILP64 support |
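Because AOCL-ScaLAPACK depends on BLAS and LAPACK, the recommended AOCL providers can also be named directly in the spec with Spack's `^` dependency syntax, rather than relying on packages.yaml. The sketch below is one illustrative combination, assuming AOCC is a configured compiler and an MPI implementation can be resolved by Spack. The Spack call is skipped when `spack` is not on the PATH.

```shell
# ^pkg pins a dependency of the root spec; each dependency may carry its own variants.
spec="amdscalapack %aocc ^amdblis threads=openmp ^amdlibflame"
echo "Spec to preview: $spec"

if command -v spack >/dev/null 2>&1; then
  # Preview the dependency graph to confirm the AOCL providers were selected.
  spack spec $spec
fi
```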
AMD LibM (Math Library)
AOCL-LibM is a software library containing a collection of basic math functions optimized for x86-64 processor-based machines. It provides many routines from the list of standard C99 math functions.
AOCL-LibM is a C library that users can link into their applications to replace compiler-provided math functions.
Official Website: https://www.amd.com/en/developer/aocl/libm.html
Building AMD LibM
$ spack install amdlibm %aocc
AOCL-Sparse
AOCL-Sparse contains basic linear algebra subroutines for sparse matrices and vectors optimized for AMD EPYC™, Ryzen™, and Ryzen™ Threadripper™ processor families. It is designed to be used with C and C++. AOCL-Sparse includes sparse solver functions that perform matrix factorization and solution phases.
Official Website: https://www.amd.com/en/developer/aocl/sparse.html
Building AOCL-Sparse
$ spack install aocl-sparse %aocc
The following is the list of variants available with AOCL-Sparse:
Variant (Default) | Allowed Values | Description |
---|---|---|
avx | on, off | Enable experimental AVX512 |
ilp64 | on, off | Build with ILP64 support |
benchmarks | on, off | Build Benchmarks |
examples | on, off | Build sparse examples |
shared | on, off | Build shared library |
unit_tests | on, off | Build sparse unit tests |
openmp | on, off | Enable OpenMP support |
AOCL-Utils
AOCL-Utils provides a uniform interface to all the AOCL libraries to access the CPU features for AMD CPUs. This library provides the following features:
- Core details
- Flags available/usable
- ISA available/usable
- Topology of L1/L2/L3 caches
AOCL-Utils is designed for integration with the other AOCL libraries. Each project has its own mechanism to identify the CPU and provide necessary features such as Dynamic Dispatch. The main purpose of this library is to provide a centralized mechanism to update/validate and provide information to the users.
Official Website: https://www.amd.com/en/developer/aocl/utils.html
Building AOCL-Utils
$ spack install aocl-utils %aocc
The following is the list of variants available with AOCL-Utils:
Variant (Default) | Allowed Values | Description |
---|---|---|
doc | on, off | Enable documentation |
AOCL-LibMem
AOCL-LibMem is a Linux library of data movement and manipulation functions (such as memcpy and strcpy) highly optimized for the AMD Zen microarchitecture.
This library has multiple implementations of each function, which can be chosen based on application requirements such as alignment, instruction choice, threshold values, and tunable parameters.
By default, this library chooses the best-fit implementation based on the underlying microarchitectural support for CPU features and instructions.
This release of the AOCL-LibMem library supports the “standard C library memory handling functions”.
Official Website: https://www.amd.com/en/developer/aocl/libmem.html
Building AOCL-LibMem
$ spack install aocl-libmem %aocc
The following is the list of variants available with AOCL-LibMem:
Variant (Default) | Allowed Values | Description |
---|---|---|
vectorization | avx2, avx512, auto | Use hardware vectorization support |
shared | on, off | Build shared library |
tunables | on, off | Enable/Disable user input |
logging | on, off | Enable/Disable logger |
AOCL-Crypto
AOCL-Crypto is a library consisting of basic cryptographic functions optimized and tuned for AMD Zen™ based microarchitecture.
This library provides a unified solution for cryptographic routines such as AES (Advanced Encryption Standard) encryption/decryption routines (CBC, CFB, OFB, CTR, GCM, XTS, CCM, SIV), SHA (Secure Hash Algorithms) routines (SHA2, SHA3, SHAKE), Message Authentication Code (CMAC, HMAC), ECDH (Elliptic-curve Diffie–Hellman), and RSA (Rivest, Shamir, and Adleman) key generation functions, among others. AOCL-Crypto supports a dynamic dispatcher feature that executes the most optimal function variant, implemented using function multi-versioning, thereby offering a single optimized library portable across different x86 CPU architectures.
Official Website: https://www.amd.com/en/developer/aocl/cryptography.html
Building AOCL-Crypto
$ spack install aocl-crypto %aocc
The following is the list of variants available with AOCL-Crypto:
Variant (Default) | Allowed Values | Description |
---|---|---|
examples | on, off | Build examples |
ipp | on, off | Build Intel IPP library |
AOCL-Compression
AOCL-Compression is a software framework of various lossless compression and decompression methods tuned and optimized for AMD Zen based CPUs.
This framework offers a single set of unified APIs for all the supported compression and decompression methods which facilitate the applications to easily integrate and use them.
AOCL-Compression supports lz4, zlib/deflate, lzma, zstd, bzip2, snappy, and lz4hc based compression and decompression methods along with their native APIs.
The library offers OpenMP-based multi-threaded implementations of the lz4, zlib, zstd, and snappy compression methods. It supports a dynamic dispatcher feature that executes the most optimal function variant, implemented using function multi-versioning, thereby offering a single optimized library portable across different x86 CPU architectures.
AOCL-Compression framework is developed in C for UNIX® and Windows® based systems. A test suite is provided for the validation and performance benchmarking of the supported compression and decompression methods.
This suite also supports benchmarking of IPP compression methods such as lz4, lz4hc, zlib, and bzip2. The library build framework offers CTest-based testing of the test cases implemented using GTest and the library test suite.
Official Website: https://www.amd.com/en/developer/aocl/compression.html
Building AOCL-Compression
$ spack install aocl-compression %aocc
The following is the list of variants available with AOCL-Compression:
Variant (Default) | Allowed Values | Description |
---|---|---|
shared | on, off | Build shared library |
openmp | on, off | OpenMP-based multi-threaded compression and decompression |
zlib/bzip2/snappy/zstd/lzma/lz4/lz4hc | on, off | These libraries are built by default; use off to disable any of zlib/bzip2/snappy/zstd/lzma/lz4/lz4hc |
decompress_fast | "OFF", "1", "2" | Enable fast decompression modes |
enable_fast_math | on, off | Enable fast-math optimizations |
AOCL-DA
The AOCL Data Analytics Library (AOCL-DA) is a data analytics library providing optimized building blocks for data analysis. It is written with a C-compatible interface to make it as seamless as possible to integrate with the library from whichever programming language you are using. The intended workflow for using the library is as follows:
- load data from memory by reading CSV files or using the in-built da_datastore object
- preprocess the data by removing missing values, standardizing, and selecting certain subsets of the data, before extracting contiguous arrays of data from the da_datastore objects
- data processing (e.g. principal component analysis, linear model fitting, etc.)
C++ example programs can be found in the examples folder of your installation.
Official Website: https://www.amd.com/en/developer/aocl/data-analytics.html
Building AOCL-DA
$ spack install aocl-da %aocc
The following is the list of variants available with AOCL-DA:
Variant (Default) | Allowed Values | Description |
---|---|---|
openmp | on, off | Build using OpenMP and link to threaded BLAS and LAPACK |
python | on, off | Build with Python bindings |
ilp64 | on, off | Build with ILP64 support |
shared | on, off | Build shared libraries |
examples | on, off | Build examples |
gtest | on, off | Build and install Googletest |