Introduction

AOCL is a set of numerical libraries optimized for AMD processors based on the AMD “Zen” core architecture and its successive generations. Supported processor families are AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors. The tuned implementations of industry-standard math libraries enable rapid development of scientific and high-performance computing applications.

Official Website: https://www.amd.com/en/developer/aocl.html

The following AOCL libraries are supported with Spack:

  • amdblis
  • amdlibflame
  • amdfftw
  • amdscalapack
  • amdlibm
  • aocl-sparse
  • aocl-utils
  • aocl-crypto
  • aocl-libmem
  • aocl-compression
  • aocl-da

Note: Users can install the above libraries individually, or as a bundle using the amd-aocl package.

Spack automatically resolves library dependencies when installing HPC applications, so it is not necessary to install AMD libraries explicitly ahead of time. Instead, to ensure your application is built with all supported AMD optimized libraries, configure Spack to always prefer the AMD AOCL packages.

Preferring AMD AOCL Packages

To make Spack select AMD AOCL packages by default for linear algebra and other functions, and to pin them to a specific library version, edit the packages.yaml file.

For example, if you are using the latest version, 5.0, the contents of packages.yaml should include the following directives:

    packages:
      blas:
        require: amdblis@5.0
      flame:
        require: amdlibflame@5.0
      lapack:
        require: amdlibflame@5.0
      fftw-api:
        require: amdfftw@5.0
      scalapack:
        require: amdscalapack@5.0

To edit packages.yaml, use the command spack config edit packages. See the relevant Spack documentation section for further details.
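
Once packages.yaml is saved, it can help to confirm that Spack picked up the requirements before building anything. The following is a minimal sketch, assuming Spack is already set up in the current shell and the AOCC compiler is registered with it. The amdscalapack spec is only an illustrative choice, used because it depends on BLAS and LAPACK; any other package that uses these virtual packages would serve equally well.

```shell
# Spec used only to exercise the BLAS/LAPACK providers (illustrative choice).
check_spec="amdscalapack %aocc"

if command -v spack >/dev/null 2>&1; then
    # Print the merged packages configuration, including the "require" entries.
    spack config get packages
    # Concretize without installing; the resolved tree should list amdblis
    # and amdlibflame as the BLAS and LAPACK providers.
    # $check_spec is left unquoted on purpose so it splits into separate arguments.
    spack spec $check_spec
else
    echo "spack is not on PATH; would run: spack spec $check_spec"
fi
```

If the resolved tree shows a different BLAS or LAPACK provider, the packages.yaml being edited is probably not in the configuration scope that Spack is reading.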

 

AMD-AOCL

AMD-AOCL is a bundle package that provides the following AOCL libraries in a single installation: amdblis, amdlibflame, amdfftw, amdscalapack, amdlibm, aocl-sparse, aocl-libmem, aocl-crypto, aocl-compression, and aocl-da.

Building AMD-AOCL

    $ spack install amd-aocl %aocc

The following is the list of variants available with AMD-AOCL:

Variant (Default) | Allowed Values | Description
openmp            | on, off        | Enable OpenMP support
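
Variants whose allowed values are on and off are written in Spack spec syntax as +variant (on) and ~variant (off). As a minimal sketch, assuming the AOCC compiler is already registered with Spack, the bundle could be built with OpenMP enabled and then inspected as follows:

```shell
# "+openmp" turns the openmp variant on; "~openmp" would turn it off.
bundle_spec="amd-aocl +openmp %aocc"

if command -v spack >/dev/null 2>&1; then
    # $bundle_spec is left unquoted on purpose so it splits into separate arguments.
    spack install $bundle_spec
    # List the installed bundle together with the libraries it pulled in.
    spack find --deps amd-aocl
else
    echo "spack is not on PATH; would run: spack install $bundle_spec"
fi
```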

AMD BLIS

AOCL-BLIS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS). BLAS provides the essential kernels of matrix and vector computation, which are the most commonly used and most computationally intensive operations in dense numerical linear algebra. Select kernels have been optimized by AMD and others for AMD “Zen”-based processors, for example, AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.

AOCL-BLIS supports C, Fortran, and C++ template interfaces for the BLAS functionality.

Official Website: https://www.amd.com/en/developer/aocl/dense.html

 

Building AMD BLIS

    $ spack install amdblis %aocc

The following is the list of variants available with AMD BLIS:

Variant (Default) | Allowed Values         | Description
blas              | on, off                | BLAS compatibility
cblas             | on, off                | CBLAS compatibility
ilp64             | on, off                | ILP64 support
libs              | shared, static         | Build shared libs, static libs, or both
threads           | pthreads, openmp, none | Multithreading support
aocl_gemm         | on, off                | AOCL GEMM support
suphandling       | on, off                | Small Unpacked Kernel handling
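
Variants that take a named value are written as name=value in the spec. The sketch below first asks Spack which variants and defaults it knows for amdblis, then builds one possible combination. The choice of OpenMP threading with ILP64 is only an illustration, not a recommendation, and it assumes the AOCC compiler is already registered with Spack.

```shell
# Value variant (threads=openmp) combined with a boolean variant (+ilp64).
blis_spec="amdblis threads=openmp +ilp64 %aocc"

if command -v spack >/dev/null 2>&1; then
    # Show the variants, allowed values, and defaults for this Spack version.
    spack info amdblis
    # Build with the chosen variants; unquoted so the spec splits into arguments.
    spack install $blis_spec
else
    echo "spack is not on PATH; would run: spack install $blis_spec"
fi
```

Checking spack info first is useful because variant names and defaults can differ between Spack and AOCL releases.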

AMD libFLAME

AOCL-libFLAME is a high-performance implementation of the Linear Algebra PACKage (LAPACK). LAPACK provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. It is extensible, easy to use, and available under an open-source license. Applications relying on standard Netlib LAPACK interfaces can utilize libFLAME with virtually no changes to their source code.

Official Website: https://www.amd.com/en/developer/aocl/dense.html#libflame

 

Building AMD libFLAME

    $ spack install amdlibflame %aocc

The following is the list of variants available with AMD libFLAME:

Variant (Default) | Allowed Values           | Description
ilp64             | on, off                  | Build with ILP64 support
lapack2flame      | on, off                  | Map legacy LAPACK routine invocations to their corresponding native C implementations in libflame
shared            | on, off                  | Build shared library
static            | on, off                  | Build static library
threads           | pthreads, openmp, none   | Multithreading support
enable-aocl-blas  | on, off                  | Enables tight coupling with AOCL-BLAS library to use AOCL-BLAS internal routines
vectorization     | none, auto, avx2, avx512 | Use hardware vectorization support

AMD FFTW

FFTW is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof, copyrighted by MIT and distributed under the GNU General Public License. AMD provides an optimized FFTW (derived from the community FFTW at fftw.org) that includes selected kernels and routines optimized for the AMD EPYC™, Ryzen™, and Ryzen™ Threadripper™ processor families.

Official Website: https://www.amd.com/en/developer/aocl/fftw.html

 

Building AMD FFTW

    $ spack install amdfftw %aocc

The following is the list of variants available with AMD FFTW:

Variant (Default)      | Allowed Values                   | Description
amd-top-n-planner      | on, off                          | Build with amd-top-n-planner support
amd-mpi-vader-limit    | on, off                          | Build with amd-mpi-vader-limit support
static                 | on, off                          | Build with static support
amd-trans              | on, off                          | Build with amd-trans support
amd-app-opt            | on, off                          | Build with amd-app-opt support
amd-fast-planner       | on, off                          | Reduce planning time without much performance tradeoff; supported for float and double precision
amd-dynamic-dispatcher | on, off                          | Single portable optimized library to execute on different x86 CPU architectures
mpi                    | on, off                          | Activate MPI support
openmp                 | on, off                          | Enable OpenMP support
precision              | long_double, quad, float, double | Build the selected floating-point precision libraries
shared                 | on, off                          | Builds a shared version of the library
threads                | on, off                          | Enable SMP threads support
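
Because some variants only apply to certain precisions, it can be worth concretizing a spec before building it. The sketch below assumes that the precision variant accepts a comma-separated list of values, as multi-valued Spack variants normally do, and that the AOCC compiler is registered with Spack. It pairs amd-fast-planner with float and double, the two precisions the table lists as supported for it.

```shell
# Two precisions selected at once, plus the fast planner (illustrative combination).
fftw_spec="amdfftw precision=float,double +amd-fast-planner %aocc"

if command -v spack >/dev/null 2>&1; then
    # Concretize only; Spack reports an error here if the variants conflict.
    # $fftw_spec is left unquoted on purpose so it splits into separate arguments.
    spack spec $fftw_spec
else
    echo "spack is not on PATH; would run: spack spec $fftw_spec"
fi
```

Once the spec concretizes cleanly, the same string can be passed to spack install.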

AMD ScaLAPACK

AOCL-ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It can be used to solve linear systems, least squares problems, eigenvalue problems, and singular value problems. AOCL-ScaLAPACK is optimized for AMD “Zen”-based processors. It depends on the external libraries BLAS and LAPACK; thus, the use of AOCL-BLIS and AOCL-libFLAME is recommended.

Official Website: https://www.amd.com/en/developer/aocl/scalapack.html

 

Building AMD ScaLAPACK

    $ spack install amdscalapack %aocc

The following is the list of variants available with AMD ScaLAPACK:

Variant (Default) | Allowed Values | Description
ilp64             | on, off        | Build with ILP64 support

AMD LibM (Math Library)

AOCL-LibM is a software library containing a collection of basic math functions optimized for x86-64 processor-based machines. It provides many routines from the list of standard C99 math functions.

AOCL-LibM is a C library that users can link into their applications to replace compiler-provided math functions.

Official Website: https://www.amd.com/en/developer/aocl/libm.html

 

Building AMD LibM

    $ spack install amdlibm %aocc

 

AOCL-Sparse

AOCL-Sparse contains basic linear algebra subroutines for sparse matrices and vectors optimized for AMD EPYC™, Ryzen™, and Ryzen™ Threadripper™ processor families. It is designed to be used with C and C++. AOCL-Sparse includes sparse solver functions that perform matrix factorization and solution phases.

Official Website: https://www.amd.com/en/developer/aocl/sparse.html

 

Building AOCL-Sparse

    $ spack install aocl-sparse %aocc

The following is the list of variants available with AOCL-Sparse:

Variant (Default) | Allowed Values | Description
avx               | on, off        | Enable experimental AVX512
ilp64             | on, off        | Build with ILP64 support
benchmarks        | on, off        | Build Benchmarks
examples          | on, off        | Build sparse examples
shared            | on, off        | Build shared library
unit_tests        | on, off        | Build sparse unit tests
openmp            | on, off        | Enable OpenMP support

AOCL-Utils

AOCL-Utils provides a uniform interface to all the AOCL libraries to access the CPU features for AMD CPUs. This library provides the following features: 

  • Core details 
  • Flags available/usable 
  • ISA available/usable 
  • Topology of L1/L2/L3 caches

AOCL-Utils is designed for integration with the other AOCL libraries. Each project has its own mechanism to identify the CPU and provide necessary features such as Dynamic Dispatch. The main purpose of this library is to provide a centralized mechanism to update/validate and provide information to the users. 

Official Website: https://www.amd.com/en/developer/aocl/utils.html

Building AOCL-Utils 

    $ spack install aocl-utils %aocc

The following is the list of variants available with AOCL-Utils:

Variant (Default) | Allowed Values | Description
doc               | on, off        | Enable documentation


AOCL-LibMem

AOCL-LibMem is a Linux library of data movement and manipulation functions (such as memcpy and strcpy) highly optimized for the AMD “Zen” microarchitecture.

This library has multiple implementations of each function that can be chosen based on the application requirements as per alignments, instruction choice, threshold values, and tunable parameters.

By default, this library will choose the best-fit implementation based on the underlying micro-architectural support for CPU features and instructions.

This release of the AOCL-LibMem library supports the standard C library memory handling functions.

Official Website: https://www.amd.com/en/developer/aocl/libmem.html

Building AOCL-LibMem

    $ spack install aocl-libmem %aocc

The following is the list of variants available with AOCL-LibMem:

Variant (Default) | Allowed Values     | Description
vectorization     | avx2, avx512, auto | Use hardware vectorization support
shared            | on, off            | Build shared library
tunables          | on, off            | Enable/Disable user input
logging           | on, off            | Enable/Disable logger


AOCL-Crypto

AOCL-Crypto is a library of basic cryptographic functions optimized and tuned for the AMD Zen™-based microarchitecture.

This library provides a unified solution for cryptographic routines such as AES (Advanced Encryption Standard) encryption/decryption routines (CBC, CFB, OFB, CTR, GCM, XTS, CCM, SIV), SHA (Secure Hash Algorithm) routines (SHA2, SHA3, SHAKE), Message Authentication Codes (CMAC, HMAC), and ECDH (Elliptic-curve Diffie–Hellman) and RSA (Rivest, Shamir, and Adleman) key generation functions. AOCL-Crypto supports a dynamic dispatcher that executes the most optimal function variant, implemented using function multi-versioning, thereby offering a single optimized library that is portable across different x86 CPU architectures.

Official Website: https://www.amd.com/en/developer/aocl/cryptography.html

 

Building AOCL-Crypto

    $ spack install aocl-crypto %aocc

The following is the list of variants available with AOCL-Crypto:

Variant (Default) | Allowed Values | Description
examples          | on, off        | Build examples
ipp               | on, off        | Build Intel IPP library


AOCL-Compression

AOCL-Compression is a software framework of various lossless compression and decompression methods tuned and optimized for AMD “Zen”-based CPUs.

This framework offers a single set of unified APIs for all the supported compression and decompression methods, which makes it easy for applications to integrate and use them.

AOCL-Compression supports lz4, zlib/deflate, lzma, zstd, bzip2, snappy, and lz4hc based compression and decompression methods along with their native APIs.

The library offers OpenMP-based multi-threaded implementations of the lz4, zlib, zstd, and snappy compression methods. It supports a dynamic dispatcher that executes the most optimal function variant, implemented using function multi-versioning, thereby offering a single optimized library that is portable across different x86 CPU architectures.

AOCL-Compression framework is developed in C for UNIX® and Windows® based systems. A test suite is provided for the validation and performance benchmarking of the supported compression and decompression methods.

This suite also supports benchmarking of IPP compression methods such as lz4, lz4hc, zlib, and bzip2. The library build framework offers CTest-based testing of the test cases implemented using GTest and the library test suite.

Official Website: https://www.amd.com/en/developer/aocl/compression.html

Building AOCL-Compression

    $ spack install aocl-compression %aocc

The following is the list of variants available with AOCL-Compression:

Variant (Default)                     | Allowed Values  | Description
shared                                | on, off         | Build shared library
openmp                                | on, off         | OpenMP-based multi-threaded compression and decompression
zlib/bzip2/snappy/zstd/lzma/lz4/lz4hc | on, off         | Built by default; use off to disable any of the zlib, bzip2, snappy, zstd, lzma, lz4, or lz4hc libraries
decompress_fast                       | "OFF", "1", "2" | Enable fast decompression modes
enable_fast_math                      | on, off         | Enable fast-math optimizations

 

AOCL-DA

The AOCL Data Analytics Library (AOCL-DA) is a data analytics library providing optimized building blocks for data analysis. It is written with a C-compatible interface to make it as seamless as possible to integrate with the library from whichever programming language you are using. The intended workflow for using the library is as follows:

  • load data from memory by reading CSV files or using the in-built da_datastore object
  • preprocess the data by removing missing values, standardizing, and selecting certain subsets of the data, before extracting contiguous arrays of data from the da_datastore objects
  • data processing (e.g. principal component analysis, linear model fitting, etc.)

C++ example programs can be found in the examples folder of your installation.

Official Website: https://www.amd.com/en/developer/aocl/data-analytics.html

Building AOCL-DA

    $ spack install aocl-da %aocc

The following is the list of variants available with AOCL-DA:

Variant (Default) | Allowed Values | Description
openmp            | on, off        | Build using OpenMP and link to threaded BLAS and LAPACK
python            | on, off        | Build with Python bindings
ilp64             | on, off        | Build with ILP64 support
shared            | on, off        | Build shared libraries
examples          | on, off        | Build examples
gtest             | on, off        | Build and install Googletest