AMD Optimized CPU Libraries

Introduction

AOCL is a set of numerical libraries optimized for AMD processors based on the AMD “Zen” core architecture across its generations. Supported processor families are AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors. The tuned implementations of industry-standard math libraries enable rapid development of scientific and high-performance computing applications.

Official Website: https://www.amd.com/en/developer/aocl.html

The following AOCL libraries are supported with Spack:

amdblis
amdlibflame
amdfftw
amdscalapack
amdlibm
aocl-sparse
aocl-utils
aocl-crypto
aocl-libmem
aocl-compression
aocl-da

Note: Users can install the above libraries individually, or as a bundle using the amd-aocl package.

Spack automatically resolves library dependencies when installing HPC applications, so it is not necessary to install AMD libraries explicitly ahead of time. Instead, to ensure your application is built with all supported AMD optimized libraries, configure Spack to always prefer AMD AOCL libraries.

Preferring AMD AOCL Packages

To configure Spack to select AMD AOCL packages by default for linear algebra and other functions, pinned to a specific library version, edit the packages.yaml file.

For example, if you are using the latest version, 5.0, the contents of packages.yaml should include the following directives:

    packages:
      blas:
        require: amdblis@5.0
      flame:
        require: amdlibflame@5.0
      lapack:
        require: amdlibflame@5.0
      fftw-api:
        require: amdfftw@5.0
      scalapack:
        require: amdscalapack@5.0

To edit packages.yaml, use the command spack config edit packages. See the relevant Spack documentation section for further details.
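As an alternative to editing the user-level file by hand, the same directives can be written into a separate configuration directory and passed to Spack with its `-C` (`--config-scope`) option. The following is a minimal sketch under stated assumptions: `AOCL_VERSION` and `SCOPE_DIR` are illustrative shell variable names, not Spack settings, and the final step only runs when `spack` is on the PATH.

```shell
#!/bin/sh
# Illustrative names (assumptions, not Spack settings).
AOCL_VERSION="5.0"
SCOPE_DIR="$(mktemp -d)"

# Write the AOCL requirements into a standalone packages.yaml.
cat > "$SCOPE_DIR/packages.yaml" <<EOF
packages:
  blas:
    require: amdblis@${AOCL_VERSION}
  flame:
    require: amdlibflame@${AOCL_VERSION}
  lapack:
    require: amdlibflame@${AOCL_VERSION}
  fftw-api:
    require: amdfftw@${AOCL_VERSION}
  scalapack:
    require: amdscalapack@${AOCL_VERSION}
EOF

# If Spack is available, print the merged "packages" configuration
# with this directory added as an extra scope.
if command -v spack >/dev/null 2>&1; then
    spack -C "$SCOPE_DIR" config get packages
fi
```

Keeping the requirements in their own directory makes it easy to switch between AOCL versions without touching the default configuration.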

AMD-AOCL

AMD-AOCL is a bundle package that provides the AOCL libraries listed above, namely amdblis, amdlibflame, amdfftw, amdscalapack, amdlibm, aocl-sparse, aocl-libmem, aocl-crypto, aocl-compression, and aocl-da, as a single bundle for easy installation.

Building AMD-AOCL

    $ spack install amd-aocl %aocc

The following is the list of variants available with AMD-AOCL:

Variant (Default)	Allowed Values	Description
openmp	on, off	Enable OpenMP support

AMD BLIS

AOCL-BLIS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS). BLAS was designed to provide the essential kernels of matrix and vector computation, which are the most commonly used and computationally intensive operations in dense numerical linear algebra. Select kernels have been optimized by AMD and others for AMD “Zen”-based processors, for example, AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.

AMD offers an optimized version of BLIS (AOCL-BLIS) that supports C, Fortran, and C++ template interfaces for the BLAS functionality.

Official Website: https://www.amd.com/en/developer/aocl/dense.html

Building AMD BLIS

    $ spack install amdblis %aocc

The following is the list of variants available with AMD BLIS:

Variant (Default)	Allowed Values	Description
blas	on, off	BLAS Compatibility
cblas	on, off	CBLAS Compatibility
ilp64	on, off	ILP64 Support
libs	shared, static	Build shared libs, static libs, or both
threads	pthreads, openmp, none	Multithreading support
aocl_gemm	on, off	AOCL GEMM support
suphandling	on, off	Small Unpacked Kernel handling
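Variants are selected by appending them to the spec on the command line. In Spack's spec syntax, an on/off variant is written as `+name` or `~name`, and a valued variant as `name=value`. The following is a minimal sketch, assuming an ILP64, OpenMP-threaded build is wanted; `BLIS_SPEC` is an illustrative variable name, and the concretization step only runs when `spack` is on the PATH.

```shell
#!/bin/sh
# Illustrative spec: ILP64 integers, OpenMP threading, built with AOCC.
BLIS_SPEC="amdblis +ilp64 threads=openmp %aocc"
echo "spec: $BLIS_SPEC"

if command -v spack >/dev/null 2>&1; then
    # Concretize first to review the resolved build without installing.
    # $BLIS_SPEC is intentionally unquoted so it splits into spec tokens.
    spack spec $BLIS_SPEC || echo "concretization failed; check that aocc is a registered compiler" >&2
    # When the result looks right: spack install $BLIS_SPEC
fi
```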

AMD LibFLAME

AOCL-libFLAME is a high-performance implementation of the Linear Algebra PACKage (LAPACK). LAPACK provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. It is extensible, easy to use, and available under an open-source license. Applications relying on standard Netlib LAPACK interfaces can utilize libFLAME with virtually no changes to their source code.

Official Website: https://www.amd.com/en/developer/aocl/dense.html#libflame

Building AMD libFLAME

    $ spack install amdlibflame %aocc

The following is the list of variants available with AMD libFLAME:

Variant (Default)	Allowed Values	Description
ilp64	on, off	Build with ILP64 support
lapack2flame	on, off	Map legacy LAPACK routine invocations to their corresponding native C implementations in libflame
shared	on, off	Build shared library
static	on, off	Build static library
threads	pthreads, openmp, none	Multithreading support
enable-aocl-blas	on, off	Enables tight coupling with AOCL-BLAS library to use AOCL-BLAS internal routines
vectorization	none, auto, avx2, avx512	Use hardware vectorization support

AMD FFTW

FFTW is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof, copyrighted by MIT and distributed under the GNU General Public License. AMD provides an optimized FFTW (derived from the community FFTW, fftw.org) that includes selected kernels and routines optimized for the AMD EPYC™, Ryzen™, and Ryzen™ Threadripper™ processor families.

Official Website: https://www.amd.com/en/developer/aocl/fftw.html

Building AMD FFTW

    $ spack install amdfftw %aocc

The following is the list of variants available with AMD FFTW:

Variant (Default)	Allowed Values	Description
amd-top-n-planner	on, off	Build with amd-top-n-planner support
amd-mpi-vader-limit	on, off	Build with amd-mpi-vader-limit support
static	on, off	Build with static support
amd-trans	on, off	Build with amd-trans support
amd-app-opt	on, off	Build with amd-app-opt support
amd-fast-planner	on, off	Option to reduce the planning time without much tradeoff in the performance. It is supported for float and double precision
amd-dynamic-dispatcher	on, off	Single portable optimized library to execute on different x86 CPU architectures
mpi	on, off	Activate MPI support
openmp	on, off	Enable OpenMP Support
precision	long_double, quad, float, double	Build the selected floating-point precision libraries
shared	on, off	Builds a shared version of the library
threads	on, off	Enable SMP threads support
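The `precision` variant accepts more than one value, written as a comma-separated list in the spec. The following is a minimal sketch, assuming single- and double-precision libraries are wanted together with the fast planner (which the table above lists as supported for float and double); `FFTW_SPEC` is an illustrative variable name, and the concretization step only runs when `spack` is on the PATH.

```shell
#!/bin/sh
# Illustrative spec: float and double libraries with the fast planner enabled.
FFTW_SPEC="amdfftw precision=float,double +amd-fast-planner %aocc"
echo "spec: $FFTW_SPEC"

if command -v spack >/dev/null 2>&1; then
    # Concretize to confirm the variant combination is accepted.
    # $FFTW_SPEC is intentionally unquoted so it splits into spec tokens.
    spack spec $FFTW_SPEC || echo "concretization failed; review the variant combination" >&2
fi
```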

AMD ScaLAPACK

AOCL-ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It can be used to solve linear systems, least squares problems, eigenvalue problems, and singular value problems. AOCL-ScaLAPACK is optimized for AMD “Zen”-based processors. It depends on the external libraries BLAS and LAPACK; thus, the use of AOCL-BLIS and AOCL-libFLAME is recommended.

Official Website: https://www.amd.com/en/developer/aocl/scalapack.html

Building AMD ScaLAPACK

    $ spack install amdscalapack %aocc

The following is the list of variants available with AMD ScaLAPACK:

Variant (Default)	Allowed Values	Description
ilp64	on, off	Build with ILP64 support

AMD LibM (Math Library)

AOCL-LibM is a software library containing a collection of basic math functions optimized for x86-64 processor-based machines. It provides many routines from the list of standard C99 math functions.

AOCL-LibM is a C library that users can link into their applications to replace compiler-provided math functions.

Official Website: https://www.amd.com/en/developer/aocl/libm.html

Building AMD LibM

    $ spack install amdlibm %aocc

AOCL-Sparse

AOCL-Sparse contains basic linear algebra subroutines for sparse matrices and vectors optimized for AMD EPYC™, Ryzen™, and Ryzen™ Threadripper™ processor families. It is designed to be used with C and C++. AOCL-Sparse includes sparse solver functions that perform matrix factorization and solution phases.

Official Website: https://www.amd.com/en/developer/aocl/sparse.html

Building AOCL-Sparse

    $ spack install aocl-sparse %aocc

The following is the list of variants available with AOCL-Sparse:

Variant (Default)	Allowed Values	Description
avx	on, off	Enable experimental AVX512
ilp64	on, off	Build with ILP64 support
benchmarks	on, off	Build Benchmarks
examples	on, off	Build sparse examples
shared	on, off	Build shared library
unit_tests	on, off	Build sparse unit tests
openmp	on, off	Enable OpenMP support

AOCL-Utils

AOCL-Utils provides a uniform interface to all the AOCL libraries to access the CPU features for AMD CPUs. This library provides the following features:

Core details
Flags available/usable
ISA available/usable
Topology of L1/L2/L3 caches

AOCL-Utils is designed for integration with the other AOCL libraries. Each project has its own mechanism to identify the CPU and provide necessary features such as Dynamic Dispatch. The main purpose of this library is to provide a centralized mechanism to update/validate and provide information to the users.

Official Website: https://www.amd.com/en/developer/aocl/utils.html

Building AOCL-Utils

    $ spack install aocl-utils %aocc

The following is the list of variants available with AOCL-Utils:

Variant (Default)	Allowed Values	Description
doc	on, off	Enable documentation

AOCL-LibMem

AOCL-LibMem is a Linux library of data movement and manipulation functions (such as memcpy and strcpy) highly optimized for the AMD “Zen” microarchitecture.

This library has multiple implementations of each function, which can be chosen according to application requirements such as alignment, instruction choice, threshold values, and tunable parameters.

By default, this library will choose the best-fit implementation based on the underlying micro-architectural support for CPU features and instructions.

This release of the AOCL-LibMem library supports the standard C library memory handling functions.

Official Website: https://www.amd.com/en/developer/aocl/libmem.html

Building AOCL-LibMem

    $ spack install aocl-libmem %aocc

The following is the list of variants available with AOCL-LibMem:

Variant (Default)	Allowed Values	Description
vectorization	avx2, avx512, auto	Use hardware vectorization support
shared	on, off	Build shared library
tunables	on, off	Enable/Disable user input
logging	on, off	Enable/Disable logger

AOCL-Crypto

AOCL-Crypto is a library of basic cryptographic functions optimized and tuned for the AMD “Zen” microarchitecture.

This library provides a unified solution for cryptographic routines such as AES (Advanced Encryption Standard) encryption/decryption routines (CBC, CFB, OFB, CTR, GCM, XTS, CCM, SIV), SHA (Secure Hash Algorithm) routines (SHA2, SHA3, SHAKE), message authentication codes (CMAC, HMAC), and ECDH (Elliptic-curve Diffie–Hellman) and RSA (Rivest, Shamir, and Adleman) key generation functions. AOCL-Crypto supports a dynamic dispatcher feature that executes the optimal function variant, implemented using function multi-versioning, thereby offering a single optimized library that is portable across different x86 CPU architectures.

Official Website: https://www.amd.com/en/developer/aocl/cryptography.html

Building AOCL-Crypto

    $ spack install aocl-crypto %aocc

The following is the list of variants available with AOCL-Crypto:

Variant (Default)	Allowed Values	Description
examples	on, off	Build examples
ipp	on, off	Build Intel IPP library

AOCL-Compression

AOCL-Compression is a software framework of various lossless compression and decompression methods tuned and optimized for AMD “Zen”-based CPUs.

This framework offers a single set of unified APIs for all the supported compression and decompression methods, making it easy for applications to integrate and use them.

AOCL-Compression supports lz4, zlib/deflate, lzma, zstd, bzip2, snappy, and lz4hc based compression and decompression methods along with their native APIs.

The library offers OpenMP-based multi-threaded implementations of the lz4, zlib, zstd, and snappy compression methods. It supports a dynamic dispatcher feature that executes the optimal function variant, implemented using function multi-versioning, thereby offering a single optimized library that is portable across different x86 CPU architectures.

The AOCL-Compression framework is developed in C for UNIX® and Windows® based systems. A test suite is provided for the validation and performance benchmarking of the supported compression and decompression methods.

This suite also supports the benchmarking of IPP compression methods, such as lz4, lz4hc, zlib, and bzip2. The library build framework offers CTest-based testing of the test cases implemented using GTest and the library test suite.

Official Website: https://www.amd.com/en/developer/aocl/compression.html

Building AOCL-Compression

    $ spack install aocl-compression %aocc

The following is the list of variants available with AOCL-Compression:

Variant (Default)	Allowed Values	Description
shared	on, off	Build shared library
openmp	on, off	OpenMP-based multi-threaded compression and decompression
zlib/bzip2/snappy/zstd/lzma/lz4/lz4hc	on, off	By default, these libraries are built, use off to disable any of zlib/bzip2/snappy/zstd/lzma/lz4/lz4hc libraries
decompress_fast	"OFF", "1", "2"	Enable fast decompression modes
enable_fast_math	on, off	Enable fast-math optimizations

AOCL-DA

The AOCL Data Analytics Library (AOCL-DA) is a data analytics library providing optimized building blocks for data analysis. It is written with a C-compatible interface to make it as seamless as possible to integrate with the library from whichever programming language you are using. The intended workflow for using the library is as follows:

load data from memory by reading CSV files or using the in-built da_datastore object
preprocess the data by removing missing values, standardizing, and selecting certain subsets of the data, before extracting contiguous arrays of data from the da_datastore objects
data processing (e.g. principal component analysis, linear model fitting, etc.)

C++ example programs can be found in the examples folder of your installation.

Official Website: https://www.amd.com/en/developer/aocl/data-analytics.html

Building AOCL-DA

    $ spack install aocl-da %aocc

The following is the list of variants available with AOCL-DA:

Variant (Default)	Allowed Values	Description
openmp	on, off	Build using OpenMP and link to threaded BLAS and LAPACK
python	on, off	Build with Python bindings
ilp64	on, off	Build with ILP64 support
shared	on, off	Build shared libraries
examples	on, off	Build examples
gtest	on, off	Build and install Googletest
