AOCL-BLIS

AOCL-BLIS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS). The BLAS were designed to provide the essential kernels of matrix and vector computation, which are the most commonly used and computationally intensive operations in dense numerical linear algebra. Select kernels have been optimized by AMD and others for AMD “Zen”-based processors, such as AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.

AMD offers an optimized version of BLIS, AOCL-BLIS, which supports C, FORTRAN, and C++ template interfaces to the BLAS functionality.
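
For readers new to BLAS, the GEMM family named in the highlights (SGEMM, DGEMM, ZGEMM) computes the update C ← αAB + βC. The following is a minimal pure-Python reference sketch of that arithmetic only. It is not AOCL-BLIS code and does not use the AOCL-BLIS API; the function name `gemm_reference` is hypothetical, and the matrices are assumed to be row-major lists of lists.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c as a new list of lists.

    a is m x k, b is k x n, c is m x n (row-major lists of lists).
    Illustrative only: an optimized BLAS computes the same result
    with blocked, vectorized kernels.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")

    result = []
    for i in range(m):
        out_row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out_row.append(alpha * acc + beta * c[i][j])
        result.append(out_row)
    return result


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[0.0, 0.0], [0.0, 0.0]]
    print(gemm_reference(1.0, a, b, 0.0, c))
```

The triple loop performs m·n·k multiply-adds, which is why GEMM dominates the run time of many dense linear algebra workloads and is a primary optimization target.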

Highlights of AOCL-BLIS 4.0

  • The following LPGEMM variants are added along with post-ops support:
    • aocl_gemm_u8s8s32o32 AVX-512-VNNI optimized
    • aocl_gemm_u8s8s16o16 AVX2 optimized
    • aocl_gemm_bf16bf16bf16 and aocl_gemm_bf16bf16f32 AVX-512 optimized
  • SGEMM with packed/reorder buffer support (aocl_gemm_f32f32f32f32)
  • AMD “Zen4” support for BLIS
  • Dynamic dispatch supports AMD “Zen4” configuration
  • Optimizations and performance improvements for DGEMM, SGEMM, ZGEMM, DGEMMT, and DTRSM
  • Framework design changes

The package containing the AOCL-BLIS library binaries, which include optimizations for AMD processors, along with examples and documentation, is available in the Download section below.

Source code for AOCL-BLIS will be available shortly on GitHub (https://github.com/amd/blis).

AOCL-libFLAME

AOCL-libFLAME is a high-performance implementation of the Linear Algebra PACKage (LAPACK). LAPACK provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. It is extensible, easy to use, and available under an open-source license. libFLAME is a C-only implementation. Applications relying on standard Netlib LAPACK interfaces can use libFLAME with virtually no changes to their source code.
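
To make "solving systems of linear equations" concrete: LAPACK's general solvers are built on LU factorization with partial pivoting, followed by triangular solves. The sketch below shows that idea in plain Python, in the simplified form of Gaussian elimination on an augmented matrix. It is an illustration under stated assumptions, not libFLAME code or its API; the function name `solve_with_partial_pivoting` is hypothetical.

```python
def solve_with_partial_pivoting(a, b):
    """Solve a x = b for a square matrix a (list of lists) and vector b.

    Uses Gaussian elimination with partial pivoting, then back
    substitution. Illustrative only; a LAPACK implementation uses
    blocked algorithms layered on BLAS kernels.
    """
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]

        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


if __name__ == "__main__":
    # 2x + y = 3 and x + 3y = 5
    print(solve_with_partial_pivoting([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

Pivoting on the largest available entry keeps the elimination factors bounded by 1 in magnitude, which is the main reason the pivoted variant is preferred for numerical stability.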

From AOCL 4.0, the AMD-optimized version of libFLAME (AOCL-libFLAME) is compatible with the LAPACK 3.10.1 specification. In combination with the AOCL-BLIS library, which includes optimizations for AMD “Zen”-based processors, libFLAME provides high-performance LAPACK functionality on AMD platforms. AOCL-libFLAME supports C, FORTRAN, and C++ template interfaces (for a subset of APIs) for the LAPACK APIs.

Highlights of AOCL-libFLAME 4.0

  • Upgrade to LAPACK 3.10.1 specification that includes several bug fixes from Netlib LAPACK
  • Improved performance of the following APIs:
    • Eigenvalue routine (ZGGEV)
    • SVD routines (DGESDD, CGESDD, and ZGESDD)
  • Logging feature supports timing for real double precision libFLAME APIs
  • AOCL-Progress feature, which provides progress updates for long-running API computations, is extended to more APIs: {S/C/Z}GETRF, {S/D}POTRF, {S/D}GEQRF, {S/C/D/Z}GBTRF

The packages containing AOCL-libFLAME binaries, examples and documentation are available in the Download section below.

Source code for AOCL-libFLAME will be available shortly on GitHub (https://github.com/amd/libflame).

For prior versions of AOCL-BLIS and AOCL-libFLAME, refer to BLAS Library Archive.

Download:

File Name | Version | Size | Launch Date | OS | Bitness | Description | sha256 Checksum

Binary packages compiled with AOCC 4.0
aocl-blis-linux-aocc-4.0.tar.gz | 4.0 | 25 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | AOCC compiled AOCL-BLIS library binary package | d12b4dbb55598e7eb746d25cfc4e3417927619a4c522c5771208154dd21a4391
aocl-libflame-linux-aocc-4.0.tar.gz | 4.0 | 34 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | AOCC compiled AOCL-libFLAME library binary package | 094021a92a3fce5c10eebe09ead85df983df876beb44d1bbb6223fc3a70ee8d1
aocl-blis-hpl-mt-aocc-avx2-4.0.0.tar.gz | 4.0 | 18.9 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | AOCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AOCL-BLIS library | 85b2a1cecf34376662f5b2826a15f5a378e520e839286e30a43d8c427c2367e5
aocl-blis-hpl-mt-aocc-avx512-4.0.0.tar.gz | 4.0 | 19.9 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | AOCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AOCL-BLIS library | 6cf15b59101b99536354c45f543c94ba9e7373dbb70f917adeac81d6d48994e8

Binary packages compiled with GCC 11.2
aocl-blis-linux-gcc-4.0.tar.gz | 4.0 | 28 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | GCC compiled AOCL-BLIS library binary package | 5a3e67bfa504c2a8cb2a6e1d2bed017e9487ceb22ca5b3f367f084d6f73d0137
aocl-libflame-linux-gcc-4.0.tar.gz | 4.0 | 36 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | GCC compiled AOCL-libFLAME library binary package | 01b587be9e8bea873a6f93b8150ec080d9fa11e3edcba8c33efbaf7f4d7ebae7
aocl-blis-hpl-mt-gcc-avx2-4.0.0.tar.gz | 4.0 | 32.5 MB | 11/10/2022 | Ubuntu, SLES, CentOS, and RHEL | 64-bit | GCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AOCL-BLIS library | 297ab8eaa073826501eff07a3061ddc22ab7d7b50d55a6dc5b678554fe39772b
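
Each download above is listed with a sha256 checksum, which can be used to confirm that the file was not corrupted in transit. The sketch below uses Python's standard `hashlib` module to compare a local file against a listed value. The helper names `sha256_of_file` and `matches_checksum` are hypothetical, and the example assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    archive = Path("aocl-blis-linux-aocc-4.0.tar.gz")
    expected = "d12b4dbb55598e7eb746d25cfc4e3417927619a4c522c5771208154dd21a4391"
    if archive.exists():
        print("checksum OK" if matches_checksum(archive, expected) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

On a mismatch, the safest course is to discard the file and download it again rather than unpack it.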