AOCL-BLAS

AOCL-BLAS provides a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS). The BLAS provide the essential kernels of matrix and vector computation, which are the most commonly used computationally intensive operations in dense numerical linear algebra. Select kernels have been optimized for AMD “Zen”-based processors, including AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.

AOCL-BLAS is developed as a forked version of BLIS (https://github.com/flame/blis), which is developed by members of the Science of High-Performance Computing (SHPC) group in the Institute for Computational Engineering and Sciences at The University of Texas at Austin and other collaborators (including AMD). All known features and functionalities of BLIS are retained and supported in the AOCL-BLAS library, along with the standard BLAS and CBLAS interfaces. C++ template interfaces for BLAS functionalities are also included.
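
To make the kind of kernel concrete, here is a minimal reference sketch of the operation that the BLAS GEMM routines (such as DGEMM) compute: C ← alpha·A·B + beta·C. It is a plain triple loop written for clarity, not the AOCL-BLAS implementation, and the function name `gemm_reference` and the list-of-rows storage are assumptions made for illustration only.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c.

    Matrices are lists of rows. This is a hypothetical reference helper that
    only illustrates the GEMM semantics; it is not how an optimized BLAS
    library implements the routine.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape m x n")

    result = []
    for i in range(m):
        out_row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out_row.append(alpha * acc + beta * c[i][j])
        result.append(out_row)
    return result


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[0.0, 0.0], [0.0, 0.0]]
    print(gemm_reference(1.0, a, b, 0.0, c))
```

An optimized library computes the same result but blocks the loops for cache reuse and uses SIMD instructions, which is where the processor-specific tuning comes in.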

Highlights of AOCL-BLAS 5.0

  • Turin-specific tuning for the following APIs:
    • D/ZGEMM, DTRSM, and DNRM2
  • AVX512 improvements for the following APIs:
    • ZGEMV, D/ZAXPYF, D/ZDOTXF, ZDOTV, C/ZSCALV, DNRM2, S/D/ZCOPY, S/D/C/ZAXPBYV, DTRSV, DGEMMT, D/ZTRSM, and D/ZGEMM
  • Improvements to the AOCL_ENABLE_INSTRUCTIONS functionality
  • Additional APIs and Post-Ops support in the aocl_gemm add-on, along with improved performance for its existing APIs

The AOCL-BLAS package, which contains library binaries optimized for AMD processors along with examples and documentation, is available in the Downloads section.

Documentation

Source code: GitHub.

 

AOCL-LAPACK

AOCL-LAPACK is a high-performance implementation of the Linear Algebra PACKage (LAPACK). LAPACK provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. It is extensible, easy to use, and available under an open-source license. Applications that rely on the standard Netlib LAPACK interfaces can use AOCL-LAPACK with virtually no changes to their source code. AOCL-LAPACK supports C, Fortran, and C++ template interfaces (the last for a subset of APIs) to the LAPACK APIs.

AOCL-LAPACK is compatible with the LAPACK 3.11.0 specification. In combination with the AOCL-BLAS library, which includes optimizations for AMD “Zen”-based processors, AOCL-LAPACK delivers high-performance LAPACK functionality on AMD platforms.
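
As an illustration of the "systems of linear equations" problem class, the sketch below solves A·x = b by Gaussian elimination with partial pivoting, the textbook approach behind LU-based solvers such as DGESV. It is a pure-Python reference written for readability, not the AOCL-LAPACK implementation; the function name `solve_linear_system` is an assumption made for this example.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b for a square matrix `a` given as a list of rows.

    Hypothetical reference helper: Gaussian elimination with partial
    pivoting followed by back substitution. Inputs are not modified.
    """
    n = len(a)
    if any(len(row) != n for row in a) or len(b) != n:
        raise ValueError("a must be n x n and b must have length n")

    work = [[float(v) for v in row] for row in a]
    x = [float(v) for v in b]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(work[r][col]))
        if work[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != col:
            work[col], work[pivot] = work[pivot], work[col]
            x[col], x[pivot] = x[pivot], x[col]

        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = work[row][col] / work[col][col]
            for j in range(col, n):
                work[row][j] -= factor * work[col][j]
            x[row] -= factor * x[col]

    # Back substitution on the resulting upper-triangular system.
    for row in range(n - 1, -1, -1):
        tail = sum(work[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (x[row] - tail) / work[row][row]
    return x


if __name__ == "__main__":
    print(solve_linear_system([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

A production LAPACK library performs the same mathematics in blocked form so that most of the work is done by optimized BLAS kernels.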

Highlights of AOCL-LAPACK 5.0

  • Improved performance of the following APIs through AVX2 and AVX512 SIMD instructions:
    • Double Precision SVD (DGESVD)
    • LU Factorization/Solver routines for general matrices (DGETRF, ZGETRF, DGETRS, and DGESV)
    • Matrix inverse routine DGETRI for small sizes
    • Least-squares solver DGELS for small sizes
    • Double Precision Auxiliary routine DLARFG
  • Improved performance of the following APIs using local AOCL-BLAS optimized kernels:
    • LU Factorization/Solver routines for band storage matrices (DGBTRF and DGBTRS)
  • Option to set specific ISA code path at runtime through the AOCL_ENABLE_INSTRUCTIONS environment variable
  • Sphinx-based AOCL-LAPACK API documentation
  • pkgconfig support on Linux with CMake builds
  • LAPACK API modifications:
    • Updated the return types of AOCL-LAPACK APIs to match the corresponding Netlib subroutine prototypes
    • Removed the xerbla and lsame definitions from AOCL-LAPACK. Applications must invoke lsame from the BLAS library
  • Test suite framework enhancements:
    • Improved accuracy tests including testing with different input generation mechanisms
    • Addition of extreme values, negative, and corner test cases
    • Addition of cases to test numerical stability
    • Support for LAPACKE interface test
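
The runtime ISA selection mentioned in the highlights is driven by an environment variable, so it can be set per process without rebuilding anything. The sketch below launches a child process with AOCL_ENABLE_INSTRUCTIONS set. The helper name `run_with_isa` is hypothetical, and the value "avx2" is an assumption used for illustration; check the AOCL documentation for the values your release accepts.

```python
import os
import subprocess
import sys


def run_with_isa(command, isa):
    """Run `command` with AOCL_ENABLE_INSTRUCTIONS set to `isa`.

    Hypothetical helper: it copies the current environment, sets the
    variable for the child process only, and returns the completed process.
    """
    env = dict(os.environ)
    env["AOCL_ENABLE_INSTRUCTIONS"] = isa  # value is an assumption, see docs
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in for an application linked against AOCL: a child Python
    # process that simply echoes the variable it received.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['AOCL_ENABLE_INSTRUCTIONS'])",
    ]
    print(run_with_isa(child, "avx2").stdout.strip())
```

Setting the variable only in the child's environment keeps the choice local to one run, which is convenient when comparing code paths on the same machine.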

Documentation

Downloads

Binary packages compiled with AOCC 5.0

  • aocl-blis-linux-aocc-5.0.0.tar.gz
    • Version: 5.0
    • Size: 28 MB
    • Launch Date: 10/10/2024
    • OS: RHEL, Ubuntu, SLES
    • Bitness: 64-bit
    • Description: AOCC-compiled AOCL-BLAS library binary package
    • SHA-256 checksum: 69d7390d47265a0a81cf2911426caf0551f69ddface9b3b895c391622b9bfdcb
  • aocl-libflame-linux-aocc-5.0.0.tar.gz
    • Version: 5.0
    • Size: 29 MB
    • Launch Date: 10/10/2024
    • OS: RHEL, Ubuntu, SLES
    • Bitness: 64-bit
    • Description: AOCC-compiled AOCL-LAPACK library binary package
    • SHA-256 checksum: bbb5bfd8d25851f440db186f5d73b6901b8c1e49245b4c4b300a0c0ac76ec21a

Binary packages compiled with GCC 13.2.1

  • aocl-blis-linux-gcc-5.0.0.tar.gz
    • Version: 5.0
    • Size: 40 MB
    • Launch Date: 10/10/2024
    • OS: RHEL, Ubuntu, SLES
    • Bitness: 64-bit
    • Description: GCC-compiled AOCL-BLAS library binary package
    • SHA-256 checksum: 185bcb1e33507f3b2bd85215c94bc913baead7ab2bd17664b1dcb82aceee737e
  • aocl-libflame-linux-gcc-5.0.0.tar.gz
    • Version: 5.0
    • Size: 31 MB
    • Launch Date: 10/10/2024
    • OS: RHEL, Ubuntu, SLES
    • Bitness: 64-bit
    • Description: GCC-compiled AOCL-LAPACK library binary package
    • SHA-256 checksum: 6380f4b0d8bd3ba893be670c3df02da5b999305f883fe73a0c2617e69cd24b7c

Windows installer containing AOCL-BLAS and AOCL-LAPACK

  • AOCL_Windows-setup-5.0.0.384-AMD.exe
    • Version: 5.0
    • Size: 104 MB
    • Launch Date: 10/10/2024
    • OS: Windows 11, Windows 10
    • Bitness: 64-bit
    • Description: Windows installer file containing all the AOCL library binaries compiled with Clang 17
    • SHA-256 checksum: 026405b98e2cf3c529bacdf76eb6e43935b639ed2ab8e90cba22bb992ecf13de
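
The published SHA-256 values can be used to check a downloaded package before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `verify_download` are assumptions for this example, and the file path in the usage section assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large archives do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's SHA-256 digest matches `expected_hex`."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumes the archive was downloaded into the current directory.
    package = "aocl-blis-linux-gcc-5.0.0.tar.gz"
    expected = "185bcb1e33507f3b2bd85215c94bc913baead7ab2bd17664b1dcb82aceee737e"
    print("checksum OK" if verify_download(package, expected) else "checksum MISMATCH")
```

A mismatch usually means an incomplete or corrupted download, so the safest response is to fetch the file again rather than install it.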