OpenBLAS 0.3.26 Brings More x86_64 Optimizations, Better LoongArch64 & ARM64

Written by Michael Larabel in Programming on 7 January 2024 at 08:49 AM EST.
OpenBLAS 0.3.26 was released this week as the newest feature update to this open-source Basic Linear Algebra Subprograms (BLAS) library.

OpenBLAS 0.3.26 features much faster GESV performance for small problem sizes, pulls in various fixes from the reference LAPACK code, makes various build system improvements, and includes a number of architecture-specific optimizations and fixes.
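
For readers unfamiliar with GESV: it is the LAPACK driver that solves a square linear system A x = b using LU factorization with partial pivoting. The following is a minimal pure-Python sketch of that computation, for illustration only. The function name `solve_gesv_sketch` is hypothetical, and the code does not reflect how OpenBLAS implements or optimizes the routine.

```python
def solve_gesv_sketch(a, b):
    """Solve a x = b for a square matrix `a` (list of rows) and vector `b`.

    Hypothetical illustration of what GESV computes: LU factorization
    with partial pivoting followed by back substitution.
    """
    n = len(a)
    # Work on copies; the real LAPACK routine overwrites its inputs in place.
    lu = [row[:] for row in a]
    x = b[:]

    for k in range(n):
        # Partial pivoting: choose the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            x[k], x[p] = x[p], x[k]

        # Eliminate column k below the pivot, applying the same
        # row operations to the right-hand side.
        for i in range(k + 1, n):
            m = lu[i][k] / lu[k][k]
            lu[i][k] = m
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]
            x[i] -= m * x[k]

    # Back substitution with the upper-triangular factor.
    for i in range(n - 1, -1, -1):
        s = x[i]
        for j in range(i + 1, n):
            s -= lu[i][j] * x[j]
        x[i] = s / lu[i][i]
    return x
```

For a small system such as this, the cost of the arithmetic is tiny, so call overhead and threading decisions can matter a great deal, which is presumably why small problem sizes were singled out for tuning.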

On the x86_64 side, OpenBLAS 0.3.26 fixes the CASUM computation on Skylake-X and newer targets in cases where AVX-512 is not supported, includes other AVX-512 related fixes, works around a problem in the pre-AVX kernel for GEMV, and speeds up thread management on Microsoft Windows.
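
As a reminder of what CASUM is expected to return: for a single-precision complex vector, the BLAS routine sums the absolute values of the real and imaginary parts of each element, which is not the same as summing the complex magnitudes. A small reference sketch in Python follows. The name `casum_reference` is hypothetical, and this is only a way to state the expected result, not OpenBLAS code.

```python
def casum_reference(x):
    """Return sum(|Re(z)| + |Im(z)|) over a sequence of complex numbers.

    Hypothetical reference for the quantity the BLAS CASUM routine computes.
    """
    return sum(abs(z.real) + abs(z.imag) for z in x)


# Example: |3| + |4| + |-1| + |-2| = 10, whereas the sum of complex
# magnitudes would be 5 + sqrt(5).
example = casum_reference([3 + 4j, -1 - 2j])
```

A scalar reference like this is the sort of check an optimized kernel's output can be compared against when a SIMD code path is suspected of returning wrong sums.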

OpenBLAS 0.3.26 also fixes several issues on ARM64 (AArch64), provides some new optimizations for Neoverse-V1 along with other performance tuning, adds support for the Apple M1 and newer targets in DYNAMIC_ARCH builds, and more. There are also various IBM POWER optimizations and new or improved optimized kernels for almost all BLAS functions on LoongArch64.

Downloads and more details on the OpenBLAS 0.3.26 release are available via GitHub.