Intel Continues Tuning Glibc's Performance: More FMA'ing
Intel continues contributing performance optimizations to the GNU C Library (glibc) so that various functions can make use of modern processor instruction set extensions.
Glibc this year has seen FMA optimizations, its per-thread cache enabled, AVX optimizations, and other performance work contributed in large part by Intel engineers. Glibc isn't gaining weight this holiday season but is continuing to be optimized for speed.
Yesterday there was sinf() tuning, including an FMA-tuned version. Longtime Binutils/GCC developer H.J. Lu of Intel found that his fused multiply-add version of the sinf call is 54% faster on average, with a 25% improvement for the minimum and a 54% improvement for the maximum time. Tests were done on Skylake hardware.
This is part of various benchmarking work going on right now for glibc.
These optimizations, along with other new features, will be found in Glibc 2.27. Another recent feature addition to Glibc is memory protection key support.