LLVM Dealing With Slower Performance On AMD CPUs When Targeting AMD Zen Optimizations

Written by Michael Larabel in AMD on 9 May 2024 at 04:23 PM EDT.
Recently there was an LLVM bug report titled "Worse runtime performance on Zen CPU when optimizing for Zen." Well, that's not good... Fortunately, that bug is now fixed in the latest LLVM/Clang compiler code, but other deficiencies in the AMD CPU optimization targeting remain.

Opened last week was the bug report "[X86] Worse runtime performance on Zen CPU when optimizing for Zen." Using a sample code snippet, it demonstrated that building with "-march=znver4" on an AMD Ryzen 9 7950X processor produced code that ran around 25% slower than when using the more generic "-march=x86-64-v4" or even the "-march=x86-64" baseline.
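For readers who want to try this kind of comparison themselves, here is a minimal sketch. It is not the snippet from the bug report: the kernel, the function names, and the -O2 optimization level are assumptions made for illustration. It is division-heavy only because the fix discussed in this article concerns 64-bit division tuning.

```c
/*
 * Hypothetical micro-benchmark kernel, NOT the code from the LLVM bug report.
 * Assumed build commands (the -O2 level and file names are assumptions):
 *   clang -O2 -march=znver4    bench.c -o bench_znver4
 *   clang -O2 -march=x86-64-v4 bench.c -o bench_v4
 *   clang -O2 -march=x86-64    bench.c -o bench_base
 */
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Sums num[i] / den[i] over n elements; every den[i] must be non-zero. */
uint64_t sum_quotients(const uint64_t *num, const uint64_t *den, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += num[i] / den[i];
    }
    return total;
}

/*
 * Times one pass of sum_quotients() using the CPU clock.
 * The result is written to *checksum so the work cannot be optimized away.
 * Returns the elapsed CPU time in seconds.
 */
double time_sum_quotients(const uint64_t *num, const uint64_t *den,
                          size_t n, uint64_t *checksum)
{
    clock_t start = clock();
    *checksum = sum_quotients(num, den, n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_sum_quotients() on large arrays from each of the builds and comparing the reported times gives a rough picture. A serious comparison would repeat the measurement several times and use a higher-resolution timer.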

As of today this bug report has been closed thanks to this LLVM commit: [X86] Enable TuningSlowDivide64 on Barcelona/Bobcat/Bulldozer/Ryzen Families. That commit message explained:
"Despite most AMD cpus having a lower latency for i64 divisions that converge early, we are still better off testing for values representable as i32 and performing a i32 division if possible.

All AMD cpus appear to have been missed when we added the "idivq-to-divl" attribute - this patch now matches Intel cpu behaviour (and the x86-64/v2/3/4 levels)."
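To make the idea in that commit message concrete, here is a hand-written C sketch of the kind of check being described: test whether both operands fit in 32 bits and, if so, use the cheaper 32-bit divide. The function name is hypothetical and this is only an illustration of the concept, not the code LLVM actually emits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration of a "64-bit divide with 32-bit fast path".
 * The compiler performs an equivalent check inline when the tuning is
 * enabled; this helper only mirrors the idea in plain C.
 */
uint64_t udiv64_with_bypass(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0);

    /* If no bit is set in the upper 32 bits of either operand,
       both values are representable in 32 bits. */
    if (((dividend | divisor) >> 32) == 0) {
        /* Narrow divide: maps to the faster 32-bit division instruction. */
        return (uint64_t)((uint32_t)dividend / (uint32_t)divisor);
    }

    /* Otherwise fall back to the full 64-bit division. */
    return dividend / divisor;
}
```

The extra compare-and-branch costs very little next to a 64-bit divide, which is why the commit message argues the check is worthwhile even on AMD CPUs whose 64-bit divider already finishes early for small values.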

While it's good that the original bug report over slower performance when engaging AMD Zen 4 (znver4) tuning is closed, that's not the end of the story. Another bug report has since been opened for LLVM: "[X86] Worse runtime performance on Zen 4 CPU when optimizing for znver4 or skylake."
"The following code runs around 300% slower on Zen 4 when optimized for znver4 or skylake than when optimized for znver3 or other targets."

That newer bug report found a code snippet that is 300% slower on Zen 4 when using "-march=znver4" (or even the common "-march=skylake" baseline) than when optimized for Zen 3 ("-march=znver3") or other CPU targets. An AMD compiler engineer has been assigned to that bug, so hopefully it will be root-caused and addressed soon.

AMD has been working a lot on its AMD Optimizing C/C++ Compiler (AOCC) downstream of LLVM, but upstream LLVM/Clang and GCC are where AMD could benefit from greater investment. While AMD was traditionally very slow in upstreaming new CPU targets to the open-source compilers, with GCC 14 it did provide early Zen 5 (znver5) support ahead of launch. That at least enables all of the new Zen 5 CPU ISA extensions, but it doesn't yet provide a new cost table: the costs are still carried over from znver4. Upstream LLVM doesn't yet have any znver5 support, but it isn't quite as pressing there thanks to frequent point releases and six-month feature releases, a big improvement over GCC's slower release cadence (months between point releases and annual feature releases). AMD is still behind Intel's open-source compiler punctuality: GCC 14 introduces Intel CPU targets for Lunar Lake, Panther Lake, Clearwater Forest, and other CPU cores further out than Zen 5.

With compiler bugs like these noted, there's still a lot of potential for additional performance gains if AMD continues ramping up its GCC and LLVM/Clang upstream engineering. That matters especially as EPYC continues enjoying great HPC adoption, where compiler optimizations tend to be more commonplace; as developers continue embracing Ryzen and Threadripper processors for their own machines; and as high-core-count servers like AMD EPYC "Bergamo" make for great CI/CD deployments with speedy build times.
