AMD AOCC 2.2 Helping Squeeze Extra Performance Out Of AMD EPYC 7002 "Rome" CPUs
AOCC 2.2 takes much longer than GCC 10.2 or Clang 10 to compile code. That is not unexpected given the extra optimization passes AMD has added to squeeze more performance out of the resulting binary. It is a trade of longer build times for ideally faster run-time performance.
GCC still produces a faster OpenSSL binary than either AOCC or Clang.
For most of the other workloads tested, AOCC 2.2 provided minor advantages on top of Clang 10.
Taking the geometric mean of all the benchmarks carried out, AOCC 2.2 was slightly faster than Clang 10.0.1, and both were about 20% faster than GCC 10.2 on this AMD EPYC 7742 2P server with the particular workloads tested.
Dropping the timed compilation results, where AOCC is obviously slower due to its extra optimizations, and looking solely at the tests that measure the performance of the resulting binary, the benefits of AOCC 2.2 are even clearer. In that case, AOCC 2.2 is about 6% faster than Clang 10.0.1.
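For readers who want to see how such a comparison can be put together, here is a minimal Python sketch of a geometric mean over per-test results, computed once with and once without a timed-compilation test. The test names and all numbers are made-up placeholders, not results from this article, and the helper functions are hypothetical; this is not necessarily how the figures above were produced.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed through logarithms."""
    if not values:
        raise ValueError("need at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_scores(results, baseline, lower_is_better):
    """Turn raw results into speedups relative to a baseline compiler.

    A score above 1.0 means the candidate beat the baseline on that test.
    """
    scores = []
    for test, value in results.items():
        base = baseline[test]
        if test in lower_is_better:
            scores.append(base / value)   # e.g. seconds: smaller is better
        else:
            scores.append(value / base)   # e.g. ops/sec: larger is better
    return scores


# Placeholder data only; these are NOT numbers from the article.
baseline = {"timed_compile_s": 100.0, "crypto_ops": 1000.0, "render_s": 50.0}
candidate = {"timed_compile_s": 125.0, "crypto_ops": 900.0, "render_s": 40.0}
lower_is_better = {"timed_compile_s", "render_s"}

# Geometric mean across every test, build time included.
overall = geometric_mean(relative_scores(candidate, baseline, lower_is_better))

# Same calculation after dropping the timed-compilation test.
runtime_only = {k: v for k, v in candidate.items() if k != "timed_compile_s"}
runtime_overall = geometric_mean(
    relative_scores(runtime_only, baseline, lower_is_better)
)

print(f"all tests:     {overall:.3f}x baseline")
print(f"run-time only: {runtime_overall:.3f}x baseline")
```

With these placeholder values, the compiler that builds more slowly looks worse overall but comes out ahead once only the run-time tests are counted, which is the same kind of shift described above.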
Back when AMD first introduced AOCC during the Zen 1 days, it was much rarer to see measurable advantages for AOCC over the Clang or GCC releases of the time. Fortunately, after a few years of optimization work on their LLVM/Clang-based compiler, AOCC 2.x has been showing more potential for squeezing added performance out of their processors. Hopefully more of these Zen 2 optimizations will be upstreamed to LLVM/Clang in the near future.