Following LTO, Linux Kernel Patches Updated For PGO To Yield Faster Performance

Written by Michael Larabel in LLVM on 14 January 2021 at 09:09 AM EST.
Clang LTO support for the Linux kernel, which provides link-time optimizations for more performant kernel binaries (plus Clang CFI support), looks like it will land in Linux 5.12. With that compiler optimization feature appearing squared away, Google engineers are also working on Clang PGO support for the Linux kernel to exploit profile-guided optimizations for further enhancing kernel performance.

Google engineers on Tuesday posted their latest patches providing the necessary kernel infrastructure for Clang Profile Guided Optimizations (PGO). This is more complicated than LTO support because PGO relies on first collecting profiles at run-time and then feeding them back to the compiler, which generates a more optimized binary based on that actual run-time behavior.

PGO can significantly improve run-time performance, as shown by the likes of Mozilla Firefox supporting PGO builds. But PGO overall isn't too widely used by upstream open-source projects since it effectively requires two compiler builds and also needs accurate profiles that are representative of real-world workloads. Without accurate profiles, the PGO performance benefits are greatly diminished.

The work by Google on Clang PGO support ends up being over one thousand lines of new kernel code. The kernel exposes its PGO counters through debugfs, where they can be collected from /sys/kernel/debug/pgo/profraw. That kernel profile data can then be processed with LLVM's llvm-profdata tool and fed back into Clang during the PGO-optimized build.
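As a rough illustration of that collect, process, and rebuild cycle, the Python sketch below only assembles the command lines involved and does not run them. The debugfs path comes from the patches as described above. The working directory, the file names, and the exact make invocation are assumptions made for illustration, and the patch series' own documentation should be treated as authoritative for the real procedure.

```python
from pathlib import PurePosixPath
from typing import List

# Path named in the article; everything else below is an assumption.
PROFRAW_DEBUGFS = PurePosixPath("/sys/kernel/debug/pgo/profraw")


def pgo_commands(workdir: str = "/tmp/pgo") -> List[List[str]]:
    """Return the commands for one collect -> merge -> rebuild cycle.

    The commands are returned as argument lists rather than executed,
    so the sketch can be inspected without a PGO-instrumented kernel.
    """
    work = PurePosixPath(workdir)
    raw = work / "vmlinux.profraw"       # hypothetical file name
    merged = work / "vmlinux.profdata"   # hypothetical file name
    return [
        # 1. After running a representative workload on the instrumented
        #    kernel, copy the raw counters out of debugfs.
        ["cp", str(PROFRAW_DEBUGFS), str(raw)],
        # 2. Convert the raw profile into the indexed format Clang reads.
        ["llvm-profdata", "merge", "-o", str(merged), str(raw)],
        # 3. Rebuild the kernel with the profile applied (assumed invocation).
        ["make", "LLVM=1", f"KCFLAGS=-fprofile-use={merged}"],
    ]


if __name__ == "__main__":
    for cmd in pgo_commands():
        print(" ".join(cmd))
```

Returning argument lists keeps the sketch side-effect free; someone actually trying this would run the first step as root on the profiled machine and the remaining steps on the build host.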

While Google's current Clang LTO support is focused on AArch64, the PGO support has so far been focused on x86/x86_64. It should work on other architectures and will be enabled there once it has received sufficient testing.

These latest Clang PGO patches for the Linux kernel are up for review on the mailing list.

It will be fun to try out a Clang PGO'ed kernel once the infrastructure is hopefully mainlined, but it is unlikely to be widely used given the complexity involved and the need for accurate profiles that are representative of a given user's workload. It will thus mostly be a niche feature, but it should pay off for those running specialized and dedicated server workloads.

This infrastructure is just for Clang's PGO support and does not (currently) support the GCC PGO functionality.
