AMD Regression On Linux 5.11 Being Addressed By New CPUFreq Patches

Written by Michael Larabel in AMD on 4 February 2021 at 02:58 PM EST.
The AMD "frequency invariance" saga with Linux 5.11 continues... While there was a patch to address the previously noted performance regression caused by the introduction of frequency invariance and seen when using the Schedutil governor, a new CPUFreq-side patch series has been proposed instead. Both approaches address the performance issue with this new kernel on AMD Zen 2 / Zen 3 systems.

It started out prior to Christmas when I noticed Linux 5.11 regressing on AMD systems when tested out of the box. On Christmas I outlined the findings of the AMD performance drops on Linux 5.11 and had narrowed it down to the frequency invariance support added this cycle. When using the default Schedutil governor, it was now easy to encounter slower performance relative to Linux 5.10 and prior.
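For readers wanting to check which governor and CPUFreq driver their own system is using, the standard cpufreq sysfs files can be read directly. The following is only a small sketch: the helper name `read_cpufreq_policy` and its `sysfs_root` parameter are hypothetical conveniences, while the file names `scaling_driver`, `scaling_governor`, and `cpuinfo_max_freq` are the usual cpufreq policy attributes.

```python
from pathlib import Path


def read_cpufreq_policy(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Read a few cpufreq policy attributes for one CPU.

    Returns a dict mapping attribute name to its string value, or None
    when the file is missing or unreadable (e.g. no cpufreq driver loaded).
    """
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"
    info = {}
    for name in ("scaling_driver", "scaling_governor", "cpuinfo_max_freq"):
        try:
            info[name] = (base / name).read_text().strip()
        except OSError:
            info[name] = None
    return info


if __name__ == "__main__":
    # cpuinfo_max_freq is reported in kHz.
    for key, value in read_cpufreq_policy().items():
        print(f"{key}: {value}")
```

A `scaling_governor` of `schedutil` indicates the governor discussed in this article is in use.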

Later in January, SUSE engineer Giovanni Gherdovich, who worked on the AMD frequency invariance implementation, proposed a fix for the regression. Testing confirmed that AMD performance was in much better shape with the patched Linux 5.11 and in some cases even better off than with Linux 5.10.

So all looked good with this proposed patch, and then it was just a matter of waiting for it to be mainlined... Well, this week when I inquired about it being mainlined, Linux power management subsystem maintainer Rafael Wysocki of Intel noted some areas for improvement with it... In turn, Rafael wrote a new patch even without being able to test it on AMD hardware. Rather than modifying Schedutil, his approach was to improve CPUFreq.

He noted with the prototype patch, "What the patch below does is to add an extra entry to the frequency table for each CPU to represent the maximum "boost" frequency, so as to cause that frequency to be used as cpuinfo.max_freq. The reason why I think it is better to extend the frequency tables instead of simply increasing the frequency for the "P0" entry is because the latter may cause "turbo" frequency to be asked for less often."
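To make the idea in that quote more concrete, here is a minimal Python sketch of extending a frequency table with an extra boost entry rather than raising the "P0" entry. It is not the kernel code (the real change is in C, in the acpi_cpufreq driver); the helper names `extend_with_boost` and `cpuinfo_max_freq` and all of the frequencies are hypothetical and chosen only for illustration.

```python
from typing import List


def extend_with_boost(pss_freqs_khz: List[int], boost_khz: int) -> List[int]:
    """Return a new table (highest frequency first) with an extra entry
    for the maximum boost frequency placed above the existing "P0" entry.

    The original entries are left untouched, so "P0" remains selectable.
    """
    table = sorted(pss_freqs_khz, reverse=True)
    if not table or boost_khz > table[0]:
        table.insert(0, boost_khz)
    return table


def cpuinfo_max_freq(table_khz: List[int]) -> int:
    """In this sketch, cpuinfo.max_freq is simply the highest table entry."""
    return max(table_khz)


# Hypothetical per-CPU table; 3.0 GHz plays the role of the "P0" entry.
pss = [3_000_000, 2_200_000, 1_400_000]
extended = extend_with_boost(pss, boost_khz=4_000_000)

print("before:", pss, "max =", cpuinfo_max_freq(pss))
print("after: ", extended, "max =", cpuinfo_max_freq(extended))
```

Keeping the original "P0" entry as a distinct step is the point of the design choice described in the quote: the governor can still ask for the nominal frequency as well as for the boost range.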


Testing so far of that new CPUFreq patch is quite positive. I have been hammering that patched kernel on multiple AMD EPYC servers and Ryzen laptops/desktops for the past day and will have more details in the hours ahead for the developers. But long story short, this new patch also takes care of the regressed performance on Linux 5.11 and replaces the prior patch from last month.

With those results looking good, Rafael has spun up the patch series formally to address this issue. Here is the patch summary of the changed CPUFreq behavior in addressing the problem:
The source of the problem is that the maximum performance level taken for computing the arch_max_freq_ratio value used in the x86 scale-invariance code is higher than the one corresponding to the cpuinfo.max_freq value coming from the acpi_cpufreq driver.

This effectively causes the scale-invariant utilization to fall below 100% even if the CPU runs at cpuinfo.max_freq or slightly faster, so the schedutil governor selects a frequency below cpuinfo.max_freq then. That frequency corresponds to a frequency table entry below the maximum performance level necessary to get to the "boost" range of CPU frequencies.

However, if the cpuinfo.max_freq value coming from acpi_cpufreq was higher, the schedutil governor would select higher frequencies which in turn would allow acpi_cpufreq to set more adequate performance levels and to get to the "boost" range of CPU frequencies more often.

This issue affects any systems where acpi_cpufreq is used and the "boost" (or "turbo") frequencies are enabled, not just AMD EPYC. Moreover, commit db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") from the 5.10 development cycle made it extremely easy to default to schedutil even if the preferred driver is acpi_cpufreq as long as intel_pstate is built too, because the mere presence of the latter effectively removes the ondemand governor from the defaults. Distro kernels are likely to include both intel_pstate and acpi_cpufreq on x86, so their users who cannot use intel_pstate or choose to use acpi_cpufreq may easily be affected by this issue.

To address this issue, extend the frequency table constructed by acpi_cpufreq for each CPU to cover the entire range of available frequencies (including the "boost" ones) if CPPC is available and indicates that "boost" (or "turbo") frequencies are enabled. That causes cpuinfo.max_freq to become the maximum "boost" frequency of the given CPU (instead of the maximum frequency returned by the ACPI _PSS object that corresponds to the "nominal" performance level).
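The arithmetic behind that summary can be illustrated with a simplified model. This is a sketch under stated assumptions, not the kernel's implementation: it assumes schedutil targets roughly 1.25 x cpuinfo.max_freq x utilization and resolves the target to the lowest table entry at or above it, and it assumes the scale-invariant utilization of a fully busy CPU is the current frequency divided by the maximum boost frequency. The function names and the frequencies (2.0 GHz nominal, 3.35 GHz boost) are illustrative, not measured values.

```python
def invariant_util(busy_fraction, cur_khz, max_boost_khz):
    """Scale-invariant utilization: busy time scaled by cur/max_boost."""
    return busy_fraction * cur_khz / max_boost_khz


def schedutil_target(util, cpuinfo_max_khz):
    """Simplified schedutil request: 1.25 * max_freq * util (assumed model)."""
    return 1.25 * cpuinfo_max_khz * util


def resolve(table_khz, target_khz):
    """Pick the lowest table entry at or above the target, else the highest."""
    at_or_above = [f for f in table_khz if f >= target_khz]
    return min(at_or_above) if at_or_above else max(table_khz)


NOMINAL_KHZ = 2_000_000   # illustrative "P0" / nominal frequency
BOOST_KHZ = 3_350_000     # illustrative maximum boost frequency

pss_table = [2_000_000, 1_800_000, 1_500_000]   # table without a boost entry
extended_table = [BOOST_KHZ] + pss_table        # table with a boost entry

# A CPU that is 100% busy while running at the nominal frequency.
util = invariant_util(1.0, NOMINAL_KHZ, BOOST_KHZ)

before = resolve(pss_table, schedutil_target(util, max(pss_table)))
after = resolve(extended_table, schedutil_target(util, max(extended_table)))

print(f"utilization seen by schedutil: {util:.3f}")
print(f"selected with nominal cpuinfo.max_freq: {before} kHz")
print(f"selected with boost cpuinfo.max_freq:   {after} kHz")
```

With these illustrative numbers, the fully busy CPU is steered to the 1.5 GHz entry when cpuinfo.max_freq is the nominal frequency, but to the 3.35 GHz boost entry once the table is extended, which matches the behavior the patch summary describes.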

Long story short, the fix for this AMD regression on Linux 5.11 is still pending but will hopefully be in mainline within the coming days. Testing is ongoing, but all the data I have seen so far confirms that this new patch series works out well. It's getting late in the Linux 5.11 cycle, but this regression fix is still expected to land in time. For those curious, I'll have some new benchmark numbers out in the next day or two from this testing. For now, back to benchmarking.

UPDATE (5 Feb 2021): Benchmarks and more details on the new patch.
