Linus Torvalds Hits Nasty Performance Regression With Early Linux 6.8 Code

Written by Michael Larabel in Linux Kernel on 10 January 2024 at 08:30 PM EST.
It's not often that Linus Torvalds himself raises the alarm over performance regressions in the Linux kernel, but that happened this evening during the ongoing Linux 6.8 merge window. Torvalds' AMD Ryzen Threadripper system was suddenly suffering from much longer build times as a result of new code for this kernel.

Catching my attention this evening was this message by Linus Torvalds over a "horrendous performance regression" with code slated for Linux 6.8. He noted:
Just a note that I'm currently bisecting into this merge for a horrendous performance regression.

It makes my empty kernel build go from 22 seconds to 44 seconds, and makes a full kernel build enormously slower too.

I haven't finished the bisection, but it's now inside *just* this pull, so I can already tell that I'm going to revert something in here, because this has been making my merge window miserable.

You've been warned,

Linus

That was in response to the scheduler changes for Linux 6.8. Seeing a workload like code compilation take twice as long is rather surprising: while the Linux kernel lacks common and robust continuous integration (CI), it seems like the kernel developers responsible for the changes would have noticed such a dramatic change, especially if the code has been through linux-next and the like.
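For readers unfamiliar with the bisection Torvalds mentions, the idea is a binary search over the commit history for the first commit that shows the problem. Below is a minimal Python sketch of that idea, not of how git itself implements `git bisect`: it assumes a simple linear history, and the function name `first_bad_commit`, the `is_bad` callback, the commit labels, and the build timings are all made up for illustration.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, that the newest one is
    bad, and that every commit after the first bad one is also bad.
    """
    if not commits:
        raise ValueError("commits must not be empty")
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo]


# Hypothetical build times in seconds per commit (illustrative values only).
build_seconds = {"c1": 22, "c2": 22, "c3": 44, "c4": 44, "c5": 44}
commits = list(build_seconds)

# Treat any build slower than 30 seconds as "bad" (an arbitrary threshold).
culprit = first_bad_commit(commits, lambda c: build_seconds[c] > 30)
print(culprit)  # prints c3
```

Each step halves the remaining range, which is why even a large merge can be narrowed down to one commit in a handful of build-and-test cycles.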

A short time ago he added:
I guess it should come as no surprise that the result is

9c0b4bb7f6303c9c4e2e34984c46f5a86478f84d is the first bad commit

but to revert cleanly I will have to revert all of

b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
f12560779f9d ("sched/cpufreq: Rework iowait boost")
9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")

This is on a 32-core (64-thread) AMD Ryzen Threadripper 3970X, fwiw.

I'll keep that revert in my private test-tree for now (so that I have a working machine again), but I'll move it to my main branch soon unless somebody has a quick fix for this problem.

From that message it is interesting to see Linus Torvalds still rocking an AMD Ryzen Threadripper 3970X workstation. Back in 2020 Torvalds switched to Threadripper after 15+ years with Intel systems. It's a bit surprising that nearly four years later he's still relying on the Threadripper 3970X workhorse, considering the much faster performance now available, especially with Ryzen Threadripper 7000 series systems. In any event, the regression appears to stem from the CPUFreq schedutil governor changes.
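For those wondering whether their own system is even using schedutil, the active CPUFreq governor for each core is exposed through sysfs. The following is a small Python sketch that reads it, assuming the usual `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` layout; the function name `read_governors` is made up for this example, and systems without CPUFreq support will simply return an empty result.

```python
from pathlib import Path


def read_governors(cpu_root: Path = Path("/sys/devices/system/cpu")) -> dict[str, str]:
    """Map each CPU directory name (e.g. "cpu0") to its active CPUFreq governor."""
    governors = {}
    # The cpu[0-9]* pattern skips sibling directories such as cpufreq/ and cpuidle/.
    for gov_file in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = gov_file.parent.parent.name
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(f"{cpu}: {governor}")
```

A system affected by this kind of regression would be expected to report `schedutil` here, though whether a given machine actually regresses depends on more than the governor name.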


The CPUFreq schedutil governor performance estimation rework was authored by Linaro and aimed to deal with uclamp limits, but it seems something within it is causing issues. As of writing there have been no other responses or messages from Torvalds on the matter.

But with this being spotted early, by Torvalds himself, and with still over a week to go until Linux 6.8-rc1, it will hopefully be sorted out in short order and well before the Linux 6.8 stable release due out in March.