Linux 6.8 Merges Fix For Recent Performance Regression Spotted By Linus Torvalds

    Phoronix: Linux 6.8 Merges Fix For Recent Performance Regression Spotted By Linus Torvalds

    Last week Linux creator Linus Torvalds spotted a bad performance regression in the early Linux 6.8 kernel state that was doubling his kernel build times. Since then kernel developers have been analyzing the issue and devising a fix. A few minutes ago the fix worked its way into the mainline kernel...

  • #2
    Yay, so useless schedutil governor now once again is a bit less terrible...


    • #3
      "Fix"


      • #4
        There's something wrong with this story.

        Prior to 6.8 the kernel was presumably correctly choosing the frequency with the line as it was:

        return policy->cur;

        This line is returning the value stored in cur.

        However something changed and now that line is returning a value that is too low so the "fix" is to divide the original value by 4, add it to the original value and return that instead.
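
        For illustration, the arithmetic described above (take cur, divide it by 4, add the result back) can be sketched in C as follows. This is a minimal sketch rather than the kernel's actual code; the function name freq_with_margin and the kHz sample values are assumptions made up for the example.

```c
/*
 * Hypothetical helper (not a kernel function): return the current
 * frequency plus a 25% margin, i.e. cur + cur / 4.
 */
unsigned long freq_with_margin(unsigned long cur_khz)
{
    /* For an unsigned value, shifting right by 2 is division by 4,
     * rounded down. */
    return cur_khz + (cur_khz >> 2);
}
```

        With an assumed current frequency of 2,000,000 kHz this returns 2,500,000 kHz.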

        To me that looks like a kludgy "solution", a half-assed way of dealing with the issue.

        Instead of adding tape to the problem, how about identifying why return policy->cur; is suddenly resulting in a 25% lower margin?

        The other thing that jumps out at me is that according to Linus the "regression" resulted in compile times going from 22 seconds to 44 seconds, 2 times longer.

        How does adding 25% to the value of cur result in a doubling of performance?

        Is it doubling the frequency?

        I suspect we haven't heard the last about this issue and more than likely it will rear its ugly head again.


        • #5
          Originally posted by sophisticles
          How does adding 25% to the value of cur result in a doubling of performance?

          Is it doubling the frequency?
          This code controls whether the kernel thinks it needs to request higher clocks from the cpu or not. It's not just 1 step up - it could repeatedly do more and more steps up by getting called repeatedly.

          So yeah, the lowest cpu speed may be half of what the highest frequency is. But that depends on the cpu.

          I won't pretend like I understand what the code is doing, but I think it's telling the kernel to start requesting higher frequencies once it hits 80% busy instead of waiting for 100%. (100% / 1.25 from the patch) So the .25 is a bit of a random number, but that's the reasoning *i think*.
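
          To make the 80% reasoning concrete, here is a toy model in C. It assumes, as a simplification, that the requested frequency is a reference frequency scaled by utilization (ref * util / cap) and that the reference is cur plus the 25% margin. All function names, the utilization scale, and the 1 GHz to 2 GHz range are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper: current frequency plus a 25% margin. */
unsigned long ref_with_margin(unsigned long cur_khz)
{
    return cur_khz + (cur_khz >> 2);
}

/* Simplified model (an assumption): scale the reference by util / cap. */
unsigned long next_freq(unsigned long ref_khz, unsigned long util,
                        unsigned long cap)
{
    assert(cap != 0);
    return ref_khz * util / cap;
}

/* Returns 1 if the model would ask for more than the current frequency. */
int requests_increase(unsigned long cur_khz, unsigned long util,
                      unsigned long cap)
{
    return next_freq(ref_with_margin(cur_khz), util, cap) > cur_khz;
}

/*
 * Count how many repeated calls it takes to climb from start_khz to
 * max_khz at a fixed utilization. Returns -1 if the frequency stops
 * rising before reaching max_khz (or after 64 calls).
 */
int steps_to_max(unsigned long start_khz, unsigned long max_khz,
                 unsigned long util, unsigned long cap, int use_margin)
{
    unsigned long cur = start_khz;
    int steps;

    for (steps = 0; steps < 64; steps++) {
        unsigned long ref, next;

        if (cur >= max_khz)
            return steps;

        ref = use_margin ? ref_with_margin(cur) : cur;
        next = next_freq(ref, util, cap);
        if (next > max_khz)
            next = max_khz;      /* clamp to the hardware maximum */
        if (next <= cur)
            return -1;           /* no further increase is requested */
        cur = next;
    }
    return -1;
}
```

          Under these assumptions, requests_increase only becomes true once util exceeds 80% of cap, and a fully busy CPU climbs from 1,000,000 kHz to 2,000,000 kHz in four calls (1.25x per call, clamped at the maximum). Without the margin, the same model never asks for more than cur, so it stays at the starting frequency.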
          Last edited by smitty3268; 18 January 2024, 11:01 PM.


          • #6
            Originally posted by smitty3268

            I think it's telling the kernel to start requesting higher frequencies once it hits 80% busy instead of waiting for 100%. (100% / 1.25 from the patch) So the .25 is a bit of a random number, but that's the reasoning *i think*.
            This is a much better explanation than the one above the code if it's correct.


            • #7
              Originally posted by smitty3268

              This code controls whether the kernel thinks it needs to request higher clocks from the cpu or not. It's not just 1 step up - it could repeatedly do more and more steps up by getting called repeatedly.

              So yeah, the lowest cpu speed may be half of what the highest frequency is. But that depends on the cpu.

              I won't pretend like I understand what the code is doing, but I think it's telling the kernel to start requesting higher frequencies once it hits 80% busy instead of waiting for 100%. (100% / 1.25 from the patch) So the .25 is a bit of a random number, but that's the reasoning *i think*.
              If the threshold is 80% then 25 is far from random, it's 20 to get up to 100 and the 5 added in for safety so we can be sure that the value will be > 100 to request an increase.


              • #8
                Originally posted by aufkrawall
                Yay, so useless schedutil governor now once again is a bit less terrible...
                Interesting, it has been my choice of governor ever since it was merged. No bad hiccups like ondemand when a sudden high-performance requirement appears, yet no crazy fan noise like performance due to always running at max frequency. It swiftly goes up and down as necessary.


                • #9
                  Originally posted by leledumbo
                  Interesting, it has been my choice of governor ever since it was merged. No bad hiccups like ondemand when a sudden high-performance requirement appears, yet no crazy fan noise like performance due to always running at max frequency. It swiftly goes up and down as necessary.
                  Test stutter/frame times in games/video/browser with different loads, it always caused me some noticeable impact at some point.
                  With some amd_pstate driver (afair it was active that allowed it) it even caused higher power draw vs. performance. For now, I ended up with pstate=guided and powersave governor for the 5700X. It still shows lowest idle clocks for cores while not causing any apparent issues.


                  • #10
                    Originally posted by aufkrawall
                    Test stutter/frame times in games/video/browser with different loads, it always caused me some noticeable impact at some point.
                    With some amd_pstate driver (afair it was active that allowed it) it even caused higher power draw vs. performance. For now, I ended up with pstate=guided and powersave governor for the 5700X. It still shows lowest idle clocks for cores while not causing any apparent issues.
                    Can you elaborate on this?
                    Did amd_pstate=active + performance mode cause stuttering?
