Ubuntu 22.04 LTS Has A Change On The Way For Systemd-OOMD Being Kill-Happy With Apps

  • Ubuntu 22.04 LTS Has A Change On The Way For Systemd-OOMD Being Kill-Happy With Apps

    Phoronix: Ubuntu 22.04 LTS Has A Change On The Way For Systemd-OOMD Being Kill-Happy With Apps

    This month Ubuntu developers have been trying to figure out how best to deal with systemd-oomd on Ubuntu 22.04 LTS killing applications like Firefox during high memory/swap use, which leads to a poor user experience when desktop users are unaware of the situation and suddenly find their software killed...


  • #2
    A minor edit is needed. The sentence
    The Ubuntu issue has been tracked as systemd-oomd frequently kills Firefox and Visual Studio Code.
    is duplicated.



    • #3
      Just a heads up: in some workloads this is not a great change. I've been testing default 22.04 in a 4GB VM. With the changes in proposed, the session now hangs, as if there were no user space OOM killer, when I consume all RAM with a browser (Firefox or Chrome) using trackthis.link to load many pages. Previously, the killer would kill the browsers or, in hard-to-replicate situations, the entire session.
      Now, under this stress test, the entire session hangs and the VM has to be rebooted, so it's as good as having the session killed. My two-core VM gets to a CPU load average > 70 because it's trying so hard to swap. If I add zram (via zram-config) the situation is basically the same, although CPU load becomes very high more quickly.

      earlyoom never kills the session. It doesn't operate on cgroups but on processes, so it ends up killing tabs but leaving the browser alone. It's a little better.
      The best solution is to install a bigger swap and use zswap. Fedora VMs on the same hardware perform much better. My default Fedora install in this type of VM has a 4 GB zram swap.

      I also tried psi-notify, which is packaged as a PPA. It monitors pressure reporting and sends a desktop notification. It is very good at detecting CPU load, but never once has it told me I am approaching memory pressure. The browsers claim to have low-memory management, but the default settings are way too slow to be useful if there is a sudden spike in memory requirements (which I simulate with the automated loading of tabs by trackthis, but it could also be a launch of LibreOffice).
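
For anyone who wants to look at the raw numbers a tool like psi-notify works from, here is a minimal sketch that parses the kernel's pressure stall information file for memory. It assumes a kernel built with PSI support that exposes /proc/pressure/memory. The function names, the sample text, and the 10% threshold are illustrative assumptions, not psi-notify's actual logic or defaults.

```python
from typing import Dict

# Sample contents in the format the kernel uses for /proc/pressure/memory.
# The numbers are made up for illustration.
SAMPLE = (
    "some avg10=12.50 avg60=3.20 avg300=0.80 total=1234567\n"
    "full avg10=4.00 avg60=1.00 avg300=0.20 total=234567\n"
)


def parse_pressure(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI text into {"some": {...}, "full": {...}}."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Each field looks like "avg10=12.50"; split on the first "=".
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def under_memory_pressure(text: str, threshold: float = 10.0) -> bool:
    """Return True if the 10-second 'some' average meets the threshold.

    The threshold is a hypothetical value chosen for this sketch.
    """
    stats = parse_pressure(text)
    return stats.get("some", {}).get("avg10", 0.0) >= threshold


if __name__ == "__main__":
    try:
        with open("/proc/pressure/memory") as fh:
            text = fh.read()
    except OSError:
        # Fall back to the sample when PSI is unavailable on this system.
        text = SAMPLE
    print(parse_pressure(text), under_memory_pressure(text))
```

Polling this in a loop during the trackthis stress test would show whether the "some" and "full" averages actually rise before the session hangs, independent of any notifier.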

      There is some discussion here: https://bugs.launchpad.net/ubuntu/+s...d/+bug/1980169 ... the medium-term solution sounds promising. We really need some notification that the session is under memory pressure. That should be possible right now: nothing is being proposed to change the memory pressure statistics, which implies there is nothing wrong with them, but so far I have not found anything useful.




      • #4
        It is quite interesting how irresponsible Red Hat and Canonical are acting ...
        what do they think - that no Unix expert is still alive?
        Why was deleting processes not necessary in the late 1990s - and now
        with much more memory capacity one thinks it would be a good idea
        to just deliberately kill processes ...
        Was 1st April extended to half a year?
        They should provide such services on their servers ...
        But at least they bundled it with systemd - so one may have a larger target to hate ...
        Similar to GNOME ... or Wayland ... or ...
        We should just kill all programs with reasonable functionality ...
        just a browser to get all the advertisements ... cough!
        Red Hat never believed in GNU/Linux as desktop - and proclaimed it once in a while
        when it was used by scientists with great satisfaction.
        Canonical may have got money - was it Microsoft? Apple? Google? ...
        One could guess who thinks that Linux being used for the desktop would harm their business ...
        Maybe the solution will come in a few years: "Faites vos jeux ... rien ne va plus!"



        • #5
          How about not enabling the program killer by default?

          It isn't like this is actually needed by anyone and anyone who does need it would probably be better off by having swap or having more swap added.

          Just imagine you're a pilot in your Ubuntu-Powered 747 Max and systemd-oomd sees 747-gps-autopilot using up all the memory and kills it. Why is that supposed to be a good service? Why can't it see that there's 546GB free on the NVMe and add some swap?

          Which service would y'all rather have by default?

          "I see you're running out of memory, here's some swap."

          or

          "Was that wrong? Should I have not done that? I tell you I gotta plead ignorance on this thing because if anyone had said anything to me at all when I first started here that that sort of thing was frowned upon, you know, cause I’ve worked in a lot of offices and I tell you people do that all the time."



          • #6
            With or without systemd-oomd, the way Linux treats swap is just ridiculous. It fills up real memory with garbage cache, then when a program actually needs more, it's like OH NO I DIDN'T SEE THAT COMIN, and uses a few hundred megs of swap. And if you disable swap altogether, things will just randomly crash with OOM. It may take a couple of days, or weeks even, but it WILL happen eventually. I tried to get rid of swap every couple of years on the servers I manage, but nope, not gonna work. You just gotta give it the swap it deserves.

            So the recipe is this: no matter how much ram you have, be it 64 gigs, DOESN'T MATTER. Your idiot system STILL wants JUST 300 MB SWAP with it. Because reasons.

            I don't think this should be rocket surgery. It's basically this:

            - hey kernel, I need more ram!
            - too bad, I don't have any left!
            - how about you throw out just some of that 30GB cache?
            - good idea, here's your free ram!

            But even after decades, Linux still can't get it right.

            I'm not sure if Windows is guilty of the same thing or not, because I don't have to care about it. Why? BECAUSE IT DOESN'T NEED A SEPARATE PARTITION TO WORK. It does whatever it wants, and apparently it's working just fine, because on Windows I never have OOM kills. Never, ever.

            So yeah, in this regard, Linux is essentially the worst of both worlds.
            Last edited by anarki2; 30 June 2022, 07:17 PM.



            • #7
              Originally posted by skeevy420
              Just imagine you're a pilot in your Ubuntu-Powered 747 Max and systemd-oomd sees 747-gps-autopilot using up all the memory and kills it. Why is that supposed to be a good service? Why can't it see that there's 546GB free on the NVMe and add some swap?

              Which service would y'all rather have by default?
              Well, if such a high-availability service is properly designed, I'd want it to kill the process which is obviously misbehaving so it can be restarted from a safe checkpoint, similar to how programs don't have to segfault... it's just better than letting them keep running to corrupt things, potentially hang the kernel, or even accidentally induce something to set a peripheral control register out of spec (i.e. killer poke).



              • #8
                Originally posted by anarki2
                With or without systemd-oomd, the way Linux treats swap is just ridiculous. It fills up real memory with garbage cache, then when a program actually needs more, it's like OH NO I DIDN'T SEE THAT COMIN, and uses a few hundred megs of swap. And if you disable swap altogether, things will just randomly crash with OOM. It may take a couple of days, or weeks even, but it WILL happen eventually. I tried to get rid of swap every couple of years on the servers I manage, but nope, not gonna work. You just gotta give it the swap it deserves.

                So the recipe is this: no matter how much ram you have, be it 64 gigs, DOESN'T MATTER. Your idiot system STILL wants JUST 300 MB SWAP with it. Because reasons.

                I don't think this should be rocket surgery. It's basically this:

                - hey kernel, I need more ram!
                - too bad, I don't have any left!
                - how about you throw out just some of that 30GB cache?
                - good idea, here's your free ram!

                But even after decades, Linux still can't get it right.

                I'm not sure if Windows is guilty of the same thing or not, because I don't have to care about it. Why? BECAUSE IT DOESN'T NEED A SEPARATE PARTITION TO WORK. It does whatever it wants, and apparently it's working just fine, because on Windows I never have OOM kills. Never, ever.

                So yeah, in this regard, Linux is essentially the worst of both worlds.
                That's not how it works. Linux will gladly drop its caches and hand that RAM over to any application that wants it. Up to a point, which is configurable via the sysctl knob vm.vfs_cache_pressure. Set it to something like 500 and Linux will aggressively drop anything it can under pressure. The default is more reasonable: it tries to keep some cache and swaps out unused pages instead. The algorithm isn't too terrible if you actually have some swap.
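
To check what the knob is currently set to before changing it, something like the following sketch could be used. It assumes a Linux system exposing /proc/sys/vm/vfs_cache_pressure; the helper names are made up for illustration, and 500 is simply the value suggested above, not a recommendation. Note that the kernel documentation describes this knob as controlling reclaim of directory and inode caches specifically, with 100 as the default.

```python
from pathlib import Path

# Standard procfs location of the sysctl; reading it does not require root.
KNOB_PATH = Path("/proc/sys/vm/vfs_cache_pressure")


def parse_knob(text: str) -> int:
    """Convert the raw file contents (e.g. "100\n") to an integer."""
    return int(text.strip())


def read_knob(path: Path = KNOB_PATH) -> int:
    """Read the current vm.vfs_cache_pressure value."""
    return parse_knob(path.read_text())


def sysctl_line(value: int) -> str:
    """Format a line suitable for a sysctl configuration file."""
    if value < 0:
        raise ValueError("vfs_cache_pressure must be non-negative")
    return f"vm.vfs_cache_pressure = {value}"


if __name__ == "__main__":
    try:
        print("current:", read_knob())
    except OSError:
        print("knob not available on this system")
    # Print the line one would persist to try the value from the post.
    print(sysctl_line(500))
```

The formatted line could be placed in a file under /etc/sysctl.d/ to persist the setting, or the value applied at runtime with `sysctl -w` as root.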

                Regarding Windows: try disabling the page file and see how it goes. It'll keep OOM-crashing your programs with half your RAM free, not even used as cache. Completely unused. So the solution is to just have some swap, ideally on some low-latency storage.



                • #9
                  Originally posted by skeevy420
                  How about not enabling the program killer by default?

                  It isn't like this is actually needed by anyone and anyone who does need it would probably be better off by having swap or having more swap added.

                  Just imagine you're a pilot in your Ubuntu-Powered 747 Max and systemd-oomd sees 747-gps-autopilot using up all the memory and kills it. Why is that supposed to be a good service? Why can't it see that there's 546GB free on the NVMe and add some swap?

                  Which service would y'all rather have by default?

                  "I see you're running out of memory, here's some swap."

                  or

                  "Was that wrong? Should I have not done that? I tell you I gotta plead ignorance on this thing because if anyone had said anything to me at all when I first started here that that sort of thing was frowned upon, you know, cause I’ve worked in a lot of offices and I tell you people do that all the time."
                  Canonical has always been extremely cynical about Linux and its usability. "Oh, what's the most popular distro? Debian? Okay just slap a fancy theme over that and tell people it's Linux for humans." When not being cynical, they were downright delusional with hopelessly aimless and overly optimistic projects like Unity, Mir, Upstart, whatever other useless crap they churned out for the sake of just being different. None of those projects were even particularly innovative or user friendly or even developer friendly, just the same thing as everything else, but different. Meanwhile they dumb down Linux, not in a user friendly way, more just, as a way to mimic what enterprise software does. One stable version, once it's out of date, well, screw you, reinstall the new one. I can't tell if they're lazy, stupid, or actively malicious, or maybe some weird mix of all three, which is usually how most scams work, much like Kickstarter and Crypto scams. Start off with a good idea, reality slaps you in the face, then just desperately try to hold everything together while it's obvious to everyone that it was never going to work from the start.
                  systemd is an actual project trying to innovate Linux and bring it more up to spec with other modern operating systems, a far cry from anything Canonical has ever managed to accomplish. So, when they're not trying to replace completely functioning software that doesn't even need to be replaced (compared to all the stuff on Linux that desperately did need replacing but they never acknowledged), they're just misconfiguring it instead. It reminds me of when I was a younger developer and thought "it feels like more work to learn how other software works, I'll just make my own! That's the same amount of effort, right?" No, no it's not.
                  Canonical is a wanna-be Apple that was just born in the wrong era, an era where the computing industry already made all of its mistakes, and Canonical foolishly thought it could make those mistakes again, except the rest of the computing world (not just other OSes but even other Linux distros) has evolved far beyond the mistakes Canonical has made and continues to make. At this stage they're just slipping on the same banana peel over and over as fewer and fewer people use Ubuntu but more people openly criticize it.
                  I often see people call lootbox-riddled AAA games "cynical" and that's exactly how I feel about Canonical. They don't care about Linux or making it usable, they just want to "feel" like they're making it usable.



                  • #10
                    Originally posted by ssokolow

                    Well, if such a high-availability service is properly designed, I'd want it to kill the process which is obviously misbehaving so it can be restarted from a safe checkpoint, similar to how programs don't have to segfault... it's just better than letting them keep running to corrupt things, potentially hang the kernel, or even accidentally induce something to set a peripheral control register out of spec (i.e. killer poke).
                    I will admit to never having run a server, but I've run Linux on many boxes and raspberry pis for years, often without swap, and I've never had this behavior. I wonder if it's a problem in the distro you use?
