Changing One "If" To "While" Caused An Unexpected Shift In A Kernel Benchmark This Week

Written by Michael Larabel in Linux Kernel on 10 January 2021 at 02:02 PM EST.
Several months back, you may recall, we noted a Linux 5.9 kernel regression that was in turn bisected to code introduced by Linus Torvalds around page lock fairness. That was ultimately worked out by adding a control over page lock (un)fairness, which addressed the regressed workloads while remaining fair enough to satisfy his original change. This week, for Linux 5.11, Linus Torvalds has again altered the behavior. The change ended up causing a PostgreSQL database server performance regression, but fortunately the impact should be minimal and hopefully will not appear in any real-world situation.

Linus this week merged his own patch, "mm: make wait_on_page_writeback() wait for multiple pending writebacks". It comes as a fix for his original rewrite of the wait_on_page_bit_common() logic. There have been occasional reports of BUG_ON() assertions being triggered since that change, and Linus ended up uncovering a race condition that leads to the BUG_ON(). See the patch for all the technical details. Within the wait_on_page_writeback() function itself, though, the patch just changes an if statement to a while, which fixes the BUG_ON() assertion that was being hit.

But as an unexpected surprise, following that commit some of my test systems ended up seeing PostgreSQL performance, as measured by pgbench, drop by 5~10%. This has mostly been happening when firing up 100~250 PostgreSQL clients on lower-end systems: desktop-class Core i7/i9 or Ryzen boxes. So fortunately it is not a common combination of hardware class and the amount of PostgreSQL database load such hardware would normally be handling. On larger servers this regression isn't appearing.

After confirming the behavior on various systems and trying an experimental patch from Linus, at this point it is being chalked up as an isolated issue. As summed up by Linus:
I think that something in that PostgreSQL benchmark is perhaps slightly "modal", and hits some particular pattern that is worse depending on random timing luck.

It might be the exact IO patterns, but it might easily also be just scheduling patterns or even just CPU frequency patterns.

IOW, I don't think the performance results are necessarily reflecting truly how efficiently the work gets done, but basically some almost random fluctuation depending on timing, and then certain patches might cause changes almost incidentally.

We've often seen that on various big machines where the exact NUMA placement of data ends up being one of the "modal" things. Or just code generation and layout changing the I$ footprint, and then something that mostly fit in L1 (or L2) suddenly doesn't quite fit any more, and you get a big change from what is essentially just the same code moving around a bit.

The fact that it happens on just particular machines tends to be part of the pattern.

For now the current writeback bit behavior is being kept, as ideally this won't cause any issues outside of PostgreSQL's pgbench. And then again, where I am seeing it come up is on consumer hardware while heavily taxing PostgreSQL, an unlikely combination.

I am just mentioning this for anyone else seeing differing behavior out of PostgreSQL/pgbench on Linux 5.11-rc3 or later. So far I haven't seen any other workloads change from this single-line code change, but it is important to keep in mind if moving to Linux 5.11-rc3 for testing when it's released later today.
