Taking a break from our graphics excitement last week with the release of AMD's 8.42.3 Display Driver, we have finished our largest (and most time-consuming) Linux performance comparison to date. We have taken the last 12 major kernel releases, from Linux 2.6.12 to Linux 2.6.23, built them from source, and set out on a benchmarking escapade. This testing also includes the Linux 2.6.24-rc1 kernel. From these benchmarks you can see how Linux kernel performance has matured over the past two and a half years.
Most benchmarks vary a bit from run to run under the same conditions. E.g. if I run hdparm -t /dev/sda three times in a row I get:
Timing buffered disk reads: 190 MB in 3.00 seconds = 63.30 MB/sec
Timing buffered disk reads: 190 MB in 3.02 seconds = 62.82 MB/sec
Timing buffered disk reads: 194 MB in 3.02 seconds = 64.29 MB/sec
That is a bigger variation than I might see between different kernels.
If you could indicate the normal variation on the graphs, that would make them a lot more useful.
A rigorous method would be to do many runs and show the standard deviation, but even the min and max from 3 or 4 runs would be better than nothing.
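The suggestion above can be sketched in a few lines of shell. This is a minimal illustration, not part of the article's methodology: it feeds the three hdparm readings quoted earlier through awk to get min, max, mean, and sample standard deviation. In practice you would pipe in the MB/sec figures from repeated runs instead of hard-coded values.

```shell
# Feed several MB/sec readings (here: the three hdparm figures quoted above)
# through awk to summarise run-to-run variation.
stats=$(printf '%s\n' 63.30 62.82 64.29 | awk '
    { s += $1; ss += $1 * $1
      if (n == 0 || $1 < min) min = $1
      if (n == 0 || $1 > max) max = $1
      n++ }
    END { m  = s / n
          sd = sqrt((ss - n * m * m) / (n - 1))   # sample standard deviation
          printf "min=%.2f max=%.2f mean=%.2f sd=%.2f", min, max, m, sd }')
echo "$stats"
```

For these three readings the spread (sd of about 0.75 MB/sec) is indeed on the order of the differences between kernels in the graphs, which is exactly why error bars would help.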
Good article, but what it lacks are conclusions. You've done a huge job. The part reviewing the kernels' features is awesome, but... why did you choose these particular tests? The kernel is a huge system. Which subsystems did you want to benchmark? From your tests I can draw one conclusion -- with the addition of a bunch of features, the kernel doesn't run any slower than the previous versions.
I like this benchmark, but I would like it even more if you could compare it to Windows XP and Windows Vista on the same hardware with roughly the same benchmarks. That would give a real idea of whether Linux is actually faster than Windows.
Damn, isn't this frustrating? Compiling, benchmarking, evaluating for a few days just to find out that nothing remarkable has happened. I was quite disappointed. Of course a bunch of new features were added, but no speed changes, either positive or negative.
Most of the tests were OS-agnostic (ramspeed, lame), and those that did test kernel aspects (gunzip, hdparm) only tested the I/O subsystem.
As long as you run a single process that isn't really system-call-intensive and/or thread/process-intensive, the kernel isn't really playing a role in the game, producing close-to-identical results across the board.
On the other hand, if you run simultaneous benchmarks (read: encoding an MP3 file while running UT2K4, running hdparm while running gunzip, etc.), small changes in the kernel's scheduler, I/O layers, and/or driver APIs should increase the variation in the benchmark scores.
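The pattern being described could be sketched like this. `run_together` is a hypothetical helper, not anything from the article: it launches each workload in the background and waits for all of them, so contention between them shows up in the combined wall time. The hdparm/gunzip pair is shown commented out because it needs root and a real compressed file.

```shell
# Hypothetical helper: run all given commands at once and wait for every one,
# so the workloads compete for the scheduler and I/O layers simultaneously.
run_together() {
    for cmd in "$@"; do
        sh -c "$cmd" &
    done
    wait    # returns once every background workload has finished
}

# e.g. (needs root and a real test file, so shown commented out):
# time run_together 'hdparm -t /dev/sda' 'gunzip -c big.gz > /dev/null'
run_together 'true' 'true' && echo "all workloads finished"
```

Comparing the wall time of the combined run against the sum of the individual runs is one simple way to see whether scheduler or I/O changes between kernels actually matter under load.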
I agree. Although it's promising that the benchmarks don't show a steady incline that would imply kernel bloat, they also don't stress the kernel's scheduler and I/O subsystems.
I am also curious about the configuration of each test kernel. How did you choose to configure the kernels? Did you use defconfig or did you hand-tune them? I ask because much of the new support in these kernels is turned off by default and needs to be enabled, which might also explain the flat curves.
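For reference, these are the two standard kbuild ways the configs could have been produced (the article doesn't say which was used); the choice matters because defconfig leaves much of the new code disabled.

```shell
# Inside an unpacked kernel source tree (both are standard kbuild targets):
make defconfig      # stock defaults for the architecture; many new subsystems stay off

# ...or carry one tuned .config forward from release to release:
cp /path/to/previous/.config .config    # placeholder path for illustration
make oldconfig      # prompts only for options added since that config was made
```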
Another interesting set of tests would be of the virtualization subsystems in the kernel, to see how they have improved over time. Examples would be to create a VM and run it on various kernels to see the performance of the host kernel, and then lock down a host kernel and run various versions of the test kernel in a VM. Some of the newer features in the kernel (i.e. tickless timers and all the virtualization work) would hopefully flex their muscles in those tests.