The BKL hasn't mattered in a long time; removing it was almost purely symbolic unless you were using one of the last few holdouts. So of course it would have no effect on a benchmark. Not sure where the people who can't be bothered to do any proper research, and think they know stuff, got the idea that BKL removal would affect any benchmarks.
Here's a link for anyone who really is as useless at research as everyone else here.
http://halobates.de/blog/p/56
As for the 200-line wonder patch, it also has nothing to do with scalability, unless one of the tests is to watch a video while compiling a kernel in a terminal, which is the only case the patch does anything for.
If I understand that patch right, responsiveness is improved through process grouping. CFS would normally allocate CPU time evenly between tasks: with, for example, 9 make instances and 1 vlc, vlc would get about 10% of the CPU while the 9 make instances would get the other 90%. With the new patch the 9 make instances are allocated CPU time as a group, so together they would get 50% and vlc would also get 50% (if it needs that much, of course). For that to work you need cgroups enabled in the kernel. The patch isn't supposed to give more performance, just to spread CPU resources evenly between groups and keep demanding processes from starving everything else.
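To put numbers on that, here is a rough sketch of the arithmetic only, not of the real scheduler. The function names and the group labels ("tty1", "desktop") are made up for the example, and it assumes every task wants as much CPU as it can get:

```python
from collections import defaultdict

def per_task_shares(tasks):
    """Every runnable task gets an equal slice of the CPU."""
    share = 1.0 / len(tasks)
    return {name: share for name, _group in tasks}

def per_group_shares(tasks):
    """CPU is split equally between groups, then equally inside each group."""
    groups = defaultdict(list)
    for name, group in tasks:
        groups[group].append(name)
    group_share = 1.0 / len(groups)
    return {
        name: group_share / len(members)
        for members in groups.values()
        for name in members
    }

# 9 make instances started from one terminal, 1 vlc started from the desktop.
tasks = [(f"make-{i}", "tty1") for i in range(9)] + [("vlc", "desktop")]

flat = per_task_shares(tasks)
grouped = per_group_shares(tasks)

print(f"vlc, per task:  {flat['vlc']:.0%}")     # 10%
print(f"vlc, per group: {grouped['vlc']:.0%}")  # 50%
```

Total CPU handed out is the same in both cases, which is why this shows up as responsiveness and not as benchmark throughput.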
Errrr.... http://en.wikipedia.org/wiki/Giant_lock
Kernel lock: the kernel locks out all threads except one, so only one thread at a time. Removing this kernel lock means that threads still need to lock, but thread management is no longer totally serial. Now if you have a single core, then no matter what you might hack together, only one process runs at a time anyway.
Now onto multiple cores; multiple threads at once.
Seems like a very simple conclusion to me?
Now if that's not the case then Linux really sucks balls at scaling...
David is right, the kernel has been progressively removing macro-locks over the last number of years. A few years ago, I know that SGI was looking at the BKL being taken during ioctls on their multi-pipe GPU systems.
As David has said, different subsystems now have a broad collection of finer-grained locks around the kernel calls being made to those subsystems. Removing the BKL will only affect some types of workloads. The workloads that may be affected would absolutely need to be multi-threaded (which further reduces the likelihood of seeing a benefit).
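A toy model of that point, with made-up subsystem names and call durations. It is not how the kernel accounts for anything, and it assumes there are enough cores to run one call per subsystem at the same time:

```python
from collections import defaultdict

def time_with_giant_lock(calls):
    # One lock for everything: every call waits for every other call.
    return sum(duration for _subsystem, duration in calls)

def time_with_subsystem_locks(calls):
    # One lock per subsystem: only calls into the same subsystem serialize.
    busy = defaultdict(int)
    for subsystem, duration in calls:
        busy[subsystem] += duration
    return max(busy.values())

# (subsystem, time spent holding the lock), arbitrary units.
mixed = [("fs", 4), ("net", 3), ("gpu_ioctl", 5), ("fs", 2)]
same = [("fs", 4), ("fs", 2)]

print(time_with_giant_lock(mixed), time_with_subsystem_locks(mixed))  # 14 6
print(time_with_giant_lock(same), time_with_subsystem_locks(same))    # 6 6
```

The second line is the case that matters for most benchmarks: if the workload is single-threaded, or all its threads hit the same lock anyway, the two numbers are identical.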
As Michael shows in the benchmark results in this article, the impact on the CPU-centric benchmarks is virtually nothing between these kernels.
Interestingly, it looks like the CPU topology allows Linux to scale better with HyperThreading. PC-BSD (FreeBSD) and Illumos (OpenSolaris) both consistently had decreases going from 6 cores to 6 cores + HyperThreading.
But a broad statement about scaling needs to have a context to get some meat. What workloads are you talking about scaling?
Linux hasn't had a single giant lock in a long, long time. It's had fine-grained locking since 2.2, and the BKL was only taken in a few places. Some of them were bad, but they've been removed over the last few years. For example, all ioctls used to take the BKL, and that was slowly fixed. The GPU drivers were one of the areas that lagged behind, but since we didn't really have much userspace parallelism going on it wasn't that noticeable.
Dave.
Well, more CPUs -> more compute power. Having multiple processes to schedule should result in:
(total_amount_of_processes + kernel_resource_per_different_kind_of_syscall) / (total_amount_of_cores +- (0.25 * extra_threads_per_core)) = good scaling.
0.25 means 25% efficiency with crap like HT and that's generous...