GCC 4.7 has improved LTO and now can apparently link-time optimize the kernel. Maybe benchmark-worthy?
In some cases (like properly built HTTP servers) there is a huge amount of work in kernel space; in fact, the more the better.
In any case, a smaller kernel = better cache usage. Measuring this would be nice too.
Sys time is huge in both the desktop and server cases. In some cases better speed = lower latency = fewer user-space threads blocked = more performance.
On my server, the kernel time is always at more than 25%.
I'm very interested in benchmarks and details on this.
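For anyone who wants to check the kernel-time share on their own machine before and after an LTO build, here is a rough sketch. It assumes a Linux system and the usual field order of the aggregate "cpu" line in /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal). Counting system + irq + softirq as "kernel time" is my own choice, and the function names are made up for illustration.

```python
import time


def read_cpu_fields(stat_text):
    """Return the first 8 counters of the aggregate 'cpu' line as ints."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            # Assumed order: user nice system idle iowait irq softirq steal
            return [int(v) for v in parts[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def kernel_share(before, after):
    """Fraction of elapsed CPU time spent in system + irq + softirq."""
    delta = [a - b for a, b in zip(after, before)]
    total = sum(delta)
    if total <= 0:
        return 0.0
    # Assumption: "kernel time" means system + irq + softirq.
    kernel = delta[2] + delta[5] + delta[6]
    return kernel / total


def measure(interval=5.0, path="/proc/stat"):
    """Sample the counters twice and return the kernel share in between."""
    with open(path) as f:
        first = read_cpu_fields(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = read_cpu_fields(f.read())
    return kernel_share(first, second)
```

Calling measure() while the server is under its normal load gives a number that can be compared between kernel builds. Note that it is a share of all CPU time including idle, so the load has to be the same in both runs for the comparison to mean anything.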