Thread: Linux 2.6.38 Kernel Multi-Core Scaling

  1. #21


    Quote: Originally posted by V!NCENT
    It seems like ReiserFS still issues BKLs.
    Nobody uses it for serious computing, I guess.

    Btw, see post #25 at https://bbs.archlinux.org/viewtopic.php?id=102119&p=1

    2.6.35 with Nick Piggin's patches shows a 22% improvement in sysbench on a 4-core machine.

  2. #22

    From looking at those benchmarks, I'd say the Linux kernel scales very well. It's the tools being used that don't scale well. Look at c-ray and pts: they scale very well. I guess it all depends on how much work and locking the tool has to do. More complicated tools scale worse, but the kernel is not the bottleneck here.

  3. #23
    Join Date
    Aug 2009
    Posts
    2,264


    Is there no way to benchmark the kernel by running all the benchmarks at once?

  4. #24
    Join Date
    Jul 2010
    Posts
    69

    These benchmarks show nothing. They are not testing the kernel at all. And what do you mean by "4 cores", "6 cores", etc.? Were the cores disabled in the BIOS somehow? I also assume that when you enabled or disabled them in the BIOS, you also changed the options for all of the above benchmarks to use more threads, right?

  5. #25
    Join Date
    May 2009
    Location
    Richland, WA
    Posts
    134

    I would be interested to see if transparent hugepage support on 2.6.38 makes a difference in multicore performance. It should reduce bus traffic and thus contention for the bus, upping performance.

  6. #26
    Join Date
    Jun 2006
    Posts
    311


    Quote: Originally posted by baryluk
    These benchmarks show nothing. They are not testing the kernel at all. And what do you mean by "4 cores", "6 cores", etc.? Were the cores disabled in the BIOS somehow? I also assume that when you enabled or disabled them in the BIOS, you also changed the options for all of the above benchmarks to use more threads, right?
    Could you explain this further?

    The 1/2/4/6/6+HT configurations are set in the BIOS, although I would expect the cores could be hot-unplugged in the kernel as well.
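
    For anyone wanting to try the kernel route instead of the BIOS, here is a minimal sketch in Python of the sysfs CPU hotplug files. It assumes a kernel built with CPU hotplug support and root privileges; the helper names are made up for illustration, and on many x86 systems cpu0 has no `online` file and cannot be taken offline.

```python
# Sketch: take CPU cores offline/online through the Linux sysfs hotplug files.
# Helper names are hypothetical; writing to these files needs root.
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0-3,5" into [0, 1, 2, 3, 5]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_online_path(cpu, root=SYSFS_CPU_ROOT):
    """Path of the per-CPU control file, e.g. .../cpu2/online."""
    return f"{root}/cpu{cpu}/online"


def set_cpu_online(cpu, online, root=SYSFS_CPU_ROOT):
    """Write 1 (bring online) or 0 (take offline) to the control file."""
    with open(cpu_online_path(cpu, root), "w") as f:
        f.write("1" if online else "0")


def online_cpus(root=SYSFS_CPU_ROOT):
    """Read the list of CPUs the kernel currently has online."""
    with open(f"{root}/online") as f:
        return parse_cpu_list(f.read())
```

    To emulate a 4-core run on a 6-core box you would call `set_cpu_online(4, False)` and `set_cpu_online(5, False)`, then check `online_cpus()` before starting the benchmark. This is not necessarily how the article's configurations were produced; those were set in the BIOS.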

    There is a set of benchmarks that are single-threaded and a set that are multi-threaded. I believe that all of the multi-threaded ones are configured relative to the number of cores (either explicitly or implicitly within the code).

    Note that the numbers are normalized. So you can see that there is reasonable scalability (somewhere between 60% and 90% of the number of CPU cores). For the highly parallelized benchmarks, it's nearly 100%. HyperThreading doesn't give the same gain because the cores are already maxed out, so there isn't as much to gain from running a second thread on the same core.
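
    To make the normalization concrete, here is a small sketch of how speedup and per-core efficiency can be derived from raw results. It assumes a higher-is-better throughput metric; the function name and the sample numbers are invented for illustration and are not taken from the article.

```python
def scaling_report(throughput_by_cores):
    """Map core count -> (speedup vs. 1 core, efficiency per core).

    Assumes a higher-is-better metric; for time-based results,
    pass 1 / elapsed_time instead.
    """
    baseline = throughput_by_cores[1]
    report = {}
    for cores in sorted(throughput_by_cores):
        speedup = throughput_by_cores[cores] / baseline
        report[cores] = (speedup, speedup / cores)
    return report


# Invented sample numbers, purely illustrative.
sample = {1: 100.0, 2: 190.0, 4: 360.0, 6: 480.0}
for cores, (speedup, efficiency) in scaling_report(sample).items():
    print(f"{cores} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

    An efficiency of 100% means perfectly linear scaling; the 60-90% range mentioned above corresponds to efficiencies of 0.6 to 0.9.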

    If you contrast this Linux-on-Linux comparison with the Linux-versus-other-*nix comparison at http://www.phoronix.com/scan.php?pag...lti_os_scaling, you will see that Linux does scale reasonably well compared to the other OSes. However, the delta between the .35 and .38 kernels was minimal with respect to scalability.

    Note that the link at the bottom of the article also points to the absolute performance numbers, which realistically show minimal change between kernels.

  7. #27
    Join Date
    Jul 2010
    Posts
    69


    Quote: Originally posted by mtippett
    Could you explain this further?
    Oh. For some reason I had not seen "Timed PHP compilation"; this one actually shows something: the scalability of the filesystem and VFS layer. Indeed, we see big differences here. The rest test just the CPU scheduler, which essentially scales very well by design, even in its simplest implementation. That is particularly true in these benchmarks, where everything just computes something using one thread per core: in such a case the scheduler essentially assigns the threads once and then does almost nothing. In these cases, sublinear scalability is purely an effect of the application side, not the kernel: you cannot parallelize everything to get linear scalability.
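
    The point about application-side limits can be illustrated with Amdahl's law, which is a standard model and not something the benchmarks themselves report. A minimal sketch, where the parallel fraction values are assumptions chosen only for illustration:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work runs in parallel.

    parallel_fraction is the share of single-core run time that can be
    spread across cores (0.0 to 1.0); the rest stays serial.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative values only: how a 6-core ceiling moves with the serial part.
for fraction in (0.50, 0.90, 0.99):
    print(f"{fraction:.0%} parallel -> at most {amdahl_speedup(fraction, 6):.2f}x on 6 cores")
```

    Even with an ideal scheduler, a program that is 90% parallel cannot exceed roughly 4x on 6 cores under this model, so sublinear results alone say little about the kernel.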

    For a real scalability benchmark you need to test more than just CPU-intensive applications. It is more about running lots of threads, using lots of network connections, forwarding packets, having lots of files opened by multiple processes and threads, or a single file opened by multiple threads and processes, and mixing all of these under high load (with the number of threads much greater than the number of cores), etc. In other words, somewhere we have the potential for problems in resource sharing. In these tests we do not have any resource sharing at the kernel level: each thread uses a different core and shares nothing that would prevent it from running at full speed (from the kernel's perspective; in userland it will still have some mutexes and barriers, which a properly designed parallel program should minimize).
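
    As a rough sketch of the kind of workload meant here, the snippet below has many threads create, write, and delete small files in one shared directory, so that they contend on the same kernel-side structures. Everything in it (names, file size, counts) is an assumption for illustration; Python adds its own overhead, so a serious test would be written in C and would also cover sockets and shared files.

```python
import os
import tempfile
import threading
import time


def _churn(directory, worker_id, files_per_thread, counts):
    """Create, write, and delete small files; every step goes through the VFS."""
    done = 0
    for i in range(files_per_thread):
        path = os.path.join(directory, f"w{worker_id}-{i}.tmp")
        with open(path, "wb") as f:
            f.write(b"x" * 4096)
        os.unlink(path)
        done += 1
    counts[worker_id] = done


def run_file_churn(num_threads, files_per_thread):
    """Run the workload in one shared directory.

    Returns (files_processed, elapsed_seconds).
    """
    counts = [0] * num_threads
    with tempfile.TemporaryDirectory() as directory:
        threads = [
            threading.Thread(target=_churn,
                             args=(directory, w, files_per_thread, counts))
            for w in range(num_threads)
        ]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        elapsed = time.perf_counter() - start
    return sum(counts), elapsed
```

    Running `run_file_churn(n, 200)` with n equal to the core count and then several times the core count, on each kernel, would show whether files per second holds up once threads outnumber cores.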
