
Thread: Slowdown when computing in parallel on multicore CPU

  1. #1
    Join Date
    Feb 2011
    Location
    Near Frankfurt, Germany
    Posts
    2

    Slowdown when computing in parallel on multicore CPU

    A couple of months ago I observed some unusual behaviour when I ran integer arithmetic on a single core, then on two cores, then on three cores, and so on. I own an octa-core processor (AMD FX(tm)-8120, 3.1GHz). I would have assumed that, as long as enough cores are idling, adding more work to my CPU does not adversely affect performance per task. But quite the contrary: performance drops by up to 60%. Right now I have no convincing explanation for this behaviour.

    The same is true for an Intel i7 (i7-2640, 2.8GHz). Performance deteriorates until all cores are in use; from that point on, no further slowdown can be observed.

    The same is also true for floating-point calculations on both processors (AMD and Intel). The problem at hand was completely artificial, and there seems to be no memory contention, as all values can easily be kept within the CPU entirely. See http://eklausmeier.wordpress.com/201...ndant-on-load/.
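
    For reference, the workload is roughly along the following lines (the actual intpoly source is on the blog post above; the coefficient handling and iteration count below are simplified for illustration and are not the original program):

    Code:
    /* Sketch of an artificial integer workload: read four polynomial
     * coefficients from stdin (e.g. "echo 2 -1 0 -2 | ./intpoly") and
     * evaluate the polynomial many times with Horner's rule. All values
     * fit in registers, so there is no memory contention. */
    #include <stdio.h>

    int main(void)
    {
        long c[4], sum = 0;

        for (int i = 0; i < 4; ++i)
            if (scanf("%ld", &c[i]) != 1)
                return 1;

        for (long x = 0; x < 200000000L; ++x) {
            long p = c[0];
            for (int i = 1; i < 4; ++i)
                p = p * (x & 0xff) + c[i];
            sum += p;
        }

        /* print the result so the compiler cannot remove the loop */
        printf("%ld\n", sum);
        return 0;
    }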

    Thank you for any comments on this.

  2. #2
    Join Date
    Jul 2009
    Location
    Germany
    Posts
    542


    I think what you see here is the process hopping from one core to another while the others are idle. That causes huge slowdowns. It does not happen when all cores are busy -> exactly what you described.
    IIRC Linux 3.8 and 3.9-rcX include some optimizations to reduce this core jumping, but you can try to intervene manually and use 'taskset' to set the process's CPU affinity ( http://www.cyberciti.biz/tips/settin...r-process.html )
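
    If you want to pin the process from inside the program instead of wrapping it with taskset, something along these lines should work (the CPU number is just an example):

    Code:
    /* Sketch: pin the calling process to one CPU via sched_setaffinity.
     * Pid 0 means "the calling process"; CPU 2 is an arbitrary choice. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;

        CPU_ZERO(&mask);
        CPU_SET(2, &mask);

        if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* ... run the actual computation here ... */
        return 0;
    }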

    /edit:
    http://git.kernel.org/cgit/linux/ker...5da40911d95cf6
    Last edited by droste; 03-18-2013 at 07:17 PM.

  3. #3
    Join Date
    Feb 2011
    Location
    Near Frankfurt, Germany
    Posts
    2


    Thanks droste. That's an interesting aspect that I hadn't thought of.

    I tried
    Code:
    for i in `seq 1 6`; do echo 2 -1 0 -2 | time -f "%e %U %S" taskset -c $i ./intpoly -n0 & done
    but this didn't show any difference in execution time compared to the same command without taskset. So the basic problem remains: CPU time per process increases the more processes there are.
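
    To rule out that the affinity mask silently gets lost, one could also query it from inside the program, e.g. like this (illustrative only, not part of intpoly):

    Code:
    /* Sketch: print the CPUs this process is currently allowed to run on. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;

        if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_getaffinity");
            return 1;
        }
        for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
            if (CPU_ISSET(cpu, &mask))
                printf("allowed on CPU %d\n", cpu);
        return 0;
    }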

  4. #4
    Join Date
    Jul 2009
    Location
    Germany
    Posts
    542


    I just tried it on my PC (i5 with 4 cores, no HT) on Linux 3.9-rc2:

    Code:
    $ for i in `seq 0 3`; do echo 2 -1 0 -2 | time -f "%e %U %S" taskset -a -c `expr $i % 4` ./intpoly -n0 & done
    and

    Code:
    $ for i in `seq 0 3`; do echo 2 -1 0 -2 | time -f "%e %U %S" ./intpoly -n0 & done
    And the results are basically the same for both. But I always get (nearly) the same execution time per process, no matter whether I start 1, 2, 3 or 4 processes, and with more processes than cores the execution time increases almost linearly with the number of processes: 1 process -> ~3 sec, 4 processes -> ~3 sec, 40 processes -> ~30 sec, which is what you would expect when 40 processes share 4 cores (about 10 x ~3 sec per core). User time and sys time stay the same.

    So I can't reproduce your original results.
