
Thread: GCC performance testing request

  1. #11
    Join Date
    Jul 2009


    We all love ricer flags; just ask any Gentoo user how much faster they make your system. =O

    But seriously, I would be interested in the difference the Graphite framework makes in the newer (4.4+) GCCs with multicore systems.

  2. #12
    Join Date
    Mar 2009


    Currently on my i7 920 box with gcc-4.4.2

    CFLAGS="-march=native -O3 -msse4 -mmmx -floop-interchange -floop-strip-mine -floop-block -pipe"
    With the big 8 MB shared L3 cache, I think -O3 is probably beneficial more often than not. The last benchmark I read, which used Core 2s, showed -O2 and -O3 about even, with -O3 maybe pulling a bit ahead.
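
One way to check the -O2 versus -O3 guess on your own box, rather than relying on an old Core 2 benchmark, is to build the same program once per flag set and time each binary. Below is a minimal sketch, assuming gcc is on the PATH and was built with Graphite support. The file name bench.c, the output names, and the helper functions are all hypothetical placeholders. -msse4 and -mmmx are left out because -march=native should already enable them on this CPU.

```python
import subprocess
import time

# Flag sets to compare. The Graphite set mirrors the CFLAGS quoted above; it
# needs a GCC built with Graphite support, which is an assumption here.
FLAG_SETS = {
    "O2": ["-march=native", "-O2", "-pipe"],
    "O3": ["-march=native", "-O3", "-pipe"],
    "O3-graphite": ["-march=native", "-O3", "-floop-interchange",
                    "-floop-strip-mine", "-floop-block", "-pipe"],
}


def build_command(source, output, flags, compiler="gcc"):
    """Return the compiler command line as an argument list."""
    return [compiler, *flags, "-o", output, source]


def time_binary(path, runs=3):
    """Run the binary `runs` times; return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([path], check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(source="bench.c"):
    """Compile `source` once per flag set and return {name: best_seconds}."""
    results = {}
    for name, flags in FLAG_SETS.items():
        output = f"./bench-{name}"
        subprocess.run(build_command(source, output, flags), check=True)
        results[name] = time_binary(output)
    return results
```

Calling compare("bench.c") returns a dict of best times per flag set. Taking the best of several runs is a deliberate choice to cut down on noise from whatever else the machine is doing.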

  3. #13
    Join Date
    Jan 2007


    I thought I came across Intel doing some GCC compile testing in the latest Linux kernel podcast - here's the link

  4. #14


    Quote Originally Posted by hmmm
    I thought I came across Intel doing some GCC compile testing in the latest Linux kernel podcast - here's the link
    The net result doesn't seem obvious to me.

    The first post clearly shows -O2 beating -Os, but later on the -O2 kernel won only one test, and it seemed that, given more time and diligence, a kernel built with -Os would equal the -O2 one.

    However, I digress. I still want someone to run the Phoronix tests with these three compilers.

  5. #15
    Join Date
    Aug 2007


    Quote Originally Posted by birdie
    Can you PLEASE test these versions of GCC: 4.2.4, 4.3.4 and 4.4.2 using whatever benchmarks you like (the more the better).
    There was a GCC comparison earlier this year

    As for the compiler flag choices, check out ACOVEA and

    An article by Dunlop and others from 2008, "On the Use of a Genetic Algorithm in High Performance Computer Benchmark Tuning", concluded:

    This paper has addressed the issue of extracting the best adapted parameters for the HPL reference benchmark. Adjustment
    of the seventeen tuning parameters to achieve maximum performance is a time-consuming task that must be performed
    by hand. The use of a genetic algorithm is proposed here to manage this task with individuals corresponding to an
    HPL run. Indeed we do not provide here a description of a particular version of a GA. The Acovea framework has been
    used to validate the approach over a Beowulf cluster composed of heterogeneous resources: a majority of so-called
    “small” nodes and two “large” nodes. In particular, starting from a hand-tuned performance of 84 Gflops, it was possible
    to attain the peak performance of 111.6 Gflops on the cluster using a set of parameters determined nearly automatically by
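
To give a feel for the idea of searching flag combinations with a genetic algorithm, here is a minimal sketch. It is a generic GA, not Acovea's actual implementation. The flag list is only illustrative, and toy_fitness with its weights is an invented stand-in; a real run would compile and time a benchmark for every individual instead.

```python
import random

# Candidate flags; each individual is a tuple of on/off bits, one per flag.
FLAGS = ["-funroll-loops", "-ftree-vectorize", "-floop-interchange",
         "-floop-strip-mine", "-floop-block", "-fomit-frame-pointer"]


def toy_fitness(bits):
    """Stand-in score (higher is better). The weights are invented; a real
    run would build and time a benchmark with the enabled flags."""
    weights = [3, 5, 2, -1, 4, 1]
    return sum(w for w, b in zip(weights, bits) if b)


def evolve(fitness, n_flags, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Simple generational GA: tournament selection, one-point crossover,
    bit-flip mutation, and one elite carried over unchanged."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_flags))
           for _ in range(pop_size)]

    def pick():
        # Tournament of two: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [max(pop, key=fitness)]  # elitism: keep the current best
        while len(next_pop) < pop_size:
            mum, dad = pick(), pick()
            cut = rng.randint(1, n_flags - 1)  # one-point crossover position
            child = mum[:cut] + dad[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


best = evolve(toy_fitness, len(FLAGS))
print(" ".join(flag for flag, bit in zip(FLAGS, best) if bit))
```

The expensive part in practice is the fitness call, since every evaluation means a full compile and benchmark run, so population size and generation count end up being a trade-off against wall-clock time.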
    Last edited by sabriah; 12-14-2009 at 07:23 AM.
