Thread: CompilerDeathMatch: surprising results

  1. #11
    Join Date
    Jan 2010
    Location
    Ghent
    Posts
    198

    Default

    Indeed.

    These results are quite surprising as such.

    Yet another update:
    http://global.phoronix-test-suite.co...66-29958-25932

    Open64 came out as leader in the "benchmarks without flags" category, followed by TCC.

    From now on the comparisons will be with flags, and we will see how that changes the results compared to the baseline shown here for each compiler.

  2. #12
    Join Date
    Jul 2008
    Location
    Greece
    Posts
    3,766

    Default

    I'd suggest -O2, which for GCC and ICC is the most commonly used. As for GCC arch flags, I guess what distros use is a good idea? That would be "-march=i686 -mtune=generic" for 32bit. Not sure what is used for 64bit though.

  3. #13
    Join Date
    Jan 2010
    Location
    Ghent
    Posts
    198

    Default

    Quote Originally Posted by RealNC View Post
    I'd suggest -O2, which for GCC and ICC is the most commonly used. As for GCC arch flags, I guess what distros use is a good idea? That would be "-march=i686 -mtune=generic" for 32bit. Not sure what is used for 64bit though.
    Thanks for your input. I will try to work out which flags are most relevant. I might have to set different ones for different compilers too, I guess (need to read some manpages).

    Feel free to suggest improvements. The more measurement points the clearer the picture (hopefully).

  4. #14
    Join Date
    Jan 2010
    Location
    Ghent
    Posts
    198

    Default updated with optimizations -GCC

    A small update with GCC optimizations

    http://global.phoronix-test-suite.co...039-6323-30452

    Not much changed when optimizations were used (I suppose there are default settings to start with). Surprisingly, -O2 often performed better than -O3. I am considering trying some LTO later.

    Next up, however, will be similar analyses of optimization levels for ICC, Clang and other compilers that have such options.
    After that I think I will move on to 32-bit benchmarks, where there are a number of other interesting compilers to test...

  5. #15
    Join Date
    Jan 2010
    Location
    Ghent
    Posts
    198

    Default

    Yet another update with icc optimizations

    http://global.phoronix-test-suite.co...1284-1397-9429

    @Michael: sorry if I am spamming phoronix global. I am just of the philosophy "release early, release often" so that feedback can be given as soon as possible.

    Next up will be Clang optimizations. Hopefully it will then regain some of its luster, like icc did with -O2 flags. In general, -O3 seems to be a bad performance choice.

  6. #16
    Join Date
    Jun 2010
    Posts
    10

    Default

    @staalmannen
    Since you'll be running more tests, could you add -Os (optimise for size) to the gcc optimisation options tested? Also for icc, clang and pcc if they have a similar option.

    It is well known that -O3 leads to better performance than -O2 only in very specific cases. Part of the reason is that -O3 binaries are larger, which makes me suspect that -Os could perform better than -O2 in some cases.

  7. #17
    Join Date
    Nov 2009
    Posts
    328

    Default

    @staalmannen

    Nice results. Thanks for sharing them. I usually use ICC to compile mplayer, achieving around 10% more speed than with GCC. If you don't mind, could you try these flags on ICC:

    -xSSSE3 -fast -fp-model fast=1 -unroll-aggressive

    -xSSSE3: sets your processor type to Core 2
    -fast: enables the major speed optimization options: -ip -O3 -static
    -unroll-aggressive: unrolls loops
    -fp-model fast=1: enables floating-point optimizations (-fp-model fast=2 enables more floating-point optimizations but gives less accurate results)
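
The accuracy remark about the faster floating-point models can be illustrated without ICC. Relaxed models generally allow the compiler to reorder floating-point operations, and reordering changes rounding. The following is a minimal Python sketch of that effect only, not a reproduction of what ICC does; the values `big` and `small` are chosen purely for illustration.

```python
# Floating-point addition is not associative, so an optimizer that is
# allowed to reorder operations can change the numeric result.
# Illustrative values only; this does not model any specific compiler.
big = 1e16
small = 1.0

# Order as written in the source: add the small term first, then cancel.
as_written = (big + small) - big

# One reordering an optimizer might choose under relaxed rules:
# cancel the large terms first, then add the small one.
reordered = (big - big) + small

print("as written:", as_written)
print("reordered: ", reordered)
```

The two expressions are algebraically identical but print different values, which is why results from aggressive floating-point settings are worth checking against a strict build.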

  8. #18
    Join Date
    Nov 2009
    Posts
    328

    Default

    Quote Originally Posted by Mo6eB View Post
    It is well known that -O3 leads to better performance than -O2 only in very specific cases. Part of the reason is that -O3 binaries are larger, which makes me suspect that -Os could perform better than -O2 in some cases.
    It's true, and some people have already measured this. Because -Os produces small executables, the CPU wastes less time moving data in and out of the cache, and in some cases this performs better than -O2 and -O3. This matters even more on CPUs with small caches. Some kernel devs recommend the -Os flag for compiling the kernel.

  9. #19
    Join Date
    Jan 2010
    Location
    Ghent
    Posts
    198

    Default

    Quote Originally Posted by Jimbo View Post
    @staalmannen

    Nice results. Thanks for sharing them. I usually use ICC to compile mplayer, achieving around 10% more speed than with GCC. If you don't mind, could you try these flags on ICC:

    -xSSSE3 -fast -fp-model fast=1 -unroll-aggressive

    -xSSSE3: sets your processor type to Core 2
    -fast: enables the major speed optimization options: -ip -O3 -static
    -unroll-aggressive: unrolls loops
    -fp-model fast=1: enables floating-point optimizations (-fp-model fast=2 enables more floating-point optimizations but gives less accurate results)
    Sure, I will try that after I have tried -O2 and -O3 for Clang and Open64, along with the -Os tests for the four compilers supporting it (ICC, GCC, Clang, Open64).

    If anyone knows what flags are recommended for tcc and pcc I am all ears.

    In addition, if anyone knows how to "unclutter" a big result file on Phoronix Global, that would be appreciated.

    I still want all data in one graph since that actually gives additional value (comparisons between compilers X different optimization levels).

    One pattern that seems to be emerging, for example, is that longer compile times do not necessarily produce faster binaries (a trade-off that is often assumed when interpreting compiler comparisons).
    Unfortunately, binary size is not part of the current compiler benchmark suite. It would have been nice if the suite stored binary sizes for each compilation...
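
Until the suite records it, compile time and binary size per flag set can be collected by hand. The following is a rough Python sketch under several assumptions: the compiler is called `gcc` and is on the PATH, the flag list is only an example, and the one-line C program stands in for real benchmark sources. The helper names (`build_command`, `measure`) are made up for this illustration and are not part of the Phoronix Test Suite.

```python
import os
import shutil
import subprocess
import tempfile
import time

# Example flag sets only; check each compiler's manual before reusing them.
FLAG_SETS = ["-O0", "-O2", "-O3", "-Os"]

# Tiny stand-in program; a real run would use the benchmark sources.
SOURCE = "int main(void) { return 0; }\n"


def build_command(compiler, flags, src, out):
    """Return the argument list for one compilation."""
    return [compiler, *flags.split(), src, "-o", out]


def measure(compiler, flags, src, workdir):
    """Compile once; return (seconds, size_in_bytes), or None on failure."""
    out = os.path.join(workdir, "a_" + flags.strip("-").replace(" ", "_"))
    start = time.perf_counter()
    result = subprocess.run(build_command(compiler, flags, src, out),
                            capture_output=True)
    elapsed = time.perf_counter() - start
    if result.returncode != 0:
        return None
    return elapsed, os.path.getsize(out)


def main(compiler="gcc"):
    # Skip quietly when the assumed compiler is not installed.
    if shutil.which(compiler) is None:
        print(f"{compiler} not found, nothing measured")
        return
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "demo.c")
        with open(src, "w") as handle:
            handle.write(SOURCE)
        for flags in FLAG_SETS:
            measured = measure(compiler, flags, src, workdir)
            if measured is None:
                print(f"{flags}: compilation failed")
            else:
                seconds, size = measured
                print(f"{flags}: {seconds:.3f} s, {size} bytes")


if __name__ == "__main__":
    main()
```

A single compilation of a trivial file says little on its own; for a real comparison the timing should be repeated and the sizes taken from the actual benchmark binaries.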

  10. #20

    Default

    A good choice of -march might be 'native'. Also see http://en.gentoo-wiki.com/wiki/Safe_Cflags

    In my own tests with a Fortran simulation code, -O3 beats -Os (though this is probably not generally true)
    http://www.hep.man.ac.uk/u/sam/zgoubi-optimise/oberon/

    With GCC you might want to look at LTO. -O3 + LTO can make smaller binaries than -Os
    http://gcc.gnu.org/wiki/summit2010?a...et=hubicka.pdf

    Also, I remember reading an article about how big caches and clever precaching on modern CPUs mean that -O3 is better than -Os now. I think it was a report by Intel, but I can't find it.
