AMD Llano Compiler Performance

  • AMD Llano Compiler Performance

    Phoronix: AMD Llano Compiler Performance

    Last week Phoronix published a set of AMD Fusion A8-3850 Linux benchmarks; this week is a look at the AMD Fusion "Llano" APU performance when trying out a few different compilers: in particular, the latest GCC release and the highly promising Clang compiler on LLVM, the Low-Level Virtual Machine.

  • #2
    Nice benchmarks. Michael, there are many programmers in your forum, so benchmarks like this are very useful. Beyond comparing the compilers, they help answer the question: "Which CPU should I buy to get the most compiler performance for the buck?"
    Thanks.


    • #3
      Obviously kidding, but...

      LLVM is used by Mono? Well, I guess I'll have to remove it from my system. I wouldn't use any applications infected by that.

      You probably shouldn't, either.


      • #4
        As usual, each test case was using its stock compiler flags during the build process for each of the tested compilers.
        That's not even -O2, right?


        • #5
          Originally posted by ChrisXY
          That's not even -O2, right?
          It depends on the software and what's in the makefiles. Most software uses -O2; libav / ffmpeg and now Firefox use -O3, I believe.


          • #6
            Compiler flags matter

            I'd like to note that the Phoronix habit of using whatever compiler flags a package happens to ship with, without caring much which flags those are, shows up e.g. in the PovRay numbers.
            povray-3.6.1, being a 2004ish package, has this in its configure script:
            k8-*|x86_64-*) pov_arch="k8"; pov_arch_fallback="i686";;
            This means that if the compiler accepts -march=k8 -mtune=k8 (along with -O3 -msse2 and a couple of other options), those flags are used to compile it, instead of tuning for the CPU you are compiling on, or at least for contemporary CPUs.
            Measuring the speed of a program optimized for a completely different CPU than the one you are using is uninteresting; either you tune for contemporary CPUs (as most distributions do, and as several compilers even do by default), or you optimize for your own CPU.
            In povray 3.7.0 rc3 this has changed quite a bit (though it is still at least two years behind on the CPUs and features it wants to use).
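To make the effect of that configure line concrete, here is a rough Python sketch of what the quoted `case` branch does. Only the two patterns and the `k8` / `i686` values come from the post; the helper names `pick_arch` and `tuning_flags` are made up for illustration, and the other branches of the real script are not shown here, so they are not modeled.

```python
from fnmatch import fnmatchcase

# Patterns from the single configure branch quoted in the post.
K8_PATTERNS = ("k8-*", "x86_64-*")


def pick_arch(host_triplet):
    """Return (pov_arch, pov_arch_fallback) for a host triplet, or None.

    Hypothetical helper: it mimics only the quoted shell `case` branch.
    """
    if any(fnmatchcase(host_triplet, pattern) for pattern in K8_PATTERNS):
        return ("k8", "i686")
    # Other branches of the real script are not quoted in the post.
    return None


def tuning_flags(host_triplet):
    """Build the -march/-mtune flags the quoted branch would lead to."""
    choice = pick_arch(host_triplet)
    if choice is None:
        return []
    arch, _fallback = choice
    return [f"-march={arch}", f"-mtune={arch}"]


# Any 64-bit x86 host triplet matches, whatever CPU is actually installed.
print(tuning_flags("x86_64-unknown-linux-gnu"))
```

The point of the sketch is that the choice depends only on the host triplet string, so a Llano or a Sandy Bridge box reporting `x86_64-*` ends up with K8 tuning.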
            E.g. gcc is by default configured to use -mtune=generic, which tunes for recentish Intel and AMD CPUs, but it also supports -march=native/-mtune=native, which tune for the CPU running the compiler, as well as -Ofast (-O3 plus optimizations such as -ffast-math that disregard strict standards compliance).
            I don't have a Llano CPU, so I couldn't repeat the measurements there, but I have run it (just a single time each, only to show that the compiler flags really matter) on an Intel i7-2600 CPU:

            gcc 4.6.0 20110603 (Red Hat 4.6.0-10) -O3 -march=k8 -mtune=k8
            Total Time: 0 hours 11 minutes 30 seconds (690 seconds)
            gcc 4.6.0 20110603 (Red Hat 4.6.0-10) -O3 -march=x86-64 -mtune=generic
            Total Time: 0 hours 8 minutes 24 seconds (504 seconds)
            gcc 4.6.0 20110603 (Red Hat 4.6.0-10) -O3 -march=corei7 -mtune=corei7
            Total Time: 0 hours 8 minutes 10 seconds (490 seconds)
            gcc 4.6.0 20110603 (Red Hat 4.6.0-10) -O3 -march=corei7-avx -mtune=corei7-avx
            Total Time: 0 hours 8 minutes 10 seconds (490 seconds)
            gcc 4.7.0 20110822 (experimental) -O3 -march=corei7-avx -mtune=corei7-avx
            Total Time: 0 hours 7 minutes 57 seconds (477 seconds)
            clang 2.9 (tags/RELEASE_29/final) -O3 -march=k8 -mtune=k8
            Total Time: 0 hours 9 minutes 28 seconds (568 seconds)
            clang 2.9 (tags/RELEASE_29/final) -O3 -march=corei7 -mtune=corei7
            Total Time: 0 hours 9 minutes 32 seconds (572 seconds)
            clang 2.9 (tags/RELEASE_29/final) -O3 -march=corei7 -mtune=corei7 -mavx
            Compiler Crash

            As can be seen, yes, gcc's k8-tuned code on Intel Sandy Bridge is significantly slower than clang's k8-tuned code, but all the other gcc tunings, even just tuning for a generic CPU, are faster, some significantly. From what I've seen, similar hardcoded options exist in many other Phoronix benchmarks (some time ago I even saw plain -O being used in one of them; I think it was the byte benchmark, and I haven't rechecked whether it has been fixed since). -O, i.e. -O1, is (at least for GCC) defined to do only some cheap optimizations, with the emphasis on fast compilation and on not making the code much harder to debug.
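For anyone who wants the comparison as numbers, a small Python sketch like the following turns the timings above into "percent less time than the gcc k8 build". The seconds are copied from the results in this post (single runs on the i7-2600, so treat them as indicative only); the function name `percent_less_time` is just an illustrative choice, and the clang -mavx run is left out because the compiler crashed.

```python
# Wall-clock seconds copied from the results listed in the post.
runs = [
    ("gcc 4.6.0", "-O3 -march=k8 -mtune=k8", 690),
    ("gcc 4.6.0", "-O3 -march=x86-64 -mtune=generic", 504),
    ("gcc 4.6.0", "-O3 -march=corei7 -mtune=corei7", 490),
    ("gcc 4.6.0", "-O3 -march=corei7-avx -mtune=corei7-avx", 490),
    ("gcc 4.7.0", "-O3 -march=corei7-avx -mtune=corei7-avx", 477),
    ("clang 2.9", "-O3 -march=k8 -mtune=k8", 568),
    ("clang 2.9", "-O3 -march=corei7 -mtune=corei7", 572),
]


def percent_less_time(baseline_s, other_s):
    """How much less wall-clock time `other_s` is than `baseline_s`, in percent."""
    return 100.0 * (baseline_s - other_s) / baseline_s


# Use the gcc k8 build (the flags povray-3.6.1 hardcodes) as the baseline.
baseline = runs[0][2]
for compiler, flags, seconds in runs:
    saved = percent_less_time(baseline, seconds)
    print(f"{compiler:10} {flags:42} {seconds:4d} s  {saved:5.1f}% less time")
```

On these figures the generic gcc build needs roughly 27% less time than the k8 build, and the clang k8 build roughly 18% less.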

            Until this is changed, I think the compiler benchmarks aren't really useful at all.
