LLVM Clang 12 Compiler Is Performing Very Well For AMD Ryzen 9 5950X / Zen 3

  • LLVM Clang 12 Compiler Is Performing Very Well For AMD Ryzen 9 5950X / Zen 3

    Phoronix: LLVM Clang 12 Compiler Is Performing Very Well For AMD Ryzen 9 5950X / Zen 3

    Earlier this week I posted some benchmarks looking at the compiler performance of GCC 11 vs. LLVM Clang 12 on the Intel Core i9 11900K "Rocket Lake" processor while in this article the same tests and same software are being carried out on an AMD Ryzen 9 5950X "Zen 3" desktop. With these AMD Linux tests the Clang 12 compiler not only yielded the fastest binaries at -O2 but carried through in the more optimized configurations as well.


  • #2
    Is there something wrong with -flto on GCC 11?



    • #3
      What surprises me is how large the differences are, in both directions, especially in the first half of the benchmarks.



      • #4
        Originally posted by CochainComplex
        Is there something wrong with -flto on GCC 11?
        It seems that -flto regresses only on the ncnn benchmark, but since only a few benchmarks were run, that one result affects the geomean noticeably. I will take a look at it.

        As mentioned in the other thread, when comparing performance with -fpic/-fPIC one needs to take into account that gcc defaults to -fsemantic-interposition (as specified by the ELF standard) while clang defaults to -fno-semantic-interposition. This affects performance noticeably, since semantic interposition blocks inter-procedural optimization. So I would use -fno-semantic-interposition for gcc (note that -fno-semantic-interposition in clang is buggy in that it localises variables).
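To make the interposition point concrete, here is a minimal sketch that is not from the original post. The file name and the functions `scale` and `scale_sum` are hypothetical; the only flags used are the ones named above plus the usual `-shared`.

```c
/* interpose.c: hypothetical shared-library source (all names illustrative).
 *
 * Possible builds to compare (assumed command lines, not from the post):
 *   gcc -O2 -fPIC -shared interpose.c -o libdemo.so
 *   gcc -O2 -fPIC -fno-semantic-interposition -shared interpose.c -o libdemo.so
 */

/* Exported function with default visibility. Under ELF interposition
 * rules, another definition of scale() (for example from a preloaded
 * library) may replace this one at run time. */
int scale(int x)
{
    return 3 * x;
}

/* When semantic interposition is honoured, the compiler has to assume
 * that scale() might be replaced, so it cannot freely inline it or
 * propagate facts about its body into this loop. With
 * -fno-semantic-interposition it is allowed to treat the definition
 * above as the one that will be called. */
int scale_sum(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += scale(v[i]);
    return total;
}
```

Comparing the generated assembly for `scale_sum` between the two builds (for instance with `objdump -d`) should show whether the call to `scale` was kept or inlined; the exact result depends on the compiler version.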

        For -O2 the main difference is that clang enables vectorization, while gcc needs -ftree-vectorize -ftree-slp-vectorize for that. I hope the default will be changed in the future. For the type of benchmarks tested here, vectorization makes a noticeable difference.
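As a rough illustration of the -O2 vectorization difference, the loop below is the kind of code where it shows up. This is a sketch that is not from the original post: the file name and the `saxpy` function are hypothetical, and the reporting flags in the comment (`-fopt-info-vec` for gcc, `-Rpass=loop-vectorize` for clang) are optional diagnostics.

```c
/* saxpy.c: hypothetical example of a simple vectorizable loop.
 *
 * Possible builds to compare (assumed command lines, not from the post):
 *   gcc   -O2 -c saxpy.c                                   (GCC 11: no vectorization at -O2)
 *   gcc   -O2 -ftree-vectorize -fopt-info-vec -c saxpy.c   (reports vectorized loops)
 *   clang -O2 -Rpass=loop-vectorize -c saxpy.c             (reports vectorized loops)
 */
#include <stddef.h>

/* Computes y[i] += a * x[i] for i in [0, n).
 * The restrict qualifiers promise that x and y do not overlap, which
 * lets the vectorizer skip run-time alias checks. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Whether a given compiler actually vectorizes this loop depends on its version and target options, so the diagnostic output is the thing to check rather than assuming it.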




        • #5
          Thanks for your insights - I always like the additional information you provide.
