AMD Publishes Zen 2 Compiler Patch "znver2" Exposing Some New Instructions


    Phoronix: AMD Publishes Zen 2 Compiler Patch "znver2" Exposing Some New Instructions

    With GCC 9 feature development ending in November, AMD today sent out their first patch enabling Zen 2 support in the GNU Compiler Collection via the new "znver2" target...

  • #2
    I wonder whether they are going to fix the scheduler cost tables, because right now code compiled with znver1 is in most cases slower than code compiled with haswell or skylake when run on a Ryzen CPU.


    • #3
      Originally posted by Anty
      I just wonder if they are going to fix scheduler cost tables because right now code compiled with znver1 in most cases is slower than compiled with haswell or skylake when run on ryzen CPU.

      Latencies and the scheduler were retuned after the final hardware arrived (for GCC 8). If you have benchmarks where GCC 8 produces worse code with Zen tuning than with Skylake tuning, I would be interested in looking into them.


      • #4
        Originally posted by hubicka
        Latencies and scheduler was retuned after final hardware arrived (for gcc 8). If you have benchmarks where gcc 8 produces worse code with zen tuning than skylake, I would be interested in looking into them.

        A very quick response from the developer himself. I was about to answer him that you had done some optimization work in this area, but I was no longer sure whether it went into GCC 8 or GCC 9. By the way, I noticed your June 2018 update on LTO and that it went through smoothly. Well done!

        One question though: have you experimented with enabling GRAPHITE optimizations on top of LTO for packages where it could be beneficial? I don't have any hard numbers, but I did notice a notable reduction in RAM usage with these flags on my custom kernel build. As I also used a custom config, I cannot attribute all of it to GRAPHITE. Many improvements went into GCC 8 for GRAPHITE, so it has become more usable (I got ICEs before on ffmpeg and VLC with GCC 7), and I'd like to see all of this optimization work used more widely across the Linux world. Could you tell me what is holding back more widespread use of these processor-agnostic optimizations by the distros?


        • #5
          Originally posted by hubicka
          Latencies and scheduler was retuned after final hardware arrived (for gcc 8). If you have benchmarks where gcc 8 produces worse code with zen tuning than skylake, I would be interested in looking into them.

          When I have free time I can look for the cases I mentioned, but AFAIR GCC 8 also exhibits this behavior in AVX/AVX2-heavy code. The impact was not huge, but it was measurable and reproducible.


          • #6
            You previously wrote:

            When checking the latest model data, the later Family 17h Models up through 2Fh (47) are indeed for Zen 2.
            but this patch says:
            Code:
            + if (model >= 0x30)
            + __cpu_model.__cpu_subtype = AMDFAM17H_ZNVER2;
            Last edited by BoMbY; 31 October 2018, 08:01 AM.
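
The model number discussed above comes from CPUID leaf 1. As a minimal sketch (not GCC's actual code), the following C shows how the display family and model are assembled from the EAX bit fields under the AMD convention, and then applies the same `model >= 0x30` threshold quoted from the patch. The function and enum names are hypothetical, and models below 0x30 are simply reported as "other" because the quoted snippet does not say how they are classified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for this sketch; these are not GCC identifiers. */
enum fam17h_class { NOT_FAM17H, FAM17H_OTHER, FAM17H_ZNVER2 };

/* Decode display family and model from CPUID leaf 1 EAX (AMD convention):
 * when the base family is 0xF, the extended family is added to it and the
 * extended model supplies the high nibble of the model. */
static void decode_family_model(uint32_t eax, unsigned *family, unsigned *model)
{
    assert(family != NULL && model != NULL);

    unsigned base_model  = (eax >> 4) & 0xF;   /* bits 7:4   */
    unsigned base_family = (eax >> 8) & 0xF;   /* bits 11:8  */
    unsigned ext_model   = (eax >> 16) & 0xF;  /* bits 19:16 */
    unsigned ext_family  = (eax >> 20) & 0xFF; /* bits 27:20 */

    *family = base_family;
    *model = base_model;
    if (base_family == 0xF) {
        *family += ext_family;
        *model |= ext_model << 4;
    }
}

/* Apply the threshold from the quoted patch: Family 17h with model >= 0x30
 * is treated as znver2. Anything else in Family 17h is left as "other". */
static enum fam17h_class classify_fam17h(uint32_t eax)
{
    unsigned family, model;
    decode_family_model(eax, &family, &model);

    if (family != 0x17)
        return NOT_FAM17H;
    return model >= 0x30 ? FAM17H_ZNVER2 : FAM17H_OTHER;
}
```

For example, an EAX value of 0x00870F10 decodes to family 17h, model 71h and would be classified as znver2 by this check, while 0x00800F11 decodes to family 17h, model 01h and would not. On real hardware the EAX value could be read with `__get_cpuid` from GCC's `<cpuid.h>`.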


            • #7
              If I remember correctly, those cache instructions are for PMEM (Persistent Memory) such as Intel's Optane NVDIMMs. This looks like AMD adding these for compatibility.

              Does anyone know what AMD's plans are regarding NVDIMM and PMEM?


              • #8
                Originally posted by hubicka
                Latencies and scheduler was retuned after final hardware arrived (for gcc 8). If you have benchmarks where gcc 8 produces worse code with zen tuning than skylake, I would be interested in looking into them.

                Jan, I have generated a comparison of sandybridge, ivybridge, haswell, broadwell, skylake and znver1 at optimization levels -O2, -O3, -Ofast and -Os, for GCC 7.3 and GCC 8.2.
                What I realize now is that GCC 8.2 has serious problems compared to 7.3, sometimes by more than 50%! In any case, my report shows that -march= haswell, broadwell or skylake is a win-win when using -Ofast.

                I will send the full details to your private email soon.
