The Technical Workloads Where AMD Ryzen 9 7900X3D/7950X3D CPUs Are Excellent

  • The Technical Workloads Where AMD Ryzen 9 7900X3D/7950X3D CPUs Are Excellent

    Phoronix: The Technical Workloads Where AMD Ryzen 9 7900X3D/7950X3D CPUs Are Excellent

    While the AMD Ryzen 9 7900X3D and Ryzen 9 7950X3D are promoted as great "gaming processors", these new Zen 4 desktop CPUs with 3D V-Cache also have great capabilities for various technical computing workloads thanks to the hefty cache size. In prior articles I've looked at the Ryzen 9 7900X3D/7950X3D in around 400 workloads on Linux while in this article I am looking more closely at these technical computing areas where these AMD Zen 4 3D V-Cache processors show the most strength and value outside of gaming.

  • #2
    I've looked at this new X3D series with slight disbelief.
    Amazing numbers. Like some loads are double the performance for half the power?!?

    I think the X3D models have paved the way for a new chapter in desktop CPU design philosophy.
    The things we run (+data) have grown fat and it's obvious that this really helps feeding the little exec monsters hidden within.


    • #3
      And to think, AMD still has room for improvement if they can get coverage on all CCDs and solve the clock speed hit with 3rd-gen 3D V-Cache on Zen 5.
      That would mean no detriment for frequency-dependent workloads that can't take advantage of the extra cache, an extreme uplift for multithreaded workloads that are both cache- and frequency-dependent, and no scheduling problems because all cores are equal.
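
As long as the two CCDs differ, one workaround for the scheduling question is to keep a process on a single CCD by hand. The following is a minimal sketch, assuming Linux (where Python exposes `os.sched_setaffinity`). The helper names `parse_cpu_list` and `pin_current_process` are made up for illustration, and the CPU list `"0-7,16-23"` in the usage note is only a hypothetical numbering for one CCD; the real numbering has to be checked on the machine in question.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux-style CPU list such as "0-7,16-23" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_current_process(cpu_list):
    """Restrict the calling process to the requested CPUs (Linux only).

    Only CPUs the process is already allowed to use are kept, so the
    call does not fail on machines with fewer cores than requested.
    """
    wanted = parse_cpu_list(cpu_list)
    allowed = wanted & os.sched_getaffinity(0)
    if not allowed:
        raise ValueError(f"none of the CPUs in {cpu_list!r} are available")
    os.sched_setaffinity(0, allowed)
    return allowed
```

Calling `pin_current_process("0-7,16-23")` before starting the heavy work would keep the process on those logical CPUs, provided they really belong to the CCD you want.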


      • #4
        Originally posted by Namelesswonder
        And to think, AMD still has room for improvement if they can get coverage on all CCDs
        With all the talk about the non-uniform performance characteristics of these cores, I think it's important to remember that all the cores do have access to this expanded L3 cache.

        The only difference is that accessing it is a few clocks slower on some cores, and faster on other cores.

        For workloads that like the bigger cache, even a core on the CCD without the extra L3 cache directly attached to it is still going to perform better than a core on a 7950X or other CPU without the extra cache.
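
How the L3 is actually split between cores can be checked on a running Linux system instead of argued about. The sketch below reads the cache topology that the kernel exposes under `/sys/devices/system/cpu`; it assumes the usual sysfs layout with `level`, `size` and `shared_cpu_list` files per cache index. The function name `l3_domains` is made up for illustration.

```python
from pathlib import Path


def l3_domains(sysfs_root="/sys/devices/system/cpu"):
    """Map each group of CPUs sharing an L3 cache to that cache's size.

    Keys are the kernel's shared_cpu_list strings (for example "0-7"),
    values are the size strings as reported by sysfs.
    """
    domains = {}
    for cpu_dir in sorted(Path(sysfs_root).glob("cpu[0-9]*")):
        for index_dir in sorted((cpu_dir / "cache").glob("index[0-9]*")):
            try:
                if (index_dir / "level").read_text().strip() != "3":
                    continue  # only level-3 caches are of interest here
                shared = (index_dir / "shared_cpu_list").read_text().strip()
                size = (index_dir / "size").read_text().strip()
            except OSError:
                continue  # CPU offline or cache info not exposed
            domains[shared] = size
    return domains


if __name__ == "__main__":
    for cpus, size in l3_domains().items():
        print(f"CPUs {cpus}: L3 {size}")
```

Each printed line is one L3 domain; cores that do not appear in the same `shared_cpu_list` do not share that cache slice directly.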


        • #5
          Originally posted by smitty3268

          With all the talk about the non-uniform performance characteristics of these cores, I think it's important to remember that all the cores do have access to this expanded L3 cache.

          The only difference is that accessing it is a few clocks slower on some cores, and faster on other cores.

          For workloads that like the bigger cache, even a core on the CCD without the extra L3 cache directly attached to it is still going to perform better than a core on a 7950X or other CPU without the extra cache.
          AnandTech reports cross-CCD latency of around 77 ns, which is practically the same as going to DRAM.


          • #6
            Originally posted by Namelesswonder
            And to think, AMD still has room for improvement if they can get coverage on all CCDs and solve the clock speed hit with 3rd-gen 3D V-Cache on Zen 5.
            That would mean no detriment for frequency-dependent workloads that can't take advantage of the extra cache, an extreme uplift for multithreaded workloads that are both cache- and frequency-dependent, and no scheduling problems because all cores are equal.
            I don't have a link for this (as Google SEO is destroying my ability to find anything :/), but AMD said they didn't sell a 5950X3D because inter-chip communication was a huge bottleneck in tests. The IO die apparently couldn't handle so many cross-die L3 requests.


            Anyway, even if that were "fixed," it would still be niche in the presence of low-end EPYC/Threadripper/Intel HEDT. If you can use 32 threads, you can *probably* use more.
            Last edited by brucethemoose; 09 March 2023, 05:17 PM.


            • #7
              Originally posted by milkylainen
              I've looked at this new X3D series with slight disbelief.
              Amazing numbers. Like some loads are double the performance for half the power?!?
              I don't have a sample to prove my guessing, but IMHO there are some "simple" reasons:
              1) Lower clocks require lower voltages. The relationship between voltage and power consumption is roughly quadratic, so small drops in voltage cause sizable drops in power usage.
              2) The non-X3D variants were tuned for extreme clocks to keep up in benchmarks (especially single-core) with high-end Intel parts, which are in turn tuned to the very extremes (see the power consumption of the 13900K, for example). The large L3 cache of the X3D models allows for better all-around performance, so clocks and voltages can be lowered a bit to more reasonable setpoints.

              The power cap also plays into this, because modern processors adapt their frequency depending on load, power and temperature to stay within the "power envelope". There are also benchmarks around of regular X models with a lowered TDP set in the motherboard BIOS, and they still perform almost as well as at full power.

              In my opinion, the Zen 4 cores were already very efficient and performant, and it is not that the X3D models consume little power, but the opposite: the non-X3D models are "overfed" and consume too much (for the reasons stated above).
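
The voltage argument in point 1 can be made concrete with the common textbook approximation for CMOS dynamic power, P ≈ C·V²·f. This is a rough sketch only: it ignores leakage and everything else a real chip does, the function name is made up, and the 10% figures in the usage note are illustrative inputs, not measurements of any Ryzen part.

```python
def relative_dynamic_power(voltage_scale, frequency_scale):
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f.

    Both arguments are ratios against the baseline (1.0 = unchanged).
    The capacitance term C cancels out because it is the same chip.
    """
    return voltage_scale ** 2 * frequency_scale


if __name__ == "__main__":
    # Illustrative only: 10% lower voltage together with 10% lower clock.
    ratio = relative_dynamic_power(0.9, 0.9)
    print(f"relative dynamic power: {ratio:.3f}")
```

Under this approximation, lowering both voltage and clock by 10% leaves about 73% of the original dynamic power, which is the kind of disproportionate saving the post describes.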


              • #8
                I would have expected zstd decompression speed for payloads created with the --long mode to improve thanks to the extended L3 cache.

                "Normal" zstd-compressed files are designed to fit into "normal" cache sizes, so they don't benefit from extended L3 cache.

                But that's probably a subtle distinction which is not measured in benchmarks.
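
A quick way to reason about this is to compare the zstd match window with the available L3. The sketch below assumes that the window is 2^windowLog bytes and that plain `--long` uses a windowLog of 27, as the zstd manual describes. The cache sizes in the usage note (96 MiB and 32 MiB of L3 per CCD) are assumptions about the 7950X3D, and treating "window fits in L3" as the deciding factor is a simplification, not a measured result.

```python
MIB = 1024 * 1024


def window_bytes(window_log):
    """Size in bytes of a zstd match window for a given windowLog."""
    return 1 << window_log


def fits_in_cache(window_log, cache_mib):
    """True if the whole match window fits in a cache of cache_mib MiB."""
    return window_bytes(window_log) <= cache_mib * MIB


if __name__ == "__main__":
    for window_log in (23, 26, 27):
        size_mib = window_bytes(window_log) // MIB
        print(f"windowLog {window_log}: {size_mib} MiB window, "
              f"fits in 96 MiB: {fits_in_cache(window_log, 96)}, "
              f"fits in 32 MiB: {fits_in_cache(window_log, 32)}")
```

With these assumptions, a windowLog of 27 means a 128 MiB window, which is larger than either CCD's L3, while a windowLog of 26 (64 MiB) would fit only on the larger one. A benchmark would have to pick the window size deliberately to show the distinction.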


                • #9
                  As we've already seen with EPYC, 3D V-Cache gives some really nice performance improvements.

                  But one thing worries me: ever since the Ryzen 7000/3D V-Cache CPUs were released, benchmarks on Phoronix tend to be more and more tailored to these new CPUs (read: they favor CPUs with AVX-512, large caches and DDR5).

                  This makes new CPUs look really good compared to previous generations, but how many people really use things like GPAW, ASKAP or OpenFOAM, or perform hydrodynamics simulations or AI inference on a CPU instead of a GPU? These CPUs are surely great for solving turbulence problems on a Cartesian mesh (XCompact3D), but I doubt many of us actually care about that.
                  Such tests may make sense for EPYC, but too many AVX-512/memory-bandwidth-oriented tests on consumer-grade parts can give a skewed picture of a part's typical overall performance.


                  • #10
                    Originally posted by milkylainen
                    I've looked at this new X3D series with slight disbelief.
                    Amazing numbers. Like some loads are double the performance for half the power?!?

                    I think the X3D models have paved the way for a new chapter in desktop CPU design philosophy.
                    The things we run (+data) have grown fat and it's obvious that this really helps feeding the little exec monsters hidden within.
                    And yet, people keep giving money to Intel.
