VOPD Scheduler For Valve's ACO Compiler Merged Into Mesa 24.1

    Phoronix: VOPD Scheduler For Valve's ACO Compiler Merged Into Mesa 24.1

    A pull request implementing a VOPD scheduler for the Valve-developed ACO ("AMD Compiler") back-end, open for the past eight months, has now been merged into Mesa 24.1-devel...

  • #2
    Hi Michael,

    When you have time, could you please run a comparison benchmark of the current state of Zink vs. RadeonSI ACO vs. RadeonSI LLVM?

    • #3
      Kinda pointless telling us about this without explaining what VOPD is or how this will be useful

      • #4
        This PR applies cleanly on top of 24.0.0. Time to see if there is any noticeable difference in performance.

        • #5
          Originally posted by FireBurn:
          Kinda pointless telling us about this without explaining what VOPD is or how this will be useful
          VOPD seems to be an instruction encoding that tells the GPU to execute two arithmetic operations in parallel. So this optimization looks for compatible pairs of instructions and fuses each pair into a single VOPD instruction so that both can be executed in parallel.

          Corrections / additions welcome, this is just from skimming the PR...
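
To make the pairing idea concrete, here is a toy sketch in Python. It is not ACO's algorithm and does not model the real VOPD rules: the `Instr` record, the `PAIRABLE` opcode set, and the independence check are all hypothetical simplifications chosen for illustration. It only fuses neighbouring instructions, whereas a real scheduler can presumably also reorder instructions to create more pairs.

```python
from dataclasses import dataclass

# Hypothetical subset of opcodes allowed in a dual-issue pair (illustrative only).
PAIRABLE = {"v_mul_f32", "v_add_f32", "v_fmac_f32", "v_mov_b32"}


@dataclass(frozen=True)
class Instr:
    op: str       # opcode name
    dst: str      # destination register
    srcs: tuple   # source registers


def independent(a, b):
    """Conservative check: b does not read a's result, and neither
    instruction overwrites a register the other one touches."""
    return a.dst not in b.srcs and a.dst != b.dst and b.dst not in a.srcs


def can_pair(a, b):
    """Assumed pairing rule: both opcodes are pairable and the two are independent."""
    return a.op in PAIRABLE and b.op in PAIRABLE and independent(a, b)


def pair_adjacent(instrs):
    """Greedy single pass: fuse two neighbours when legal, else emit one alone."""
    out = []
    i = 0
    while i < len(instrs):
        if i + 1 < len(instrs) and can_pair(instrs[i], instrs[i + 1]):
            out.append((instrs[i], instrs[i + 1]))
            i += 2
        else:
            out.append((instrs[i],))
            i += 1
    return out


prog = [
    Instr("v_mul_f32", "v0", ("v1", "v2")),
    Instr("v_add_f32", "v3", ("v4", "v5")),  # independent of the multiply above
    Instr("v_add_f32", "v6", ("v0", "v3")),
    Instr("v_mul_f32", "v7", ("v6", "v1")),  # reads v6, so it cannot pair with the add
]
slots = pair_adjacent(prog)
```

In this example the first two instructions are fused and the last two are emitted alone, so four instructions occupy three issue slots. The actual hardware constraints are stricter than this check and are left out here.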

          • #6
            Originally posted by FireBurn:
            Kinda pointless telling us about this without explaining what VOPD is or how this will be useful
            VOPD makes it possible to use the doubled shader count that Navi 3 has (but AMD doesn't advertise) for Wave32 instructions

            • #7
              Originally posted by V1tol:
              This PR applies cleanly on top of 24.0.0. Time to see if there is any noticeable difference in performance.
              Any updates?

              • #8
                From some quick research, it seems this is specifically for RDNA3.

                • #9
                  Originally posted by FireBurn:
                  Kinda pointless telling us about this without explaining what VOPD is or how this will be useful
                  It's the technical term for the dual-issue stuff introduced with RDNA3. Essentially, compared to RDNA2 the architecture has twice as many ALUs, but with the catch that the additional ones can only be utilized whenever two arithmetic/logic operations can be combined into a single VOPD instruction.

                  So the best-case scenario is double the throughput/twice the TFLOPS. For games it's probably more like +5% to +20% performance, depending on how much optimization the shader compiler can do.
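
A rough way to see where the "best case is double" figure comes from, as a sketch under a strong simplifying assumption: count only vector ALU issue slots and suppose a fraction `paired_fraction` of the instructions ends up fused two at a time. The function name and the model are hypothetical and ignore memory access, latency, and everything else that limits real games.

```python
def best_case_valu_speedup(paired_fraction):
    """Idealized speedup in vector ALU issue slots.

    N instructions, of which a fraction p is fused in pairs, need
    N * (1 - p / 2) issue slots, so the speedup is 1 / (1 - p / 2).
    """
    if not 0.0 <= paired_fraction <= 1.0:
        raise ValueError("paired_fraction must be between 0 and 1")
    return 1.0 / (1.0 - paired_fraction / 2.0)


all_paired = best_case_valu_speedup(1.0)   # every instruction fused
half_paired = best_case_valu_speedup(0.5)  # half of the instructions fused
none_paired = best_case_valu_speedup(0.0)  # nothing fused
```

Only the fully paired case reaches 2x in this model; with half of the instructions paired it is about 1.33x, and that is before any non-ALU bottleneck is taken into account.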
                  Last edited by kiffmet; 05 February 2024, 03:55 PM.

                  • #10
                    The 2x ALU throughput is automatic with Wave64. Only Wave32 code needs VOPD to make use of it.
