AMD Radeon GCN Offloading Support For OpenMP/OpenACC On The Way For GCC 10

  • AMD Radeon GCN Offloading Support For OpenMP/OpenACC On The Way For GCC 10

    Phoronix: AMD Radeon GCN Offloading Support For OpenMP/OpenACC On The Way For GCC 10

    Merged for the GCC 9 compiler release that launched earlier this year was the preliminary AMD Radeon "GCN" GPU compiler back-end. In that initial release it wasn't particularly useful, as the GPU offloading bits for the popular programming APIs/models weren't supported, so for now it could only run some basic single-threaded programs. But now those interesting GPU offloading bits are pending for GCC 10...

  • #2
    Yeah... it's just that, at least on a desktop, all of this is pretty much useless.

    I can only give this advice to every application developer who wants their software to run on normal desktop machines: don't use these fancy new OpenACC or OpenMP GPU offloading features. It requires compiling the final shader binary for all GPUs if you want to run that application on every machine.

    I have no idea why OpenMP implementations aren't required to add a device-agnostic IR like SPIR-V... and looking at those patches, does it mean that _every_ open source driver has to do the same if they want to support this? And then do the same for LLVM as well?

    ...... this is just stupid.

    • #3
      Originally posted by karolherbst View Post
      It requires compiling the final shader binary for all GPUs if you want to run that application on every machine.
      That seems pretty unworkable to me; the binary would end up massively bloated. CUDA doesn't require this sort of thing, since it uses an intermediate representation. Admittedly AMD's ROCm doesn't use one, and that's one of its main problems. Repeating that mistake seems misguided, especially as this looks like it is aimed at general-purpose applications. What about when new devices are released? Do developers recompile a new version, resulting in an ever larger binary?
      Originally posted by karolherbst View Post
      I have no idea why OpenMP implementations aren't required to add a device-agnostic IR like SPIR-V.
      A much better idea. Unfortunately, I don't think the work on SPIR-V support is far enough advanced for this to happen just yet.

      • #4
        Originally posted by Madgemade View Post
        That seems pretty unworkable to me; the binary would end up massively bloated. CUDA doesn't require this sort of thing, since it uses an intermediate representation. Admittedly AMD's ROCm doesn't use one, and that's one of its main problems. Repeating that mistake seems misguided, especially as this looks like it is aimed at general-purpose applications. What about when new devices are released? Do developers recompile a new version, resulting in an ever larger binary?

        A much better idea. Unfortunately, I don't think the work on SPIR-V support is far enough advanced for this to happen just yet.
        You can already compile OpenCL kernels to SPIR-V without problems. So OpenMP should be doable as well.

        • #5
          This is mostly interesting for number crunching, like most OpenMP code. When doing stuff like this, you already compile one binary optimized for your target machine. So no problem here.
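
To make the "one binary for your target machine" case concrete, here is a minimal sketch of the kind of number-crunching loop OpenMP offloading is aimed at. The function name `saxpy` is just an illustration, not something from the GCC patches. It assumes a GCC that was built with the AMD GCN offload target; with such a toolchain the build would look roughly like `gcc -fopenmp -foffload=amdgcn-amdhsa saxpy.c`, but treat the exact flags as an assumption and check them against your toolchain. Without offload support (or without `-fopenmp` at all) the same code simply runs on the host.

```c
/*
 * Illustrative only: y[i] = a * x[i] + y[i] for i in [0, n).
 *
 * With an offload-capable OpenMP compiler, the "target" construct asks the
 * runtime to run the loop on an attached device (e.g. a GPU). The map()
 * clauses describe the data movement: x is copied to the device, y is
 * copied to the device and back. If no device is available, the loop
 * falls back to running on the host.
 */
void saxpy(int n, float a, const float *x, float *y)
{
    #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The point relevant to this thread is that the device code for the loop is generated at build time for whichever GPU architecture the compiler was told to target, which is fine when you build for one known machine and awkward when you ship to arbitrary desktops.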

          • #6
            Originally posted by karolherbst View Post
            You can already compile OpenCL kernels to SPIR-V without problems.
            True, but you can't run them on AMD or Nvidia hardware because neither has OpenCL 2.2 support. Only the latest GCN GPUs even have 2.0; the first few generations lost all OpenCL support when fglrx died.
            I expect this explains why they took the approach they did. I think there is work being done (might be finished) on allowing OpenCL SPIR-V to run on the Vulkan runtime. That would allow this approach to work because it bypasses the unimplemented OpenCL runtime itself.

            • #7
              Originally posted by Madgemade View Post
              True, but you can't run them on AMD or Nvidia hardware because neither has OpenCL 2.2 support. Only the latest GCN GPUs even have 2.0; the first few generations lost all OpenCL support when fglrx died.
              Supporting SPIR-V is trivial. We already do so in Mesa for essentially every Gallium driver (even though it's not upstream yet and not all drivers have the CL boilerplate code done).

              Originally posted by Madgemade View Post
              I expect this explains why they took the approach they did. I think there is work being done (might be finished) on allowing OpenCL SPIR-V to run on the Vulkan runtime. That would allow this approach to work because it bypasses the unimplemented OpenCL runtime itself.
              The biggest problem with that is that Vulkan still lacks some features needed to do so correctly, although I don't think the missing ones are all that important. In any case, a CL runtime is quite small compared to OpenGL or Vulkan.
              The biggest part of CL is the compiler, but for that one could just use clang and get LLVM IR or SPIR-V out of it.

              • #8
                Originally posted by karolherbst View Post
                don't use these fancy new OpenACC or OpenMP GPU offloading features. It requires compiling the final shader binary for all GPUs if you want to run that application on every machine.

                For 99% of the developers who are going to use this tech, "compile the shader for all GPUs on every machine" is going to translate into "compile a single shader binary for the exact architecture present on the HPC system on which I intend to run my number crunching".

                Just saying.




                • #9
                  Depends on the user. If you build for your target machines, all the rest of the GPUs in the world don’t matter. Think of large compute installations running code that doesn’t exist anywhere else in the world.

                  Originally posted by karolherbst View Post
                  Yeah... it's just that, at least on a desktop, all of this is pretty much useless.

                  I can only give this advice to every application developer who wants their software to run on normal desktop machines: don't use these fancy new OpenACC or OpenMP GPU offloading features. It requires compiling the final shader binary for all GPUs if you want to run that application on every machine.

                  I have no idea why OpenMP implementations aren't required to add a device-agnostic IR like SPIR-V... and looking at those patches, does it mean that _every_ open source driver has to do the same if they want to support this? And then do the same for LLVM as well?

                  ...... this is just stupid.

                  • #10
                    Yep. People completely missed the market this is intended for.

                    Originally posted by DrYak View Post


                    For 99% of the developers who are going to use this tech, "compile the shader for all GPUs on every machine" is going to translate into "compile a single shader binary for the exact architecture present on the HPC system on which I intend to run my number crunching".

                    Just saying.


