The State Of ROCm For HPC In Early 2021 With CUDA Porting Via HIP, Rewriting With OpenMP

  • The State Of ROCm For HPC In Early 2021 With CUDA Porting Via HIP, Rewriting With OpenMP

    Phoronix: The State Of ROCm For HPC In Early 2021 With CUDA Porting Via HIP, Rewriting With OpenMP

    Earlier this month at the virtual FOSDEM 2021 conference there was an interesting presentation on how European developers are preparing for AMD-powered supercomputers and beginning to figure out the best approaches for converting existing NVIDIA CUDA GPU code to run on Radeon GPUs, as well as whether writing new GPU-focused code with OpenMP device offload is worthwhile...

  • #2
    Any idea how this 2% overhead was estimated?

    Was it a hip-ified codebase compiled for CUDA? If so, I'm a little disappointed, since I expected it to be only macro magic that compiles down to CUDA.

    Or did they compare hip-ified code vs. code written natively for AMD GPUs? If so, I would be interested in what they used for programming those GPUs natively.


    • #3
      All good for HPC guys with all the devops resources they have to get ROCm working, but for a simple Joe like me the fact that the latest and much vaunted ROCm 4 doesn't work "out of the box" on kernel 5.8 (Ubuntu 20.04 LTS) is just ridiculous. AMD expects users to downgrade their kernel in order for the standard install not to fail miserably, on what is probably the most widely installed Linux distro by far. If it doesn't work yet, why release it? I appreciate that AMD is trying hard here on their compute stack but this is the kind of frustrating shoot-both-feet-twice mistake that is all too common from AMD, and that just gives CUDA a gale-force tailwind.
      Last edited by vegabook; 21 February 2021, 03:08 PM.


      • #4
        Originally posted by oleid View Post
        Any idea how this 2% overhead was estimated?

        Was it a hip-ified codebase compiled for CUDA? If so, I'm a little disappointed, since I expected it to be only macro magic that compiles down to CUDA.

        Or did they compare hip-ified code vs. code written natively for AMD GPUs? If so, I would be interested in what they used for programming those GPUs natively.
        Hi, all the tests took place on NVIDIA V100 GPUs, as we do not yet have access to AMD hardware (this is mentioned in the slides, in case you missed it). The original code is CUDA; we then hipify it, so all the CUDA calls become HIP calls, while any OpenMP offload code remains as it is and is just linked. So we use the same hardware for both versions. Of course, we need to test many more applications; so far we have only a few with this overhead. This does not mean that the AMD hardware has overhead, of course.
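
To make the "hipify" step described above more concrete, here is a minimal Python sketch of the kind of source-to-source renaming involved. It is a toy, not the real hipify tool: the function name `toy_hipify`, the small rename table, and the sample `saxpy` snippet are all made up for illustration, and only a handful of well-known CUDA runtime names and their HIP counterparts are covered.

```python
import re

# Illustrative subset of CUDA -> HIP renames. This table is an assumption
# made for this sketch; the real hipify tools cover far more of the API.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaError_t": "hipError_t",
    "cudaSuccess": "hipSuccess",
}

# Match whole identifiers only, so a name missing from the table
# (for example cudaMemcpyAsync) is left alone instead of half-renamed.
_NAMES = sorted(CUDA_TO_HIP, key=len, reverse=True)
_PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(name) for name in _NAMES) + r")(?!\w)"
)


def toy_hipify(source: str) -> str:
    """Rename the CUDA identifiers listed in CUDA_TO_HIP to their HIP names."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


# Hypothetical CUDA host code, used only as input text for the sketch.
cuda_src = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
saxpy<<<blocks, threads>>>(n, a, d_x, d_y);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

print(toy_hipify(cuda_src))
```

In this sketch only the names in the table change; the kernel launch line and everything else pass through untouched, and unknown `cuda*` identifiers are deliberately left as they are so that they remain visible for manual porting.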


        • #5
          There is a typo but there is a YouTube video so I can't report it but I remember it said "as as"...



          • #6
            Originally posted by vegabook View Post
            All good for HPC guys with all the devops resources they have to get ROCm working, but for a simple Joe like me the fact that the latest and much vaunted ROCm 4 doesn't work "out of the box" on kernel 5.8 (Ubuntu 20.04 LTS) is just ridiculous.
            You could upgrade as well. Kernel 5.9 should be sufficient. At least, I used to use that kernel with everything mainline, i.e. the DKMS stuff was not needed.


            • #7
              Does anyone know of a SPICE simulator that's already been converted this way? I've read about a few CUDA SPICEs, but there aren't even that many of those.


              • #8
                Originally posted by vegabook View Post
                All good for HPC guys with all the devops resources they have to get ROCm working, but for a simple Joe like me the fact that the latest and much vaunted ROCm 4 doesn't work "out of the box" on kernel 5.8 (Ubuntu 20.04 LTS) is just ridiculous. AMD expects users to downgrade their kernel in order for the standard install not to fail miserably, on what is probably the most widely installed Linux distro by far. If it doesn't work yet, why release it? I appreciate that AMD is trying hard here on their compute stack but this is the kind of frustrating shoot-both-feet-twice mistake that is all too common from AMD, and that just gives CUDA a gale-force tailwind.
                True, this is what annoys me the most!
                Compared to CUDA, I had problems with everything, the distro, the kernel, the driver.
                AMD needs to fix this crap as soon as possible!
                I don't even know why people say this is open source software, because I've never had so many compatibility problems with open source software.


                • #9
                  Originally posted by Danny3 View Post

                  True, this is what annoys me the most!
                  Compared to CUDA, I had problems with everything, the distro, the kernel, the driver.
                  AMD needs to fix this crap as soon as possible!
                  I don't even know why people say this is open source software, because I've never had so many compatibility problems with open source software.
                  I'm going to speculate. AMD has tried to throw so much jelly at the wall with ROCm, and the stack is so complex from trying to do so much (HIP, OpenCL 2.0, CUDA translation layer, ML libraries, OpenMP, yada yada yada), that they've set themselves an almost impossible task to keep this enormous tangled herd of behemoths up to date. This project feels directionless or worse: run by the marketing department and not the programmers. They're constantly in "oh sh_i_t we need this too ship it now 'cos Sales said so" mode instead of doing a few things really well. How about just doing wgpu and OpenCL perfectly, or maybe Vulkan Compute. Imagine the following conversation between AMD and Google, which I am 100% sure is happening:

                  AMD: "Please support ROCm"
                  Goog: "Sure we'd love to! We don't like Nvidia's monopoly either"
                  AMD: "We've just launched ROCm 4.0!"
                  Goog: "Cool!! [1 week later....] okay I've ported XLA to it but you're asking all my users to downgrade their kernel"
                  AMD: "oh...."
                  Goog: "F-off until you've fixed this"

                  Rinse and repeat until Google gets sick of you.

                  For avoidance of doubt: personally I really want AMD to succeed.
                  Last edited by vegabook; 21 February 2021, 08:01 PM.


                  • #10
                    In case it helps, the "limited to an older kernel" issue only applies if you are using our DKMS kernel driver package.

                    The kernel source code goes upstream quite aggressively (with the exception of a couple of non-upstreamable bits like RDMA) and the binary packages are organized so that you can either install userspace only (with a newer kernel) or userspace and kernel drivers (with an enterprise distro's older kernel).

                    If you have a newer kernel that the DKMS package doesn't support, install only the rocm-dev meta-package, although going to the newest easily obtainable kernel is a good idea in that case.

                    https://rocmdocs.amd.com/en/latest/I...r-AMD-GPU.html

                    This seems to have disappeared from the top level install instructions again - I'll go find out what happened.

                    Another option is to build everything from source, but last time I looked it was still a bit clunky until the build frameworks get cleaned up and harmonized a bit more.

                    Most of our major customers build from source, by the way.

                    Radeon Open eCosystem (ROCm)
                    Minor nitpick - I think it's actually Radeon Open Compute platforM.
                    Last edited by bridgman; 21 February 2021, 08:30 PM.
