**karolherbst** · 03 August 2019, 04:21 PM

Yeah... it's just that, at least for a desktop, all of that is just super useless.

I can only give this advice to every application developer who wants their software to run on normal desktop machines: don't use this fancy new OpenACC or OpenMP GPU offloading stuff. It requires compiling the final shader binary for every GPU if you want to run the application on every machine.

I have no idea why an OpenMP implementation isn't required to add a device-agnostic IR like SPIR-V... and looking at those patches, does it mean that _every_ open source driver has to do the same if they want to support that? And then do the same for LLVM as well?

...... this is just stupid.
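For readers who haven't seen it, the kind of OpenMP GPU offloading being discussed looks roughly like the sketch below. It is a minimal illustration, not code from the GCC patches: the function name `saxpy` and its parameters are made up for the example. The `target` construct asks the compiler to generate device code for the loop, and that generated device code is the per-GPU binary the complaint above is about.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * With an offload-capable compiler and -fopenmp, the loop may be compiled
 * for and run on a GPU. Without OpenMP support the pragma is ignored and
 * the loop simply runs on the host. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* map(to: ...) copies x to the device; map(tofrom: ...) copies y
     * to the device before the loop and back to the host afterwards. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Which GPU architectures end up in the resulting executable depends on how the toolchain was built and invoked, so treat this as a sketch of the source-level model only.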

**Madgemade** · 03 August 2019, 04:37 PM

Originally posted by karolherbst

It requires compiling the final shader binary for every GPU if you want to run the application on every machine.

That seems pretty unworkable to me; the binary would end up massively bloated. CUDA doesn't require this sort of thing, since it uses an intermediate representation. Admittedly AMD's ROCm doesn't use one, and that's one of its main problems. Repeating that mistake seems misguided, especially as this looks like it is aiming for use in general-purpose applications. What about when new devices are released? Do developers recompile a new version, resulting in an ever larger binary?

Originally posted by karolherbst

I have no idea why an OpenMP implementation isn't required to add a device-agnostic IR like SPIR-V.

A much better idea. Unfortunately, I don't think the work on SPIR-V support is far enough advanced for this to happen just yet.
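The trade-off described above (one native image per GPU versus a shared intermediate) can be sketched as a loader lookup. This is a hypothetical model, not how GCC, ROCm, or CUDA actually lay out their binaries: the struct names, the `select_image` function, and the architecture strings are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One precompiled device image inside a hypothetical "fat" binary. */
struct device_image {
    const char *arch;   /* architecture name; values are illustrative only */
    const void *code;
    size_t size;
};

struct fat_binary {
    const struct device_image *images;
    size_t image_count;
    const void *ir;     /* optional device-agnostic IR, NULL if absent */
    size_t ir_size;
};

enum load_result { LOAD_NATIVE, LOAD_JIT_FROM_IR, LOAD_UNSUPPORTED };

/* Pick a native image for gpu_arch if one was shipped. Otherwise fall back
 * to the IR (which a driver could compile at run time) when it is present. */
enum load_result select_image(const struct fat_binary *bin,
                              const char *gpu_arch,
                              const struct device_image **out)
{
    for (size_t i = 0; i < bin->image_count; i++) {
        if (strcmp(bin->images[i].arch, gpu_arch) == 0) {
            *out = &bin->images[i];
            return LOAD_NATIVE;
        }
    }
    *out = NULL;
    return bin->ir != NULL ? LOAD_JIT_FROM_IR : LOAD_UNSUPPORTED;
}
```

In this model, a binary without IR has to grow by one image for every new architecture it wants to support, while a binary that also carries IR can still run on a GPU released after it was built, provided the driver can compile that IR.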

**karolherbst** · 03 August 2019, 04:53 PM

Originally posted by Madgemade

That seems pretty unworkable to me; the binary would end up massively bloated. CUDA doesn't require this sort of thing, since it uses an intermediate representation. Admittedly AMD's ROCm doesn't use one, and that's one of its main problems. Repeating that mistake seems misguided, especially as this looks like it is aiming for use in general-purpose applications. What about when new devices are released? Do developers recompile a new version, resulting in an ever larger binary?

A much better idea. Unfortunately, I don't think the work on SPIR-V support is far enough advanced for this to happen just yet.

You can already compile OpenCL kernels to SPIR-V without problems. So OpenMP should be doable as well.

**oleid** · 03 August 2019, 05:04 PM

This is mostly interesting for number crunching, like most OpenMP code. When doing stuff like this, you already compile one binary optimized for your target machine. So no problem here.

**Madgemade** · 04 August 2019, 02:36 AM

Originally posted by karolherbst

You can already compile OpenCL kernels to SPIR-V without problems.

True, but you can't run them on AMD or Nvidia hardware because they don't have OpenCL 2.2 support. Only the latest GCN GPUs even have 2.0. The first few generations lost all OpenCL support when fglrx died.
I expect this explains why they took the approach they did. I think there is work being done (might be finished) on allowing OpenCL SPIR-V to run on the Vulkan runtime. That would allow this approach to work because it bypasses the unimplemented OpenCL runtime itself.

**karolherbst** · 04 August 2019, 05:30 AM

Originally posted by Madgemade

True, but you can't run them on AMD or Nvidia hardware because they don't have OpenCL 2.2 support. Only the latest GCN GPUs even have 2.0. The first few generations lost all OpenCL support when fglrx died.

Supporting SPIR-V is trivial. We already do so in Mesa for essentially every Gallium driver (even though it's not upstream and not all drivers have the CL boilerplate code done yet).

Originally posted by Madgemade

I expect this explains why they took the approach they did. I think there is work being done (might be finished) on allowing OpenCL SPIR-V to run on the Vulkan runtime. That would allow this approach to work because it bypasses the unimplemented OpenCL runtime itself.

The biggest problem with that is that Vulkan still lacks some features needed to do so correctly, although I think the missing features aren't all that important. Anyway, a CL runtime is quite small compared to OpenGL or Vulkan.
The biggest thing with CL is the compiler, but for that one could just use clang and get LLVM IR or SPIR-V out of it.

**DrYak** · 04 August 2019, 08:16 AM

Originally posted by karolherbst

don't use this fancy new OpenACC or OpenMP GPU offloading stuff. It requires compiling the final shader binary for every GPU if you want to run the application on every machine.

For 99% of the developers who are going to use this tech, "compile the shader for all GPUs on every machine" is going to translate into "compile a single shader binary for the exact architecture present on the HPC system on which I intend to run my number crunching".

Just saying.

**wizard69** · 04 August 2019, 01:49 PM

Depends upon the user. If you build for your target machines, all the rest of the GPUs in the world don’t matter. Think of large compute installations running code that doesn’t exist anywhere else in the world.

Originally posted by karolherbst

Yeah... it's just that, at least for a desktop, all of that is just super useless.

I can only give this advice to every application developer who wants their software to run on normal desktop machines: don't use this fancy new OpenACC or OpenMP GPU offloading stuff. It requires compiling the final shader binary for every GPU if you want to run the application on every machine.

I have no idea why an OpenMP implementation isn't required to add a device-agnostic IR like SPIR-V... and looking at those patches, does it mean that _every_ open source driver has to do the same if they want to support that? And then do the same for LLVM as well?

...... this is just stupid.

**wizard69** · 04 August 2019, 01:52 PM

Yep. People missed the market this is intended for completely.

Originally posted by DrYak

For 99% of the developers who are going to use this tech, "compile the shader for all GPUs on every machine" is going to translate into "compile a single shader binary for the exact architecture present on the HPC system on which I intend to run my number crunching".

Just saying.

AMD Radeon GCN Offloading Support For OpenMP/OpenACC On The Way For GCC 10