**tildearrow** · 10 December 2020, 08:41 PM

You mean a CPU-based HIP implementation?

**duby229** · 10 December 2020, 08:50 PM

Originally posted by tildearrow

You mean a CPU-based HIP implementation?

Yeah, I wasn't gonna say it, but since you did...

It sounds like AMD provided a CPU that can run some HIP implementation... I know that's not what was meant tho...

**coder** · 10 December 2020, 09:24 PM

I'm not clear on whether cgroups or some other mechanism exists to do this, but certainly pthreads is lacking any way to tell the kernel which threads are communicating with each other and should therefore be scheduled on the same NUMA node. You need that for efficient scheduling of OpenCL workgroups, or whatever the CUDA/HIP equivalent is. Without it, CPUs will be at even more of a disadvantage to GPUs than their raw computational disparity would suggest.

And while I'm ranting about threading issues, Linux really needs a way for the kernel to (voluntarily) manage threadpools within a process. This would avoid the scenario I'm experiencing, where multiple userspace libraries each have their own threadpool and each decides it should run one thread per core. On top of that, I have many such processes running at the same time, so they're almost guaranteed to be scheduled poorly. I'm not too sure, but I think Apple (mostly) solves this with GCD.
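To make the oversubscription scenario concrete, here is a minimal userspace sketch, not the kernel-managed mechanism being asked for. The `Library` class, its names, and the `max_threads` helper are hypothetical; the only assumption is that each library defaults to one worker thread per core unless the application hands it a shared pool.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CORES = os.cpu_count() or 1


class Library:
    """Hypothetical library that runs its work on a thread pool."""

    def __init__(self, name, pool=None):
        self.name = name
        # With no pool injected, do what many libraries do: one thread per core.
        self.pool = pool if pool is not None else ThreadPoolExecutor(max_workers=CORES)

    def sum_squares(self, values):
        # Fan the work out over whichever pool this library was given.
        return sum(self.pool.map(lambda v: v * v, values))


def max_threads(libraries):
    """Upper bound on worker threads: CORES for each distinct pool in use."""
    return CORES * len({id(lib.pool) for lib in libraries})


names = ("blas", "codec", "solver")

# Each library owns a pool: up to 3 * CORES workers in this one process.
private = [Library(n) for n in names]

# The application injects one pool: at most CORES workers in total.
shared_pool = ThreadPoolExecutor(max_workers=CORES)
shared = [Library(n, shared_pool) for n in names]

print(max_threads(private), max_threads(shared))
```

Sharing a pool only works if every library accepts one, which is the coordination problem described above; a kernel-managed or system-wide pool would remove the need for each library to cooperate.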

**carewolf** · 11 December 2020, 03:43 AM

Damn HIPsters

**boboviz** · 12 December 2020, 04:55 AM

Waiting for SYCL 2.0, ROCm 4 and full support for OpenCL 3.0 (the important part of the new OpenCL is C++)

**bridgman** · 14 December 2020, 01:17 AM

Originally posted by boboviz

Waiting for SYCL 2.0, ROCm 4 and full support for OpenCL 3.0 (the important part of the new OpenCL is C++)

Wasn't C++ more or less removed from OpenCL 3.0 and replaced with the offline "C++ for OpenCL" compiler?

**coder** · 16 December 2020, 06:51 AM

Originally posted by set135

see manpage cpuset(7) for some possible hints...

Thanks, but that's not what I'm talking about. I mean threads within a process.

Also, I don't want to explicitly map threads to NUMA nodes -- I just want to tell the kernel: "this group of threads intercommunicates heavily" and let it manage where they run.

Again, look at OpenCL workgroups for an example of this. OpenCL got this right a dozen years ago -- Linux needs to catch up.
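For contrast, this is roughly what the explicit mapping available today looks like: each thread in a group pins itself to the same CPU set. It is a Linux-only sketch; `run_group_on` is a hypothetical helper, and `node_cpus` is just a stand-in for one NUMA node's CPUs rather than a set read from the real topology. A "these threads intercommunicate heavily" hint would replace the hard-coded CPU set with a decision made by the kernel.

```python
import os
import threading


def run_group_on(cpus, tasks):
    """Run each task in its own thread, all pinned to the same CPU set."""
    seen = [None] * len(tasks)

    def worker(index, task):
        # pid 0 targets the calling thread in sched_setaffinity(2), so this
        # changes only this worker, not the rest of the process.
        os.sched_setaffinity(0, cpus)
        task()
        seen[index] = os.sched_getaffinity(0)

    threads = [
        threading.Thread(target=worker, args=(i, task))
        for i, task in enumerate(tasks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen


# CPUs this process may use; the first two stand in for one NUMA node.
allowed = sorted(os.sched_getaffinity(0))
node_cpus = set(allowed[:2])

masks = run_group_on(node_cpus, [lambda: None, lambda: None])
print(masks)
```

The caller has to pick the CPU set up front and keep it correct as load changes, which is exactly the burden the post argues should sit with the scheduler.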

AMD Provides A CPU-Based HIP Implementation For When Lacking A GPU