Blender's AMDGPU-PRO OpenCL Performance Is Crazy Slow Compared To NVIDIA CUDA
Earlier this week I posted a number of NVIDIA CUDA Blender Cycles render benchmarks from various green GPUs. Here are some tests now using Blender's OpenCL support on AMD Radeon hardware with the latest AMDGPU-PRO Linux driver.
Comparing Blender's CUDA support to the OpenCL back-end, though, is basically a joke. Many Phoronix readers had commented on my earlier CUDA Blender articles that it's only the NVIDIA compute API that Blender developers really care about, that their OpenCL back-end "sucks", etc. Not until carrying out these tests myself did I really understand and believe what they meant.
These AMD OpenCL Blender benchmarks are indeed much slower than the NVIDIA CUDA numbers when running the same tests! I'm inclined to believe it's a shortcoming of the Blender OpenCL back-end, based upon what some Phoronix readers shared previously and because I hadn't seen such poor performance in my other (non-Blender) OpenCL AMDGPU-PRO benchmarks. Additionally, in the results I gathered from an RX 480, R9 Fury, and R9 285, these Blender OpenCL benchmarks all ran at about the same speed... No scaling at all, so there is likely a bottleneck within the Blender code; again, I hadn't seen this behavior on AMDGPU-PRO with other trusted OpenCL tests. Related, you may want to see CUDA vs. OpenCL GPGPU Performance On NVIDIA's Pascal.
600+ seconds to render the BMW Blender scene with AMDGPU-PRO OpenCL... The NVIDIA TITAN X can render that scene in 178 seconds while the slowest NVIDIA card I tested could render it in 376 seconds, about half the time of these OpenCL results.
1620 seconds for the Classroom scene with OpenCL... As little as 433 seconds with CUDA.
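For a rough sense of the gap, the render times quoted above can be turned into slowdown factors. This is only a sketch: `slowdown` is a made-up helper name for illustration, and the BMW OpenCL time is treated as exactly 600 seconds even though the real figure was somewhat higher, so the BMW ratios are lower bounds.

```python
# Render times in seconds, as quoted in this article.
# Assumption: the BMW OpenCL result is only given as "600+", so 600 is used
# as a lower bound and the BMW ratios understate the real gap.
BMW_OPENCL = 600
BMW_CUDA_TITAN_X = 178
BMW_CUDA_SLOWEST = 376
CLASSROOM_OPENCL = 1620
CLASSROOM_CUDA_BEST = 433


def slowdown(opencl_seconds, cuda_seconds):
    """Return how many times longer the OpenCL render took (hypothetical helper)."""
    return opencl_seconds / cuda_seconds


if __name__ == "__main__":
    print(f"BMW vs. TITAN X: at least {slowdown(BMW_OPENCL, BMW_CUDA_TITAN_X):.2f}x slower")
    print(f"BMW vs. slowest CUDA card: at least {slowdown(BMW_OPENCL, BMW_CUDA_SLOWEST):.2f}x slower")
    print(f"Classroom vs. best CUDA result: {slowdown(CLASSROOM_OPENCL, CLASSROOM_CUDA_BEST):.2f}x slower")
```

With these inputs the Classroom scene works out to roughly 3.7 times slower on OpenCL than the best CUDA result, and the BMW scene to at least 3.4 times slower than the TITAN X.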
That's enough said. If interested, you can see more of this data via this OpenBenchmarking.org result file, or via this OpenBenchmarking.org result merge to see the CUDA and OpenCL numbers on the same page. Simply put, Blender's CUDA support is in much better shape than the OpenCL code.