RadeonSI & Intel Both Get Patches For Boosting Compute Shader Performance
Both the Intel i965 and AMD RadeonSI drivers within Mesa have seen separate work over the past day to boost the compute shader performance of these open-source OpenGL drivers.
The RadeonSI patch enables scratch coalescing when used in conjunction with LLVM 3.9 SVN code. The patch by Marek Olšák notes, "This makes one particular compute shader 8x faster."
Unfortunately, neither the details of that particular compute shader nor results from any popular games relying upon CS were shared.
Separately, the Intel Mesa driver also received a commit that boosts its compute shader performance. The Intel patch is about using the correct number of threads per compute shader. It turns out that until now the driver was using the wrong count for the number of threads available on the GPU.
Now that the driver can use all the threads available for compute shaders, CS performance is a heck of a lot faster! On Skylake graphics, performance is 1.2x to 1.7x faster for Unreal Engine's Elemental Demo and up to 3.7x faster in other CS-heavy games/demos, depending upon the GPU. Broadwell, Haswell, and Ivy Bridge have also seen significant performance improvements, particularly in Unreal Engine 4 testing. Details are in that commit.