LCZero Chess Engine Performance With OpenCL vs. CUDA + cuDNN vs. FP16 With Tensor Cores

Written by Michael Larabel in NVIDIA on 14 January 2019 at 03:33 AM EST.
A Phoronix reader pointed out LCZero (Leela Chess Zero) a few days ago as an interesting chess engine that is powered by neural networks and supports BLAS, OpenCL, and NVIDIA CUDA+cuDNN back-ends. Particularly with its FP16 cuDNN support, this chess engine can be very fast on NVIDIA's latest Turing GPUs with tensor cores.

LCZero's build process is sane across its different back-ends, and the program turned out to be benchmark-friendly and to meet my requirements, so it is now available via the Phoronix Test Suite by simply running phoronix-test-suite benchmark lczero (granted, the back-end support will vary depending upon your hardware/driver support). More details are over on OpenBenchmarking.org.
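For scripted runs, the command above can be wrapped in a few lines of Python. This is only a minimal sketch: the command string is taken from the article, while the helper names build_benchmark_command and run_benchmark are hypothetical, and the wrapper simply assumes the phoronix-test-suite executable is on the PATH.

```python
import shutil
import subprocess

# Command name as given in the article; assumed to be installed and on PATH.
PTS_BINARY = "phoronix-test-suite"


def build_benchmark_command(test_name="lczero"):
    """Return the argument list for 'phoronix-test-suite benchmark <test>'."""
    return [PTS_BINARY, "benchmark", test_name]


def run_benchmark(test_name="lczero"):
    """Run the benchmark if the Phoronix Test Suite is installed.

    Returns the process exit code, or None when the executable is not found.
    The child process inherits the terminal, so any prompts it shows
    are answered interactively as usual.
    """
    if shutil.which(PTS_BINARY) is None:
        return None
    completed = subprocess.run(build_benchmark_command(test_name), check=False)
    return completed.returncode
```

Calling run_benchmark() starts the same run as typing the command by hand; which back-ends actually get exercised still depends on the hardware and driver support of the machine.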

Given its back-end coverage, I set out this weekend to test the various Maxwell/Pascal/Turing GPUs I had available. Here are those initial numbers. Comparison tests against Radeon GPUs with OpenCL will be coming in the next few days.

With the OpenCL back-end for LCZero, the GeForce RTX Turing GPUs were already performing quite well in relation to the GeForce 900 Maxwell and GeForce 1000 Pascal graphics cards.

When switching over to the CUDA + cuDNN back-end, the performance for all of the GPUs at least doubled in comparison to the OpenCL back-end. In the case of the TITAN RTX, its performance was 2.35x the OpenCL back-end.

The CUDA FP16 back-end was only working for the Turing GPUs, but even there the data speaks volumes. With this FP16 support able to utilize Turing's tensor cores, the TITAN RTX was 2.2x the speed of the conventional CUDA back-end. In the case of the RTX 2060, the performance was nearly 2.9x the speed of the standard CUDA back-end or 6.7x in relation to the OpenCL back-end.
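The quoted multipliers can be chained to fill in the ratios the article does not state directly. The short sketch below uses only the rounded figures from the text, so its results are approximations derived from those figures and not additional measurements. The variable and function names are made up for illustration.

```python
# Speedup factors quoted in the article (rounded figures).
TITAN_RTX_CUDA_OVER_OPENCL = 2.35
TITAN_RTX_FP16_OVER_CUDA = 2.2
RTX_2060_FP16_OVER_CUDA = 2.9    # "nearly 2.9x" in the text
RTX_2060_FP16_OVER_OPENCL = 6.7


def chain(*factors):
    """Multiply successive speedup factors into one overall factor."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


# Implied by the quoted figures, not stated in the article:
# FP16 vs. OpenCL on the TITAN RTX is the product of the two steps.
titan_fp16_over_opencl = chain(TITAN_RTX_CUDA_OVER_OPENCL,
                               TITAN_RTX_FP16_OVER_CUDA)
# CUDA vs. OpenCL on the RTX 2060 is the overall gain divided by the FP16 step.
rtx2060_cuda_over_opencl = RTX_2060_FP16_OVER_OPENCL / RTX_2060_FP16_OVER_CUDA

print(f"TITAN RTX, FP16 vs. OpenCL: about {titan_fp16_over_opencl:.2f}x")
print(f"RTX 2060, CUDA vs. OpenCL: about {rtx2060_cuda_over_opencl:.2f}x")
```

By this arithmetic the TITAN RTX with FP16 lands at roughly 5.2x its OpenCL result, and the RTX 2060's implied CUDA-over-OpenCL gain of roughly 2.3x agrees with the observation that every GPU at least doubled when moving from OpenCL to CUDA + cuDNN.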

More benchmarks of LCZero will be coming in future articles on Phoronix now that this OpenCL/CUDA/BLAS benchmark is available via the Phoronix Test Suite.
About The Author
Michael Larabel

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.