Intel's OpenGL Mesa Driver To Better Handle Recovery In Case Of GPU Hangs

Written by Michael Larabel in Intel on 15 February 2019 at 02:34 AM EST.
It's sure been a busy week in the Intel open-source graphics driver space... The latest improvement is a patch series providing better context restoration in the case of GPU hangs.

Chris Wilson, who usually works on the Intel DRM kernel driver and has recently been busy on its reset/restore handling, sent out a set of two patches for improving the Intel i965 Mesa driver's behavior following GPU hangs.

He describes the change in behavior quite elegantly, so here is his explanation from the patch series:
The kernel tries to repair a hanging context by restoring the default state (or else we have discovered that the context may be unusably corrupt by the reset). However, this is unsuitable for mesa as it (rightfully) assumes that the context image contains the state it has earlier set and so only emits incremental changes (e.g. it setups up register invariants once at context creation and thenceforth never has to touch those registers again). If we overwrite mesa's state with the default state, that too can lead to nasty hangs. Lose-lose. An alternative is that we tell the kernel not to bother trying to recover from the hang, but report back to userspace that the context is lost immediately and leave it to mesa to do its own recovery.

This patch provides the minimum that should do the trick, but there's probably a lot of stones that need to turned over to find the residual state that isn't being reset (since BRW_NEW_CONTEXT went out of fashion ;) I'd like to get this concept acked so that we can land the corresponding kernel patch and make the uAPI official and start putting it to use.
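To make the trade-off in that explanation concrete, here is a toy Python model of it. It is only a sketch and does not use any real i915 or Mesa API: `ToyContext`, `ToyDriver`, `kernel_reset`, the `recoverable` flag, and the register names and values are all hypothetical stand-ins. It assumes a driver that writes its invariant state once at context creation and only emits incremental changes afterwards.

```python
from dataclasses import dataclass, field

# Hypothetical "default" context image the kernel would restore after a hang.
DEFAULT_STATE = {"invariant_reg": 0, "draw_reg": 0}
# Hypothetical value the driver programs once at context creation.
INVARIANT_VALUE = 0xABCD


@dataclass
class ToyContext:
    """Stand-in for a hardware context image (not a real i915 structure)."""
    regs: dict = field(default_factory=lambda: dict(DEFAULT_STATE))
    lost: bool = False


class ToyDriver:
    """Userspace driver that emits invariants once, then only deltas."""

    def __init__(self):
        self.ctx = ToyContext()
        self._emit_invariants()

    def _emit_invariants(self):
        # Done once per context; the driver assumes it sticks afterwards.
        self.ctx.regs["invariant_reg"] = INVARIANT_VALUE

    def draw(self, value):
        """Emit an incremental change; return True if state is consistent."""
        if self.ctx.lost:
            # Driver-side recovery: new context, re-emit the full state.
            self.ctx = ToyContext()
            self._emit_invariants()
        self.ctx.regs["draw_reg"] = value
        return self.ctx.regs["invariant_reg"] == INVARIANT_VALUE


def kernel_reset(ctx, recoverable):
    """Model the two behaviors after a hang."""
    if recoverable:
        # Kernel silently restores defaults, wiping the driver's invariants.
        ctx.regs = dict(DEFAULT_STATE)
    else:
        # Kernel skips recovery and reports the context as lost.
        ctx.lost = True


if __name__ == "__main__":
    for recoverable in (True, False):
        driver = ToyDriver()
        kernel_reset(driver.ctx, recoverable)
        print(f"recoverable={recoverable}: consistent={driver.draw(1)}")
```

In this model the silent restore leaves the driver drawing against state it never programmed, while the "context lost" report gives it the chance to rebuild everything from scratch, which is the behavior the patch series argues for.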

It's only a few dozen lines of code to make the Intel OpenGL driver more resilient following GPU hangs. This code should make it into the Mesa 19.1 release due out next quarter.