
Originally Posted by
Kayden
We've actually been meaning to support GL_AMD_performance_monitor for some time now. I started on it a while back, but got busy with other things and never finished it.
We do have a few things: intel_gpu_top shows how busy the GPU is, a breakdown by unit, render vs. blitter usage, and basic counters like VS/PS invocations. However, the 'top'-style interface is not always the most useful, since it doesn't let you capture data over time. We also have a small proof-of-concept program called 'chaps' in intel-gpu-tools, which exposes Ironlake's MI_REPORT_PERF counters. Ultimately, the idea is to expose those via GL_AMD_performance_monitor. Sadly, we don't currently have that for Sandybridge/Ivybridge. As Michael mentioned, there are some hoops to jump through, but I think once someone writes the code, that should be doable.
Another new tool is Eric's INTEL_DEBUG=shader_time, which shows a breakdown of how many clock cycles were used by each vertex shader, 8-wide fragment shader, and 16-wide fragment shader. It's extremely useful for determining which shaders are the most expensive (sometimes large shaders are seldom used, while smaller shaders are used extremely frequently, so guessing doesn't always work). That allows us to focus our optimization efforts. (Sadly this is Ivybridge only, since the timestamp register didn't exist prior to that.)
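To illustrate why guessing at the expensive shader can go wrong, here is a minimal Python sketch. It is not the output format of INTEL_DEBUG=shader_time, and it does not talk to the driver. The shader names, per-run costs, and invocation counts are all made up for illustration, and modelling total cost as cycles per run times invocations is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class ShaderSample:
    name: str            # hypothetical shader label
    cycles_per_run: int  # made-up cost of one invocation, in clock cycles
    invocations: int     # made-up number of times the shader ran

    @property
    def total_cycles(self) -> int:
        # Simplified model: total cost = per-run cost * number of runs.
        return self.cycles_per_run * self.invocations


def rank_by_total_cycles(samples):
    """Return (name, total_cycles, share) tuples, most expensive first."""
    grand_total = sum(s.total_cycles for s in samples)
    ranked = sorted(samples, key=lambda s: s.total_cycles, reverse=True)
    return [
        (s.name, s.total_cycles,
         s.total_cycles / grand_total if grand_total else 0.0)
        for s in ranked
    ]


# Illustrative numbers only, not measurements from any real workload.
samples = [
    ShaderSample("big_lighting_fs", cycles_per_run=4000, invocations=1_000),
    ShaderSample("tiny_blit_fs", cycles_per_run=40, invocations=2_000_000),
    ShaderSample("skinning_vs", cycles_per_run=600, invocations=50_000),
]

for name, total, share in rank_by_total_cycles(samples):
    print(f"{name:16s} {total:>12,d} cycles  {share:6.1%}")
```

With these invented numbers, the small but frequently run shader ends up at the top of the list and the large, rarely used one at the bottom, which is the kind of result a measured per-shader breakdown makes visible.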
But overall I agree: we need more performance counters, and we need to expose them to application developers.