INTEL_DEBUG=shader_time reports that almost 0 time is spent in the vertex shader in a microbenchmark (OglTerrainFlyInst). However, the FS is really simple, and the VS is pretty large. I managed to cut roughly 50% of the VS instructions, and that roughly doubled the performance. So clearly the VS is important. Using performance monitoring, it looks like the VS actually takes up ~70% of the time. So, shader_time is just full of lies. It appears that it's undercounting the VS time in Unigine Valley as well. I believe the timestamp register is getting reset in almost every VS, so we just count 0 most of the time. Perhaps it can be improved? Perhaps we need to develop something better?
*facepalm* http://lists.freedesktop.org/archives/mesa-dev/2014-June/061465.html
This should pretty much be fixed by: commit d0575d98fc595dcc17706dc73d1eb461027ca17a Author: Kenneth Graunke <kenneth@whitecape.org> Date: Sat Jun 14 03:53:07 2014 -0700 i965/vec4: Fix dead code elimination for VGRFs of size > 1. When faced with code such as: mov vgrf31.0:UD, 960D mov vgrf31.1:UD, vgrf30.xxxx:UD The dead code eliminator didn't consider reg_offsets, so it decided that the second instruction was writing was writing to the same register as the first one, and eliminated the first one. But they're actually different registers. This fixes INTEL_DEBUG=shader_time for vertex shaders. In the above code, vgrf31.0 represents the offset into the shader_time buffer where the data should be written, and vgrf31.1 represents the actual time data. With a completely undefined offset, results were...unexpected. I think this is probably one of the few cases (maybe only case) where we generate multiple MOVs to a large VGRF. Normally, we just use them as texturing results; the other SEND-from-GRF uses a size 1 VGRF. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org It's at least now giving me data in the same ballpark as the other methods, though I haven't checked if it's entirely the same. Thankfully it was something simple.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.