My game needs to stream everything to the GPU, so all three buffers (ARRAY_BUFFER, ELEMENT_ARRAY_BUFFER, UNIFORM_BUFFER) are updated before drawing. To get usable performance, I map each buffer with GL_MAP_UNSYNCHRONIZED_BIT and update only small parts, ring-buffer style. When a buffer is full, I orphan it with glBufferData(..., NULL, GL_STREAM_DRAW).

However, my GLSL shaders seem to be much slower when using uniform buffer objects. I have two code paths for uniforms: the first uses uniform buffers, the second updates all uniforms with glUniform. Every step below was done once with uniform buffers and once with glUniform.

All of the files mentioned below are stored at: http://markus.members.selfnet.de/i965-ubo/

For profiling, I made apitrace dumps. The qapitrace profiler shows that I am GPU-bottlenecked almost all the time. I also dumped the INTEL_DEBUG=wm,shader_time output of "glretrace -b" for both traces. For completeness, there is also the intel_gpu_top output.

I think the qapitrace output says that all shaders are slower, so it shouldn't be an issue with just one of them. Maybe the optimization fails for UBO uniforms?

My test environment:
Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz
HD4000 GPU
3.0 Mesa 9.2-devel (git-8cabe26)
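For readers unfamiliar with the streaming scheme described above, here is a minimal sketch of the ring-buffer bookkeeping in C. It is not the reporter's code: `stream_ring` and `stream_ring_alloc` are hypothetical names, and the GL calls appear only in comments so that the cursor logic can be compiled without a GL context. It assumes the caller maps the returned range with glMapBufferRange and orphans the buffer whenever the allocator reports a wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: write cursor for one streaming buffer object. */
typedef struct {
    size_t capacity; /* size of the GL buffer store in bytes */
    size_t head;     /* first byte not yet handed out */
} stream_ring;

/*
 * Reserve `size` bytes at an offset aligned to `align` (a power of two).
 * Returns the offset to write at. Sets *orphan to true when the request
 * did not fit in the remaining space and the cursor wrapped to 0; the
 * caller is then expected to orphan the buffer first, e.g.
 *     glBufferData(target, capacity, NULL, GL_STREAM_DRAW);
 * and afterwards map only the reserved range, e.g.
 *     glMapBufferRange(target, offset, size,
 *                      GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
 */
static size_t stream_ring_alloc(stream_ring *r, size_t size, size_t align,
                                bool *orphan)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    assert(size <= r->capacity);

    /* Round the cursor up to the requested alignment. */
    size_t start = (r->head + align - 1) & ~(align - 1);

    *orphan = false;
    if (start + size > r->capacity) {
        /* Out of space: restart at the beginning of a fresh store. */
        start = 0;
        *orphan = true;
    }
    r->head = start + size;
    return start;
}
```

For the UNIFORM_BUFFER target, `align` would typically be the value queried with GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, since offsets passed to glBindBufferRange must be a multiple of it. The unsynchronized map is only safe because each range is written once between orphanings, so the GPU never reads a region that is being overwritten.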
The patch series I just sent out (also available as the "ubo" branch of git://people.freedesktop.org/~anholt/mesa) fixes some rendering failures with your trace on my ivb while improving performance 20%. Unfortunately, your non-ubo trace spewed endless errors about uniform updates (have you checked for GL errors from your app? Did the replay play back cleanly for you?), so I couldn't compare the two side by side.
It is fixed by these patches: http://lists.freedesktop.org/archives/mesa-dev/2013-March/035804.html
commit 4c1fdae0a01b3f92ec03b61aac1d3df500d51fc6
Author: Eric Anholt <eric@anholt.net>
Date:   Wed Mar 6 14:47:22 2013 -0800

    i965/fs: Switch to using sampler LD messages for uniform pull constants.

    When forcing the compiler to always generate pull constants instead of
    push constants (in order to have an easy to use testcase), improves
    performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7).

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>