Bug 60866 - GLSL performance issues for uniform buffer objects
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assigned To: Eric Anholt
Reported: 2013-02-15 01:05 UTC by Markus Wick
Modified: 2013-03-11 19:37 UTC (History)
Description Markus Wick 2013-02-15 01:05:11 UTC
My game needs to stream everything to the GPU, so all three buffers (ARRAY_BUFFER, ELEMENT_ARRAY_BUFFER, UNIFORM_BUFFER) are updated before drawing.
To get usable performance, I map each buffer with GL_MAP_UNSYNCHRONIZED_BIT and update only small parts in a ring-buffer manner. If a buffer is full, I orphan it by calling glBufferData with a NULL data pointer and GL_STREAM_DRAW.
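
The offset bookkeeping behind this streaming scheme could look roughly like the sketch below. It is not taken from the game's source: `stream_ring` and `stream_ring_alloc` are hypothetical names, the alignment parameter is an assumption (for UNIFORM_BUFFER ranges it would typically come from GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT), and the GL calls appear only in comments so the logic can be read without a GL context.

```c
#include <stddef.h>

/* Hypothetical helper, not from the report: tracks the write cursor of a
 * streaming buffer that is filled in a ring-buffer manner. */
typedef struct {
    size_t capacity; /* total size of the GL buffer in bytes */
    size_t head;     /* next free byte */
} stream_ring;

/* Reserve `size` bytes aligned to `align` (assumed to be a power of two,
 * with size <= capacity). Returns the byte offset to write at and sets
 * *orphan to 1 when the buffer is full and must be orphaned first. */
static size_t stream_ring_alloc(stream_ring *r, size_t size, size_t align,
                                int *orphan)
{
    size_t offset = (r->head + align - 1) & ~(align - 1);

    *orphan = 0;
    if (offset + size > r->capacity) {
        /* Full: the caller would orphan the storage here, e.g.
         *   glBufferData(target, capacity, NULL, GL_STREAM_DRAW);
         * and then start writing again from the beginning. */
        offset = 0;
        *orphan = 1;
    }
    r->head = offset + size;

    /* The caller would then map only the reserved range, e.g.
     *   glMapBufferRange(target, offset, size,
     *                    GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
     * so the driver does not wait for draws still reading older ranges. */
    return offset;
}
```

With a 64-byte buffer and 16-byte alignment, two 20-byte reservations land at offsets 0 and 32; a third no longer fits, so it reports that the buffer should be orphaned and restarts at offset 0.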

But my GLSL shaders seem to be much slower when using uniform buffer objects.
I have two code paths for uniforms: the first uses uniform buffers, the second updates all uniforms with glUniform. All of the following steps were done once with uniform buffers and once with glUniform.

All of the following files are stored at: http://markus.members.selfnet.de/i965-ubo/

For profiling, I've made apitrace dumps of both code paths.
The qapitrace profiler shows that I am GPU-bottlenecked almost all the time.
I've also dumped the INTEL_DEBUG=wm,shader_time output of "glretrace -b" for both traces.
For completeness, there is also the intel_gpu_top output.

I think the qapitrace output says that all shaders are slower, so it shouldn't be an issue with just one of them. Maybe the optimization fails for UBO uniforms?

My test environment:
Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz
HD4000 GPU
3.0 Mesa 9.2-devel (git-8cabe26)
Comment 1 Eric Anholt 2013-02-16 06:40:18 UTC
The patch series I just sent out (also available as the "ubo" branch of git://people.freedesktop.org/~anholt/mesa) fixes some rendering failures with your trace on my ivb while improving performance 20%.  Unfortunately, your non-ubo trace spewed endless errors about uniform updates (have you checked for GL errors from your app?  Did the replay play back cleanly for you?), so I couldn't compare the two side by side.
Comment 2 Markus Wick 2013-03-07 07:59:55 UTC
It is fixed by these patches:
Comment 3 Eric Anholt 2013-03-11 19:37:12 UTC
commit 4c1fdae0a01b3f92ec03b61aac1d3df500d51fc6
Author: Eric Anholt <eric@anholt.net>
Date:   Wed Mar 6 14:47:22 2013 -0800

    i965/fs: Switch to using sampler LD messages for uniform pull constants.

    When forcing the compiler to always generate pull constants instead of
    push constants (in order to have an easy to use testcase), improves
    performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7).

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>