Bug 57121 - [snb] corrupted GLSL built-in function results when using Uniform Buffer contents as arguments
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Hardware: All Linux (All)
Importance: medium normal
Assigned To: Eric Anholt
Reported: 2012-11-14 16:27 UTC by Tomasz Kaźmierczak
Modified: 2013-02-22 21:11 UTC

The expected result (41.88 KB, image/png)
2012-11-14 16:27 UTC, Tomasz Kaźmierczak
The actual result (the bug) (136.14 KB, image/png)
2012-11-14 16:28 UTC, Tomasz Kaźmierczak
apitrace file (289.79 KB, application/octet-stream)
2012-11-20 21:58 UTC, Tomasz Kaźmierczak
a small test program (4.48 KB, application/octet-stream)
2012-11-21 14:50 UTC, Tomasz Kaźmierczak
The expected result (7.52 KB, image/png)
2012-11-21 14:51 UTC, Tomasz Kaźmierczak
The actual result (the bug) (16.39 KB, image/png)
2012-11-21 14:51 UTC, Tomasz Kaźmierczak
INTEL_DEBUG=wm (6.00 KB, text/plain)
2013-02-15 07:10 UTC, Markus Wick

Description Tomasz Kaźmierczak 2012-11-14 16:27:00 UTC
When I use a built-in function such as pow() in shader code with uniforms as arguments, everything is OK if the uniform resides in the default uniform block (its value is passed from the client code using the glUniform() function). However, if the uniform is part of a named uniform block (its value comes from a uniform buffer object, UBO), the result of the pow() function is corrupted.

Please see the screenshots attached to the report. In my code I use the pow() function for calculating specular reflection intensity:
float reflectionIntensity = u_SpecularIntensity * pow(reflectionAngle, u_SpecularHardness);
(u_SpecularIntensity is a parameter of the material, as is u_SpecularHardness; reflectionAngle is a local variable)

When both u_SpecularIntensity and u_SpecularHardness are defined inside a named uniform block and their contents come from a UBO, the data corruption can be observed - the result is shown on uniform_from_UBO.png (the undesired effect).
But when u_SpecularHardness is just a "normal" uniform variable and its value is set using the glUniform() function, everything is OK (see the normal_uniform.png file).

It doesn't matter whether the other uniform variable (u_SpecularIntensity) comes from a UBO or not - it never causes any rendering issues. This means that the problem occurs only when a UBO uniform variable is used as an argument of a built-in function (adding, subtracting and multiplying UBO variables seem to work fine).

In order to check whether this is only an issue with pow(), I've also checked normalize(). The easiest way was to normalize a diffuse color (a vec4). The results were similar to those with pow() (and again, when normalizing a diffuse color that is passed to the shader using the glUniform() function, everything is OK).

I've also found out that when I add some value to u_SpecularHardness, store the result in a local variable and use that local variable in pow(), the corruption is gone (simply assigning u_SpecularHardness to a local variable is not enough - it seems that the compiler optimizes out the variable in that case and refers directly to the uniform buffer).
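A minimal sketch of the two cases described above, with the GLSL held in C++ raw strings the way a client might keep it before passing it to glShaderSource(). Only the u_* names, reflectionAngle and the pow() expression come from the report; the block name Material, the local name hardness, the constant names and the 0.0001 offset are assumptions made for illustration.

```cpp
#include <string_view>

// Named uniform block backed by a UBO. "Material" is a hypothetical block name.
constexpr std::string_view kUniformBlock = R"glsl(
layout(std140) uniform Material {
    float u_SpecularIntensity;
    float u_SpecularHardness;
};
)glsl";

// Reported failing case: a UBO member is passed straight into a built-in.
constexpr std::string_view kAffectedStatement = R"glsl(
float reflectionIntensity = u_SpecularIntensity * pow(reflectionAngle, u_SpecularHardness);
)glsl";

// Reported workaround: modify the value, keep it in a local, pass the local.
// The offset is hypothetical (the report only says "some value") and it
// slightly changes the exponent, so this is a diagnostic rather than a fix.
constexpr std::string_view kWorkaroundStatements = R"glsl(
float hardness = u_SpecularHardness + 0.0001;
float reflectionIntensity = u_SpecularIntensity * pow(reflectionAngle, hardness);
)glsl";
```

Swapping the second fragment for the third inside an otherwise unchanged shader would be one way to check whether a given setup shows the same behaviour.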

Where does it happen?

I've noticed this on a system with Intel HD Graphics 3000 (Sandy Bridge) GPU. I'm not sure whether this is specific to Mesa in general or to the graphics driver (lsmod says that the driver in use is the i915), therefore I've selected "Other" as the Mesa component.

On a system with an AMD GPU and the proprietary fglrx driver, the bug doesn't exist, so I'm sure that it's not a problem with my OpenGL client code. I've also ruled out the possibility that it could be caused by improper alignment of data inside the buffer (this happens even if both the uniform buffer and the uniform block contain only one floating point variable, and I use the std140 layout for all of my uniform blocks).
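For reference, the std140 placement rules for scalars and small vectors can be sketched as below. The enum and function names are made up for this illustration, and arrays, matrices and structs (which follow additional rules) are deliberately left out.

```cpp
#include <cstddef>

// Subset of GLSL types covered by this sketch.
enum class Std140Type { Float, Vec2, Vec3, Vec4 };

// Base alignment in bytes under std140.
constexpr std::size_t std140Alignment(Std140Type t) {
    return t == Std140Type::Float ? 4
         : t == Std140Type::Vec2  ? 8
         : 16; // vec3 and vec4 both align to 16 bytes
}

// Bytes actually occupied by the member.
constexpr std::size_t std140Size(Std140Type t) {
    return t == Std140Type::Float ? 4
         : t == Std140Type::Vec2  ? 8
         : t == Std140Type::Vec3  ? 12
         : 16;
}

// Offset of a member of type t when the previous member ended at `cursor`:
// the cursor is rounded up to the member's base alignment.
constexpr std::size_t std140Offset(std::size_t cursor, Std140Type t) {
    return (cursor + std140Alignment(t) - 1) / std140Alignment(t) * std140Alignment(t);
}
```

Under these rules a block holding a single float places it at offset 0, and two consecutive floats land at offsets 0 and 4, which is consistent with the report's claim that padding is not the cause.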

In my GLSL code I use the following two directives at the beginning of all files:
#version 130
#extension GL_ARB_uniform_buffer_object : require

It happens in both vertex and fragment shaders.

Some Mesa information:
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) Sandybridge Mobile 
OpenGL version string: 3.0 Mesa 9.0
OpenGL shading language version string: 1.30

As mentioned above, the driver in use is i915.
Comment 1 Tomasz Kaźmierczak 2012-11-14 16:27:57 UTC
Created attachment 70079
The expected result
Comment 2 Tomasz Kaźmierczak 2012-11-14 16:28:36 UTC
Created attachment 70080
The actual result (the bug)
Comment 3 Eric Anholt 2012-11-20 00:59:12 UTC
Do you have any way to get a small program that shows the issue?

Is the use of pow() required?  How about if you just read the value of one of your uniform values?
Comment 4 Eric Anholt 2012-11-20 00:59:52 UTC
(If you don't have a small program to demo the bug, then using apitrace would be a great way to get a way to show the problem)
Comment 5 Tomasz Kaźmierczak 2012-11-20 21:57:22 UTC
Hi, I'll try to create a small test program for that if I find some time, but for now I'm sending an apitrace file.

The meshes get loaded in frame 137 and the bug is visible then (they get transformed to their proper positions in the next frame - it's just my app's behaviour, something I still haven't fixed).

I didn't mention that, just to not complicate things, but I actually use the pow() function twice - for specular reflections and for gamma correction (the gamma value also comes from a UBO and using it as an argument for pow() magnifies the problem).

It is not required to use pow() function in order to reproduce the bug - you can also use normalize() on a vector that comes from a UBO (eg. on a diffuse color), and probably other built-in functions would behave similarly (haven't checked that). However, I only checked that on uniforms that provide color information, and I guess the problem is best visible then.

Some comments about the trace:
- the trace comes from a program which is a Qt GUI application that uses QGLWidget - don't know whether this has any influence on the bug (eg. the GL context creation or something)

- When you replay the trace, you can notice that some pixels on the meshes flicker; this, of course, doesn't happen when the u_SpecularHardness and u_InverseGamma come from the default uniform block

- I've looked at the state of the uniforms inside apitrace, and their values are weird: u_DiffuseColor and u_SpecularColor are equal to u_LightPosition (this uniform comes from the glUniform() function, not from a UBO); the other material uniforms all have a value of 50, which is equal to u_LightPosition.x; it is similar with u_CameraPosition and the matrices, which all display the same values (all near 0). Generally, only the light-related uniforms have proper values (all the light uniforms come from the default uniform block). With such values not many things would be rendered properly, so apitrace must be showing wrong values for the UBO uniforms.
Comment 6 Tomasz Kaźmierczak 2012-11-20 21:58:28 UTC
Created attachment 70334
apitrace file
Comment 7 Tomasz Kaźmierczak 2012-11-21 14:49:41 UTC
Hi again.

Adding a small test program (one C++ source and two shader sources). To compile it, just run:
$ g++ main.cpp -o gl3test -lX11 -lGL

It draws a triangle whose color is set directly in a fragment shader, and then the gamma correction is applied to the color. The gamma factor comes from a UBO and is an argument to the pow() function used to calculate the gamma correction. I've tried to keep it as simple as possible, so even the vertex transform matrix is hardcoded into the vertex shader.
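As a CPU-side cross-check of what the fragment shader is expected to output, the per-channel pow() can be reproduced in a few lines of C++. This is only a sketch: the names Rgb and applyGammaExponent are hypothetical, and since it is not stated here whether the uniform holds the gamma value or its inverse, the exponent is passed in as-is.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Rgb = std::array<float, 3>;

// CPU reference for a per-channel pow(), as a shader would apply it.
// `exponent` stands in for the value read from the uniform block.
inline Rgb applyGammaExponent(const Rgb& color, float exponent) {
    Rgb out{};
    for (std::size_t i = 0; i < color.size(); ++i) {
        // GLSL leaves pow() undefined for a negative base, so reject it here.
        assert(color[i] >= 0.0f);
        out[i] = std::pow(color[i], exponent);
    }
    return out;
}
```

Comparing such a reference value against a pixel read back from the rendered frame would help separate a wrong uniform value from a wrong pow() result.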

Also attaching new images with the error and the expected result (the expected result has been produced by hardcoding the gamma value into the fragment shader).

If you look into the fragment shader source (simple.frag) you can see one line commented out (line 18). If you use line 18 instead of line 16, then the problem is gone, so this demonstrates that the uniform itself contains the proper value - it's just the pow() function that reads from somewhere else rather than from the uniform (as mentioned earlier, other built-in functions have the same problem, and it happens only for uniforms that come from UBOs).

The source code is based on code copied from here:

I've added three functions: readShader(), linkProgram() and doTests().
Besides that, I've added the #define GL_GLEXT_PROTOTYPES directive at the beginning of the file, so that I didn't have to load the function pointers manually.

The program draws 5 frames (one frame per second) and exits.
Comment 8 Tomasz Kaźmierczak 2012-11-21 14:50:39 UTC
Created attachment 70373
a small test program
Comment 9 Tomasz Kaźmierczak 2012-11-21 14:51:08 UTC
Created attachment 70374
The expected result
Comment 10 Tomasz Kaźmierczak 2012-11-21 14:51:37 UTC
Created attachment 70375
The actual result (the bug)
Comment 11 Tomasz Kaźmierczak 2012-11-29 11:24:29 UTC
I've also tried to check this on Radeon hardware (HD 5450), but apparently uniform buffer objects aren't supported by the radeon driver at all.
Comment 12 Markus Wick 2013-02-15 07:07:44 UTC
I get exactly the same result here on my HD3000 and both with Mesa 9.0 and Mesa 9.2-devel (git-6dbe94c).

But on my HD4000, everything looks fine. So maybe it's only Sandy Bridge related.
Comment 13 Markus Wick 2013-02-15 07:10:24 UTC
Created attachment 74852
INTEL_DEBUG=wm

compiled shader for HD3000 on latest git
Comment 14 Eric Anholt 2013-02-16 06:36:32 UTC
The patch series I just sent out (also available as the "ubo" branch of git://people.freedesktop.org/~anholt/mesa) fixes some rendering failures on my ivb that could also appear on snb plus should fix some additional snb-specific rendering failures.  And it increases performance!  Unfortunately, I don't have snb with me in the car right now so I can't tell whether it fixes your problem at the moment, so give it a shot.
Comment 15 Scott Moreau 2013-02-17 08:16:08 UTC
I might be affected by this bug as well. I uploaded some screenshots from several different configurations of mesa using the Dolphin emulator to demonstrate the problem.

Shows correct rendering with the classic renderer.

Shows GLSL branch of dolphin, with UBOs enabled.
Here you can see the pixelation and lack of fog(?)

I tried anholt's ubo branch, d8d8a5f3b480.
This is also with the GLSL renderer, same results.

This is what happens when disabling UBOs.
Still using the GLSL renderer, but it's much slower.

In a nutshell, disabling GL_ARB_uniform_buffer_object allows for correct GLSL rendering but not as fast.

Running Intel Sandybridge, Ubuntu 12.10 x86_64, kernel 3.5.0-24-generic
Comment 16 Markus Wick 2013-02-18 10:11:56 UTC
Both the bug and the output of INTEL_DEBUG=wm haven't changed.
Comment 17 Eric Anholt 2013-02-20 22:50:09 UTC
Thanks for the short trace, Tomasz.  Found the bug (an assertion fails on it in a debug build of Mesa), made a piglit test, and I'm waiting on this gm45 piglit run to finish before I swap to the snb to fix things.
Comment 18 Tomasz Kaźmierczak 2013-02-21 12:56:32 UTC
Is there any chance that the fix will make it into Mesa 9.1?
Comment 19 Scott Moreau 2013-02-22 00:16:28 UTC
The patch here fixes the problem in dolphin for me http://lists.freedesktop.org/archives/mesa-dev/2013-February/034936.html
Comment 20 Eric Anholt 2013-02-22 19:07:24 UTC
commit 7b0731d940c758ca9c1e883cdea454d8787255c1
Author: Eric Anholt <eric@anholt.net>
Date:   Wed Feb 20 18:00:47 2013 -0800

    i965/fs: Fix broken math on values loaded from uniform buffers on gen6.
Comment 21 Eric Anholt 2013-02-22 19:08:03 UTC
(since it's marked for stable, it'll end up in either 9.1 or 9.1.1)
Comment 22 Tomasz Kaźmierczak 2013-02-22 21:11:36 UTC
ok, thanks again for the fix