Bug 53256 - opt_dead_code_local removing assignments to array elements
Summary: opt_dead_code_local removing assignments to array elements
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: glsl-compiler (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: Ian Romanick
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 67224
  Show dependency treegraph
 
Reported: 2012-08-08 16:09 UTC by Chris Wolfe
Modified: 2013-08-08 00:42 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments

Description Chris Wolfe 2012-08-08 16:09:36 UTC
Believe I've run into a GLSL optimizer bug introduced by "glsl: Improve the local dead code optimization to eliminate unused channels" (05c200b). Have whacked together a quick workaround in Chromium OS, but would also like to get your thoughts about an actual solution. Tested with Chrome using mostly-unpatched Mesa git (82cd9c0 + portage).

Given GLSL source for a WebGL vertex shader, as logged by Mesa:

attribute vec3 aVertexPosition;
uniform mat4 uMVMatrix;
uniform mat4 uPMatrix;
void main(){
vec4 tmp;
(tmp[3] = 1.0);
(tmp.xyz = aVertexPosition);
(gl_Position = ((uPMatrix * uMVMatrix) * tmp));
}

Chromium does some mangling to get that exact source; believe only the 'tmp[3] = ...' and 'tmp.xyz = ...' matter.

The opt_dead_code_local tracing yields:
...
(assign  (x) (array_ref (var_ref tmp) (constant int (3)) )  (var_ref assignment_tmp) ) 
kill assignment_tmp (0x1 - 0xffffffff)
add tmp
current entries
    tmp (0x1)
(declare (temporary ) vec3 assignment_tmp)
(assign  (xyz) (var_ref tmp)  (swiz xyz (swiz xyz (var_ref aVertexPosition) ))) 
looking for tmp.0x7 to remove
tmp 0x1 - 0x1 = 0x0
rewriting:
  (assign  (x) (array_ref (var_ref tmp) (constant int (3)) )  (var_ref assignment_tmp) ) 
add tmp
current entries
    tmp (0x7)
...

Because tmp[3] = 1.0 is encoded as an assign with write_mask (x), and this mask is copied into the assignment_entry, it gets blown away by the assignment to (xyz). Expect the same problem will also happen if the 3 is a variable.

As a quick fix in CrOS, I am only adding dereference_variables to the assignment list <https://gerrit.chromium.org/gerrit/#/c/29531/3/media-libs/mesa/files/8.1-dead-code-local-hack.patch>. That's not a long-term fix, but I think it's safest (and fixes the regression). If there's obvious problems with that hack, please yell.

For a real fix, I think we need to generate entry->available more carefully, and use it rather than entry->ir->write_mask. e.g. set 1<<index for small constant array index and ~0 for variable array index (or just ~0 and get it on the next iteration). Feels like that could allow the elimination to work on vectors and most arrays (possibly records; haven't looked yet). Those extra cases make the creation of the assignment_entry a little complex, so I propose generating 'available' in "Add this instruction to the assignment list" and passing it to the constructor explicitly.

If that sounds good I can plow ahead with a patch. Also no objection to leaving it for Eric or company.
Comment 1 Chris Wolfe 2012-08-09 15:41:30 UTC
Re-aiming at the list, guess the CC isn't automatic. Please see the bug for context.

Summary: GLSL optimizer mangles "vec4 tmp; tmp[3] = 1.0; tmp.xyz = some_3d_vector;"

Walked through the optimization behavior in opt_dead_local_code (and friends) again. Feels like the best fix is to consider assignment to a deference_array of a vector as a write to an unknown component.

That way the assignment will get removed only if there is an overriding whole_variable_written to the vector. If the array index is constant it will get converted to a swizzle (vec_index_to_swizzle), and hopefully become a deference_variable for the next pass (haven't found the code for that yet).
Comment 2 Ian Romanick 2013-01-30 01:20:31 UTC
My current guess is the optimization pass is happening before lowering tmp[3] to tmp.w.  I'll look into it.
Comment 3 Ian Romanick 2013-08-08 00:42:41 UTC
I just committed some piglit tests that I wrote back in January.  Since the ir_triop_vector_insert / ir_binop_vector_extract work, all of the piglit tests pass.  I believe this issue is fixed as of (approximately) a61a0dbe.

If the issue persists, please re-open the bug.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.