Bug 52129 - [Bisected ILK]Piglit spec_ARB_shader_texture_lod_execution_glsl-fs-shadow2DGradARB-01 regressed
[Bisected ILK]Piglit spec_ARB_shader_texture_lod_execution_glsl-fs-shadow2DGr...
Status: VERIFIED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
git
All Linux (All)
: high major
Assigned To: Kenneth Graunke
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-07-16 02:16 UTC by lu hua
Modified: 2012-08-13 03:42 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description lu hua 2012-07-16 02:16:25 UTC
System Environment:
--------------------------
Arch:             x86_64
Platform:         Ironlake
Libdrm:		(master)libdrm-2.4.37-3-g992e2afd59539d169689bf21d78fac8b5cea4e3c
Mesa:		(master)1a06e8454ec714e950bc88882cd985534a18bf1f
Xserver:(master)xorg-server-1.12.0-233-gdae317e7265007b38012244722e3b3a06e904ed5
Xf86_video_intel:(master)2.19.0-449-gb5d6a57f12025aef9850c7d9baa6905f776be971
Libva:		(staging)f12f80371fb534e6bbf248586b3c17c298a31f4e
Libva_intel_driver:(staging)82fa52510a37ab645daaa3bb7091ff5096a20d0b
Kernel:	(drm-intel-next-queued) 1edc2c89df6cc1730cb2329fbecfe041b8dcc2e0

Bug detailed description:
------------------------------
It fails on ironlake with mesa master branch.It doesn't happen on 8.0 branch.

Following cases also fail and have the same bisect commit:
spec_ARB_shader_texture_lod_execution_glsl-fs-shadow2DGradARB-04
spec_ARB_shader_texture_lod_execution_glsl-fs-shadow2DGradARB-07
spec_ARB_shader_texture_lod_execution_glsl-fs-shadow2DGradARB-cumulative

Bisect shows: b0c8d3be73ea777e1fd5870c344afb1d31921411 is the first bad commit

commit b0c8d3be73ea777e1fd5870c344afb1d31921411
Author:     Kenneth Graunke <kenneth@whitecape.org>
AuthorDate: Mon Jul 9 11:26:47 2012 -0700
Commit:     Kenneth Graunke <kenneth@whitecape.org>
CommitDate: Thu Jul 12 10:20:23 2012 -0700

    i965: Add a lowering pass to convert TXD to TXL by computing the LOD.

    Intel hardware doesn't natively support textureGrad with shadow
    comparisons.  So we need to generate code to handle it somehow.

    Based on the equations of page 205 of the OpenGL 3.0 specification,
    it's possible to compute the LOD value that would be selected given the
    gradient values.  Then, we can simply convert the TXD to a TXL.

    Currently, this passes 34/46 of oglconform's shadow-grad subtests;
    four cubemap tests are regressed.  We should investigate this in the
    future.

    v2: Apply abs() to the scalar case (thanks to Eric).

    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>

Reproduce steps:
----------------------------
1. start x
2. ./bin/shader_runner tests/spec/arb_shader_texture_lod/execution/glsl-fs-shadow2DGradARB-01.shader_test -auto -fbo
Comment 1 Kenneth Graunke 2012-08-05 02:48:03 UTC
I've figured out the problem...and my goodness, it's an awful one.

In emit_texture_gen5().

1. We begin setting up the sampler message by loading the texcoords into the message registers, starting at MRF 2.  (We also load the shadow comparison.)
2. We then realize we need the LOD, and call ir->lod_info.lod->accept(this) to generate code for the expression tree representing it.

My lowering pass generates a tree that looks like log2(sqrt(...)).  On Gen4-5, these two math operations are performed by sending a message to the extended math box.  So we load the operand into the message registers...starting with MRF 2.

3. We then happily generate the sampler SEND, assuming that all of our data is in the message registers.  Except that the math operations just clobbered our U-coordinate.

In other words, if the LOD or shadow comparison contain any expressions that involve message sends, the texture call could break horribly.  On Gen4-5, this includes math, but in general, it could result from nested texture calls, UBO loads, and so on.

I'm honestly shocked we haven't run into this before.
Comment 2 Kenneth Graunke 2012-08-05 07:07:28 UTC
Fixed in the 'texture-mrf-clobber' branch of ~kwg/mesa.  Alternatively, you can get the patch series from the mailing list:
http://lists.freedesktop.org/archives/mesa-dev/2012-August/025174.html

Thanks for the report, this caught a real nasty bug!
Comment 3 Kenneth Graunke 2012-08-06 23:16:44 UTC
Fixed in master by the following series:

commit 54c045b93cd205bcf031e70d65238d60bfc07da4
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Aug 4 21:28:23 2012 -0700

    i965/vs: Don't clobber sampler message MRFs with subexpressions.
    
    See the preceding commit for a description of the problem.
    
    NOTE: This is a candidate for stable release branches.
    
    v2: Use a separate dPdx variable rather than reusing the lod src_reg.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>

commit c0f60106df724188d6ffe7c9f21eeff22186ab25
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Aug 4 20:55:21 2012 -0700

    i965/fs: Don't clobber sampler message MRFs with subexpressions.
    
    Consider a texture call such as:
    
       textureLod(s, coordinate, log2(...))
    
    First, we begin setting up the sampler message by loading the texture
    coordinates into MRFs, starting with m2.  Then, we realize we need the
    LOD, and go to compute it with:
    
       ir->lod_info.lod->accept(this);
    
    On Gen4-5, this will generate a SEND instruction to compute log2(),
    loading the operand into m2, and clobbering our texcoord.
    
    Similar issues exist on Gen6+.  For example, nested texture calls:
    
      textureLod(s1, c1, texture(s2, c2).x)
    
    Any texturing call where evaluating the subexpression trees for LOD or
    shadow comparitor would generate SEND instructions could potentially
    break.  In some cases (like register spilling), we get lucky and avoid
    the issue by using non-overlapping MRF regions.  But we shouldn't count
    on that.
    
    Fixes four Piglit test regressions on Gen4-5:
    - glsl-fs-shadow2DGradARB-{01,04,07,cumulative}
    
    NOTE: This is a candidate for stable release branches.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>

commit 27bf9c1997b77f85c2099436e9ad5dfc0f1608c7
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Aug 4 20:40:42 2012 -0700

    i965/fs: Factor out texcoord setup into a helper function.
    
    With the textureRect support and GL_CLAMP workarounds, it's grown
    sufficiently that it deserves its own function.  Separating it out
    makes the original function much more readable.
    
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>

commit 82bfb4b41af7d61aa45e41d62c1842b6a09e9585
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Aug 4 20:33:13 2012 -0700

    i965/fs: Move message header and texture offset setup to generate_tex().
    
    Setting the texture offset bits in the message header involves very
    specific hardware register descriptions.  As such, I feel it's better
    suited for the lower level "generate" layer that has direct access to
    the weird register layouts, rather than at the fs_inst abstraction layer.
    
    This also parallels the approach I took in the VS backend.
    
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>
Comment 4 lu hua 2012-08-13 03:42:27 UTC
Verified. Fixed on commit (master)6cb9e99a757bd5a9d908ed6c5515a9ae5fb041ba