Summary: | ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert fails | |
---|---|---|---
Product: | Mesa | Reporter: | Tapani Pälli <lemody>
Component: | Drivers/DRI/i965 | Assignee: | Tapani Pälli <lemody>
Status: | RESOLVED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: | normal | |
Priority: | medium | |
Version: | git | |
Hardware: | Other | |
OS: | All | |
Attachments:
- patch to fix the issue
- ir output before commit
- ir output after commit
- GLSL sample implementation of computing the LOD for textureGrad on cubemaps
- LOD compute for textureCube on samplerCube
Description
Tapani Pälli
2015-06-26 09:56:09 UTC
(I've reproduced this on SKL and HSW.)

These are the conformance tests, right? I don't think I have access to them.

Tapani, can you check if this change in lower_texture_grad_visitor::visit_leave(ir_texture *ir) fixes the problem?

```
-   if (ir->sampler->type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE) {
-      ir->lod_info.lod = expr(ir_binop_add,
-                              expr(ir_unop_log2, rho),
-                              new(mem_ctx) ir_constant(-1.0f));
-   } else {
       ir->lod_info.lod = expr(ir_unop_log2, rho);
-   }
```

The special case for GLSL_SAMPLER_DIM_CUBE was needed for the piglit and dEQP textureGrad tests to pass, but if it is breaking a conformance test then maybe there is some other factor to consider when adjusting the LOD computation.

(In reply to Iago Toral from comment #2)
> These are the conformance tests, right? I don't think I have access to them.

Yep, this is GLES3 conformance.

> Tapani, can you check if this change in
> lower_texture_grad_visitor::visit_leave(ir_texture *ir) fixes the problem?

Unfortunately it did not fix the issue.

(In reply to Tapani Pälli from comment #3)
> unfortunately did not fix the issue

Huh, yeah, actually it does fix it once I added back:

```
ir->lod_info.lod = expr(ir_unop_log2, rho);
```

(taken away by the patch).

Created attachment 116726 [details] [review]
patch to fix the issue

Here's a patch that fixes this issue.

(In reply to Tapani Pälli from comment #5)
> here's a patch that fixes this issue

Iago, can you take a further look with the help of this patch? Or, if you think this is correct as is, let me know why :)
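For reference, the change Iago proposes in comment #2 only toggles a constant bias on the LOD that the lowering pass computes before it turns textureGrad() into textureLod(). A minimal GLSL-style sketch of the two variants follows; the function name, the uniforms, and the way rho is fed in are illustrative, not the actual lowering-pass code:

```glsl
#version 300 es
precision highp float;

uniform float u_rho;     // illustrative: derivative magnitude computed elsewhere
uniform bool  u_isCube;  // illustrative: whether the sampler is a cube sampler
out vec4 fragColor;

// The lowering pass rewrites textureGrad() into textureLod() with an explicit
// LOD; the disputed part is only the constant subtracted after the log2.
float lodFromRho(float rho, bool isCube)
{
    if (isCube) {
        // with the cube special case introduced by the offending commit
        return log2(rho) - 1.0;
    }
    // without the special case (the variant comment #2 asks to try)
    return log2(rho);
}

void main()
{
    fragColor = vec4(lodFromRho(u_rho, u_isCube));
}
```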
(In reply to Tapani Pälli from comment #6)
> Iago, can you take a further look with the help of this patch? Or, if you
> think this is correct as is, let me know why :)

This patch breaks:

bin/tex-miplevel-selection textureGrad CubeShadow -auto -fbo

Do you know if the bad commit breaks other CTS tests with textureGrad on cubes? I am wondering if the problem here might be that piglit and dEQP both have a different expectation than the CTS.

(In reply to Iago Toral from comment #7)
> do you know if the bad commit breaks other CTS tests with textureGrad on
> cubes? I am wondering if the problem here might be that piglit and deqp both
> have a different expectation than the CTS.

Nope, I just did a full run and the only failing tests are:

ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert
ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag

(In reply to Tapani Pälli from comment #8)
> Nope just did a full run and the only failing tests are:
>
> ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert
> ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag

Ok, in that case it looks as if the CTS expectations for textureGrad on shadow cube samplers specifically are different from those in piglit/deqp.

I have just run bin/tex-miplevel-selection textureGrad CubeShadow with the proprietary nVidia driver and it passes, so I guess that points to piglit's expectations being correct here. Could it be that the CTS test is not correct? I find it difficult to believe that piglit/deqp and the nVidia implementation are all incorrect.

That said, I wonder if there is something special about this CTS test that makes it different from the piglit tests so as to need a different LOD computation.

Created attachment 116781 [details]
ir output before commit
Created attachment 116782 [details]
ir output after commit
Not sure if this helps, but here is the IR captured with INTEL_DEBUG=vs,fs before the commit (when the test passes) and after it (when it fails).
I don't see anything special in the generated IR. After my patch we change the way we compute the LOD by subtracting 1 unit, but that is expected as per the commit log. The test itself does not seem to do weird things either, after a quick glance at it.

I am wondering if the problem may stem from rounding issues, so that depending on the specific test we sometimes fall on the wrong side of the rounding. While I was working on dEQP I noticed that the problem was that we were computing a wrong value of LOD in all the scenarios, and verified that subtracting 1.0 from the computed LOD fixed the problem, but as I explain in the commit log I am not sure why this is needed. That change seemed consistent with the way in which dEQP computed the expected result for those tests, but maybe we do not need to subtract 1.0 to get the LOD we need; maybe the problem that dEQP detected was a rounding issue and by removing 1.0 we just fell on the right side of the rounding for those tests. I have just checked that for piglit it is enough if we subtract 0.51.

Tapani, could you replace my -1.0 addition by -0.51 and see if that fixes the CTS test? In parallel, I'll check if this is also enough for dEQP.

(In reply to Iago Toral from comment #12)
> Tapani, could you replace my -1.0 addition by -0.51 and see if that fixes
> the CTS test? In parallel, I'll check if this is also enough for dEQP.

dEQP seems to require -1.0 in any case; the tests only pass as long as we subtract in the range [0.96, 1.04]. Still, it is worth knowing if such a change would make the CTS test pass.

(In reply to Iago Toral from comment #13)
> dEQP seems to require -1.0 in any case, the tests only pass as long as we
> subtract in the range [0.96, 1.04]. Still, it is worth knowing if such a
> change would make the CTS test pass.

It did not; the CTS test seems to pass with -0.01 .. -0.50, but -0.51 makes it fail.

(In reply to Tapani Pälli from comment #14)
> Did not, it seems to pass with -0.01 .. -0.50 but -0.51 makes it fail.

I see, so the CTS passes with -0.01..-0.50, piglit with -0.51..-1.49 and dEQP with -0.96..-1.04. Seeing the ranges involved, it seems as if both dEQP and piglit expect -1.0, only piglit is less strict about what it accepts as a pass.

I am not sure what to do about this. As things stand now it does not look like we can have both the CTS and dEQP/piglit pass for shadow cube samplers, at least definitely not by adjusting the computation of the LOD value like this. If the CTS test expectation is correct, then I guess it means that the formulas we are using still need other tweaks for the case of (shadow) cube samplers, or that we are not configuring the sampler message correctly in all situations. When the lowering pass kicks in we use sampler_l_c messages, and there are some restrictions that need to be considered; there is one mention specific to cube samplers, "The ai parameter indicates the array index for a cube surface", that I am not sure we are honoring.
FYI this is the only remaining failure in our OpenGL ES 3.0 conformance.

I've run the ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert conformance test and the following dEQP tests on Nvidia (346.59). Nvidia passes the conformance test and passes 7/10 tests of:

dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*cube*

The failing tests are:

dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isamplercube_fragment
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercubeshadow_vertex
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercubeshadow_fragment

while i965 currently passes them all. Not sure what to make of this though. Will try to understand the underlying problem more.

(In reply to Tapani Pälli from comment #17)
> while i965 currently passes them all. Not sure what to make of this though.
> Will try to understand underlying problem more.

By "passes them all" I meant that i965 passes all the dEQP tests but fails the conformance test.

Hi,

At Tapani's request I've done some digging into this, and the code to lower textureGrad for cube maps is much more wrong than just adjusting rho by a constant.

Let's do the case where the face used is the Z_MAX face, i.e. r > 0 and r has the largest absolute value. The texel coordinates are then:

u = L * 0.5 * (s/r + 1)
v = L * 0.5 * (t/r + 1)

where L is the dimension of the cube map. The gradient of u is then given by:

grad(u) = 0.5 * L * (grad(s)/r - (s/(r*r)) * grad(r))

and rho is given by:

rho = abs(grad(u)) + abs(grad(v))
    = 0.5 * L * ( |grad(s)/r - (s/(r*r))*grad(r)| + |grad(t)/r - (t/(r*r))*grad(r)| )

The current code has (after the -1 is taken into account on the log2):

rho = 0.5 * L * ( |grad s| + |grad t| + |grad r| )

which is only correct (in this case) when grad(r) is zero and r = 1. Fixing the rho computation (or the LOD) requires first finding which face the coordinate is on, and from there deriving grad(u) and grad(v) correctly. There will be, sadly, six cases.

This can be reduced to 3 cases. Indeed, let's say the face was Z-negative; then we have:

u = L * 0.5 * s/(-r)

thus

grad(u) = -0.5 * L * (grad(s)/r - (s/(r*r)) * grad(r))

which is the same as the Z_MAX case but negated. The magnitude is unaffected though. Thus, to compute the correct value one really has just 3 cases: X-FACE, Y-FACE, Z-FACE.

Tapani told me this little nugget, found on the man page for textureGrad (https://www.opengl.org/sdk/docs/man/):

"For the cube version, the partial derivatives of P are assumed to be in the coordinate system used before texture coordinates are projected onto the appropriate cube face."

That sentence confirms the requirement to use the calculus quotient rule.

-Kevin

Correction: currently for textureGrad, i965 does this (taking the -1 into account):

rho = 0.5 * L * max( |grad s|, |grad t|, |grad r| )

where |grad A| is sqrt(dot(A,A)). Sadly, that could be optimized to skip the sqrt and do the sqrt "after" the log2, changing it into a multiply by 0.5. Anyway, it is utterly wrong for textureGrad on cubemaps.

-Kevin

Ah... |A| is sqrt(dot(A,A)); I added "grad" by mistake in typing. Sighs.
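To make the derivation above concrete, here is a minimal GLSL sketch of the quotient-rule gradients for the +Z face only. The function name, the faceSize parameter, and the use of length() for |...| are illustrative; this is not the attached implementation:

```glsl
#version 300 es
precision highp float;

uniform vec3  u_P, u_dPdx, u_dPdy;
uniform float u_faceSize;   // L in the derivation: cube-map face size in texels
out vec4 fragColor;

// P: cube-map direction with P.z > 0 and |P.z| the largest component.
// dPdx/dPdy: the gradients passed to textureGrad.
float rhoPosZ(vec3 P, vec3 dPdx, vec3 dPdy, float faceSize)
{
    float s = P.x, t = P.y, r = P.z;

    // screen-space gradients of s, t and r
    vec2 ds = vec2(dPdx.x, dPdy.x);
    vec2 dt = vec2(dPdx.y, dPdy.y);
    vec2 dr = vec2(dPdx.z, dPdy.z);

    // u = 0.5 * L * (s/r + 1)  =>  grad(u) = 0.5 * L * (grad(s)/r - s*grad(r)/r^2)
    vec2 du = 0.5 * faceSize * (ds / r - s * dr / (r * r));
    vec2 dv = 0.5 * faceSize * (dt / r - t * dr / (r * r));

    // rho as written above: abs(grad(u)) + abs(grad(v)), with |A| = sqrt(dot(A, A))
    return length(du) + length(dv);
}

void main()
{
    fragColor = vec4(log2(rhoPosZ(u_P, u_dPdx, u_dPdy, u_faceSize)));
}
```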
(In reply to Iago Toral from comment #9)
> Could it be that the CTS test is not correct?

The CTS defines what correct is. As far as I'm aware, there are implementations that pass both the CTS and dEQP.

Created attachment 117920 [details]
GLSL sample implementation of computing the LOD for textureGrad on cubemaps
Sample GLSL implementation of computing the LOD for textureGrad for cubemap textures.
Ideally, converting the code to GLSL IR is a mechanical, though tedious, task.
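For orientation only, extending the single-face sketch above, here is how the three-case reduction might look in GLSL when wired up to the quotient-rule gradients. The attached file is the actual proposal; the helper name, swizzles, and parameters here are illustrative:

```glsl
#version 300 es
precision highp float;

uniform vec3  u_P, u_dPdx, u_dPdy;
uniform float u_faceSize;   // cube-map face size in texels (L in the derivation)
out vec4 fragColor;

// Pick the major axis, swizzle it into .z and apply the quotient rule once.
// Signs are ignored because only the gradient magnitudes enter rho.
float cubeGradLod(vec3 P, vec3 dPdx, vec3 dPdy, float L)
{
    vec3 a = abs(P);
    if (a.x >= a.y && a.x >= a.z) {          // +/-X face: move x into .z
        P = P.yzx; dPdx = dPdx.yzx; dPdy = dPdy.yzx;
    } else if (a.y >= a.z) {                 // +/-Y face: move y into .z
        P = P.zxy; dPdx = dPdx.zxy; dPdy = dPdy.zxy;
    }                                        // +/-Z face: already in place

    float r  = P.z;
    vec2  dr = vec2(dPdx.z, dPdy.z);
    vec2  du = 0.5 * L * (vec2(dPdx.x, dPdy.x) / r - P.x * dr / (r * r));
    vec2  dv = 0.5 * L * (vec2(dPdx.y, dPdy.y) / r - P.y * dr / (r * r));

    return log2(length(du) + length(dv));
}

void main()
{
    fragColor = vec4(cubeGradLod(u_P, u_dPdx, u_dPdy, u_faceSize));
}
```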
Created attachment 117921 [details]
LOD compute for textureCube on samplerCube
Slightly improved version.
- remove unneeded computation
- slight optimization additions
- one more comment for readability
- typo fix (forgot to put swizzle on vec2)
- better name for a function
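For context on how such a helper can be exercised from the shader side, here is a hedged sketch of the kind of override used for testing in the comments below (a custom textureGrad() built on textureLod()). All names are hypothetical, the LOD body is a stand-in for the attached computation, and a plain samplerCube is used because textureLod() with a samplerCubeShadow needed the driver-side hack mentioned there:

```glsl
#version 300 es
precision highp float;

uniform samplerCube u_tex;
uniform float u_faceSize;
in vec3 v_coord;
out vec4 fragColor;

// Stand-in for the attached per-face LOD computation (sketched earlier);
// only the interface matters for this example.
float cubeGradLod(vec3 P, vec3 dPdx, vec3 dPdy, float faceSize)
{
    float rho = 0.5 * faceSize * max(length(dPdx), length(dPdy)); // placeholder math
    return log2(rho);
}

// textureGrad() substitute built on textureLod(), in the spirit of the
// "custom textureGrad() injected" hack described below.
vec4 myTextureGrad(samplerCube tex, vec3 P, vec3 dPdx, vec3 dPdy, float faceSize)
{
    return textureLod(tex, P, cubeGradLod(P, dPdx, dPdy, faceSize));
}

void main()
{
    fragColor = myTextureGrad(u_tex, v_coord, dFdx(v_coord), dFdy(v_coord), u_faceSize);
}
```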
Nice work Kevin, I'll try mutilating it to IR form.

With a hack to allow textureLod with a samplerCubeShadow and a custom textureGrad() injected, I observed 3 failing tests with:

dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*cube*

These are:

texturegrad.samplercube_fixed_vertex
texturegrad.samplercubeshadow_vertex
texturegrad.samplercubeshadow_fragment

The other 7 are passing, so I guess there might be something wrong with the formula (?)

(In reply to Tapani Pälli from comment #26)
> other 7 are passing, I guess there might be something wrong with the formula
> (?)

Oops, sorry, I did not realize I need to provide textureLod for vec3 and vec4 as well; now all those tests pass with Kevin's formula.

Updating status: I have a WIP branch here

http://cgit.freedesktop.org/~tpalli/mesa/log/?h=textureGrad

now passing all the tests. I will clean it up and send it to the mesa-dev mailing list.

Pushed the fix to master.

Please put the commit that fixed the bug into the bugzilla comment.

commit 7f8815bcb9af9b4b374ad7bd6e7cfa7529a6c980
Author: Tapani Pälli <tapani.palli@intel.com>
Date:   Mon Sep 7 15:08:13 2015 +0300

    i965: fix textureGrad for cubemaps

(In reply to Matt Turner from comment #31)
> Please put the commit that fixed the bug into the bugzilla comment.

I don't want to whine, but why is this cross-referencing done? We already refer to the bug in the git log; that should be enough?

(In reply to Tapani Pälli from comment #32)
> I don't want to whine but why is this cross-referencing done, we already
> refer to bug in the git log, that should be enough?

If someone re-encounters the problem on a version that does not contain the fix, they can figure out that a fix exists. It's helpful in cases where the bug exists in master and one or more stable branches, but the fix only lands in master without being marked for stable.