Bug 103047 - lighting render issue with i965 on SKL+
Summary: lighting render issue with i965 on SKL+
Status: REOPENED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 18.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Kenneth Graunke
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-30 19:14 UTC by Steven Noonan
Modified: 2018-10-24 17:43 UTC

See Also:
i915 platform:
i915 features:


Attachments

Description Steven Noonan 2017-09-30 19:14:34 UTC

    
Comment 1 Steven Noonan 2017-09-30 19:16:44 UTC
Whoops, fat-fingered the submit button before I filled this in. Sorry.

Mesa version is 17.2.1

If I play this trace on an i915 Skylake system, the lighting on the landscape is flickery and weird when the camera moves. If I export LIBGL_ALWAYS_SOFTWARE=1 and play the trace, everything renders as it should.

Here's the apitrace output:

https://www.uplinklabs.net/files/darwinia-trace.log.gz
Comment 2 Steven Noonan 2017-09-30 19:29:41 UTC
I've also uploaded apitrace-generated images from the two replays:

https://www.uplinklabs.net/files/mesa-pr103047/

The "good" folder is rendered with LIBGL_ALWAYS_SOFTWARE=1, and the "bad" folder is rendered without it. The "diff" folder is the output of "apitrace diff-images".

A few examples:

https://www.uplinklabs.net/files/mesa-pr103047/diff/0005113532.diff.png
https://www.uplinklabs.net/files/mesa-pr103047/diff/0005136883.diff.png
https://www.uplinklabs.net/files/mesa-pr103047/diff/0005159887.diff.png
https://www.uplinklabs.net/files/mesa-pr103047/diff/0005182641.diff.png
https://www.uplinklabs.net/files/mesa-pr103047/diff/0005205499.diff.png
Comment 3 Steven Noonan 2018-05-28 10:45:27 UTC
Still running into this on another machine with Kaby Lake GT2 graphics.

Can anyone take a look please?
Comment 4 Steven Noonan 2018-05-28 11:01:02 UTC
Bumping version as well, since this is hitting me on 18.0.4 and 18.1.0.
Comment 5 Steven Noonan 2018-05-28 11:18:57 UTC
Oh, I had this tagged against i915 which doesn't make sense. Switching to i965.
Comment 6 Tapani Pälli 2018-05-29 07:47:09 UTC
It seems uplinklabs.net uses an invalid certificate, so I cannot access the site.
Comment 7 Tapani Pälli 2018-05-29 08:22:53 UTC
Darwinia demo from here works fine for me on KBL system:

https://www.introversion.co.uk/darwinia/downloads/demo_linux.html
Comment 8 Tapani Pälli 2018-05-29 08:24:04 UTC
(In reply to Tapani Pälli from comment #7)
> Darwinia demo from here works fine for me on KBL system:
> 
> https://www.introversion.co.uk/darwinia/downloads/demo_linux.html

This was tested with Mesa oibaf drivers (somewhere within 18.1.x) on Ubuntu 18.04.
Comment 9 Steven Noonan 2018-05-29 09:05:45 UTC
The demo repros the issue for me. You have to fly around a bit and most importantly tilt the camera around in order to observe the broken behavior. Long strips of landscape will flicker with different lighting.

I could record another apitrace, but that'd be redundant, the one I posted earlier shows the issue nicely.
Comment 10 Tapani Pälli 2018-05-29 09:57:49 UTC
(In reply to Steven Noonan from comment #9)
> The demo repros the issue for me. You have to fly around a bit and most
> importantly tilt the camera around in order to observe the broken behavior.
> Long strips of landscape will flicker with different lighting.
> 
> I could record another apitrace, but that'd be redundant, the one I posted
> earlier shows the issue nicely.

Yep, now I see it. I can reproduce it when rotating the camera clockwise.
Comment 11 Tapani Pälli 2018-05-29 10:24:51 UTC
The game sometimes seems to call glTexEnv with invalid arguments. I'm not sure whether this is related to the rendering errors, though, since these GL errors also occur on Haswell, where the rendering looks OK.
Comment 12 Steven Noonan 2018-05-29 10:29:46 UTC
Here are all the glTexEnv* calls in the entire code base:

$ git grep glTexEnv | grep -v -e ^docs -e directx
code/explosion.cpp:             glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL);
code/explosion.cpp:             glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/loaders/amiga_loader.cpp:  glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
code/loaders/amiga_loader.cpp:  glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/renderer.cpp:      glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/renderer.cpp:      glTexEnviv(GL_TEXTURE_ENV, GL_TEXTURE_ENV_COLOR, (GLint *)colour);
code/taskmanager_interface_gestures.cpp:        glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/taskmanager_interface_gestures.cpp:        glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_REPLACE);
code/taskmanager_interface_gestures.cpp:        glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_RGB_EXT);
code/taskmanager_interface_gestures.cpp:        glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_MODULATE);
code/taskmanager_interface_gestures.cpp:        glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/water.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/water.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_REPLACE);
code/water.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/water.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_MODULATE);
code/water.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/worldobject/feedingtube.cpp:       glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/worldobject/feedingtube.cpp:       glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_REPLACE);
code/worldobject/feedingtube.cpp:       glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/worldobject/feedingtube.cpp:       glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_MODULATE);
code/worldobject/feedingtube.cpp:       glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/worldobject/laserfence.cpp:                                glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/worldobject/laserfence.cpp:                                glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_REPLACE);
code/worldobject/laserfence.cpp:                                glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/worldobject/laserfence.cpp:                                glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_MODULATE);
code/worldobject/laserfence.cpp:                                glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
code/worldobject/radardish.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/worldobject/radardish.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_REPLACE);
code/worldobject/radardish.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
code/worldobject/radardish.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_MODULATE);
code/worldobject/radardish.cpp: glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);


But yeah, even if there are bogus glTexEnv calls, the rendering difference between HSW/SKL is the more interesting problem I think.
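
As a rough illustration of which call in the listing above might be the bogus one, here is a minimal Python sketch that checks (pname, param) pairs against simplified validity tables. The tables and the is_valid_texenv helper are hypothetical and written for this note; they are not Mesa's validation code. The pairs are the five calls from code/taskmanager_interface_gestures.cpp, in the order grep printed them.

```python
# Simplified, hypothetical validity tables for glTexEnv parameters.
# They cover only the two pnames that matter for this listing.
VALID_ENV_MODES = {
    "GL_MODULATE", "GL_DECAL", "GL_BLEND",
    "GL_REPLACE", "GL_ADD", "GL_COMBINE_EXT",
}
VALID_COMBINE_RGB = {
    "GL_REPLACE", "GL_MODULATE", "GL_ADD",
    "GL_ADD_SIGNED_EXT", "GL_INTERPOLATE_EXT",
}


def is_valid_texenv(pname, param):
    """Return True if param is an acceptable value for pname (sketch only)."""
    if pname == "GL_TEXTURE_ENV_MODE":
        return param in VALID_ENV_MODES
    if pname == "GL_COMBINE_RGB_EXT":
        return param in VALID_COMBINE_RGB
    return True  # other pnames are not checked in this sketch


# The five calls from taskmanager_interface_gestures.cpp, in grep order.
calls = [
    ("GL_TEXTURE_ENV_MODE", "GL_COMBINE_EXT"),
    ("GL_COMBINE_RGB_EXT", "GL_REPLACE"),
    ("GL_TEXTURE_ENV_MODE", "GL_COMBINE_RGB_EXT"),
    ("GL_COMBINE_RGB_EXT", "GL_MODULATE"),
    ("GL_TEXTURE_ENV_MODE", "GL_MODULATE"),
]

invalid = [call for call in calls if not is_valid_texenv(*call)]
print(invalid)
```

The third call passes GL_COMBINE_RGB_EXT, which is a parameter name, as the value of GL_TEXTURE_ENV_MODE. Whether that is the call producing the errors seen in the trace is an assumption.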
Comment 13 Jason Ekstrand 2018-06-07 00:50:48 UTC
I've also confirmed it works on Broadwell.  I have no idea what is different between Broadwell and Skylake that would cause the failure.  It looks like something is going wrong with the normals in the vertex shader or something like that.
Comment 14 Danylo 2018-06-13 10:57:57 UTC
I found that the underlying cause of the issue lies in the combination of flat shading, provoking vertex and clipped polygons.

Reproducing the issue is easy:

- Set glShadeModel to GL_FLAT; GL_SMOOTH does not show the issue
- Set glProvokingVertexEXT to GL_LAST_VERTEX_CONVENTION_EXT; the first-vertex mode does not seem to exhibit the issue
- Create a triangle strip with 3 blue vertices off screen and two vertices on screen, the last of which is red (colors themselves don’t 
- The expected color of the triangle is red, because the color should be taken from the last vertex, but on Intel GPUs the triangle is blue
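
To make the expected result concrete, here is a small Python model of provoking-vertex selection for a triangle strip. It uses the rule from the provoking vertex specification that triangle i of a strip (counting from 1) takes its flat-shaded attributes from vertex i under the first-vertex convention and from vertex i + 2 under the last-vertex convention. The function names are made up for this sketch, and the color of the fourth vertex is an assumption, since the steps above only give the color of the last one.

```python
def provoking_vertex(triangle, convention):
    """1-based index of the provoking vertex for 1-based triangle in a strip."""
    if convention == "first":
        return triangle
    if convention == "last":
        return triangle + 2
    raise ValueError(f"unknown convention: {convention}")


# Five-vertex strip: three vertices off screen, two on screen.
# Vertex 4 is assumed to be blue; only vertex 5 (red) is given above.
colors = ["blue", "blue", "blue", "blue", "red"]


def flat_color(triangle, convention):
    """Color a flat-shaded triangle of the strip should be drawn with."""
    return colors[provoking_vertex(triangle, convention) - 1]


# The third triangle (vertices 3, 4, 5) is the one that reaches the screen.
print(flat_color(3, "last"))   # color taken from vertex 5
print(flat_color(3, "first"))  # color taken from vertex 3
```

Under the last-vertex convention the visible triangle should therefore be red; getting blue is what you would expect if the first vertex were used instead.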

I asked Jason to help me, I'll quote his answer:
> I think what you've found is a hardware bug. :-(  
> It appears that the clipper is messing up provoking vertex whenever entire 
> polygons are clipped.

> What do we do about it?  That's an interesting question.  
> The most obvious thing that jumps out to me would be to do a shader 
> workaround where we tell the hardware that flat inputs are not flat 
> (so it passes in all three vertices) and then manually grab the correct 
> one of the three interpolants based on which provoking vertex is set.  
> I don't think provoking vertex is something that applications change 
> frequently so having a bit in the FS shader key probably isn't too 
> bad for this.
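
The idea in the quote can be sketched in a few lines of Python. This is only a model of the selection step, not driver or shader code: select_flat_input is a hypothetical name, and it assumes the three per-vertex values reach the fragment stage in submission order.

```python
def select_flat_input(per_vertex_values, convention):
    """Pick the flat-shaded value from a triangle's three per-vertex values.

    per_vertex_values: (v0, v1, v2), assumed to be in submission order.
    convention: "first" or "last"; in the quoted proposal this would be
    a bit in the fragment shader key.
    """
    v0, _v1, v2 = per_vertex_values
    if convention == "first":
        return v0
    if convention == "last":
        return v2
    raise ValueError(f"unknown convention: {convention}")


blue = (0.0, 0.0, 1.0)
red = (1.0, 0.0, 0.0)
print(select_flat_input((blue, blue, red), "last"))
```

Because the choice is made in the shader, it no longer depends on what the clipper reports as the provoking vertex, at the cost of passing all three values per input.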

I've sent a patch to Piglit which reproduces the bug - https://patchwork.freedesktop.org/patch/229260/
Comment 15 Steven Noonan 2018-06-13 14:14:04 UTC
Wow, thanks for digging into this!

I can confirm that using either glShadeModel(GL_SMOOTH) *or* glProvokingVertexEXT(GL_FIRST_VERTEX_CONVENTION_EXT) eliminates the issue in the Darwinia landscape/water renderers. The glProvokingVertexEXT change looks better for Darwinia's case because it doesn't appear to negatively impact the intended appearance.
Comment 16 Kenneth Graunke 2018-06-16 05:13:39 UTC
Hi Steven,

It turns out that this is a known hardware issue with provoking vertices not working right in some circumstances.  The good news is that there's a simple workaround.  The bad news is that it's going to take a kernel patch to fix it. :(

Patch from Chris Wilson and me:
https://lists.freedesktop.org/archives/intel-gfx/2018-June/168389.html

In the meantime, if you have intel-gpu-tools installed, you may be able to work around the issue by running these commands:

$ sudo intel_reg write 0x2090 0x10001000
$ sudo intel_reg write 0x2088 0x20002

(That should fix the issue, but I'm not clear whether the values will stick or if they'll get reset when the GPU goes into a low power state...)
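
For anyone wondering where those two values come from: they are consistent with the masked-write convention, in which the upper 16 bits of the written value select which of the lower 16 bits are updated. The Python sketch below reproduces them under that assumption. The bit positions (12 and 1) are inferred from the values themselves, and the constant names and the mapping of offsets to 3D_CHICKEN3 and FF_SLICE_CHICKEN are assumptions, not something stated in this comment.

```python
def masked_bit_enable(bit_mask):
    """Build a masked-register write that sets the given bits.

    Upper 16 bits: which bits to touch. Lower 16 bits: their new value.
    """
    return (bit_mask << 16) | bit_mask


# Hypothetical names; bit positions inferred from the values above.
SF_PROVOKING_VERTEX_FIX = 1 << 12  # assumed to live at offset 0x2090
CL_PROVOKING_VERTEX_FIX = 1 << 1   # assumed to live at offset 0x2088

print(hex(masked_bit_enable(SF_PROVOKING_VERTEX_FIX)))
print(hex(masked_bit_enable(CL_PROVOKING_VERTEX_FIX)))
```

The practical upshot is that each write only flips the one workaround bit and leaves the other bits of the register alone.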

Thank you for the report!  And huge thanks to Danylo for tracking this down to a provoking vertex problem.
Comment 17 Chris Wilson 2018-06-18 09:13:31 UTC
commit b77422f80337d363eed60c8c48db9cb6e33085c9
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Jun 15 20:06:05 2018 +0100

    drm/i915: Enable provoking vertex fix on Gen9 systems.
    
    The SF and clipper units mishandle the provoking vertex in some cases,
    which can cause misrendering with shaders that use flat shaded inputs.
    
    There are chicken bits in 3D_CHICKEN3 (for SF) and FF_SLICE_CHICKEN
    (for the clipper) that work around the issue.  These registers are
    unfortunately not part of the logical context (even the power context),
    and so we must reload them every time we start executing in a context.
    
    Bugzilla: https://bugs.freedesktop.org/103047
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180615190605.16238-1-chris@chris-wilson.co.uk
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: stable@vger.kernel.org

will land first in v4.19 / v4.18 and we'll backport as far as people ask.
Comment 18 Denis 2018-07-31 11:02:07 UTC
Hello guys. As there hasn't been an update here for a while, I tested the fix provided by Kenneth with the test provided by Danylo. The test passes on drm-intel-next (which contains the commit) and fails on my Ubuntu 4.15 kernel.
Update: the test has not been added to master yet, so I applied it as a patch.

Results below:

>Linux and-Vostro-15-3568 4.17.0-rc3+
"spec@arb_provoking_vertex@arb-provoking-vertex-clipped-geometry-flatshading":{
"__type__":"TestResult",
"command":"/usr/local/lib/piglit/bin/arb-provoking-vertex-clipped-geometry-flatshading -auto",
"environment":"PIGLIT_SOURCE_DIR=\"/usr/local/lib/piglit\" PIGLIT_PLATFORM=\"mixed_glx_egl\"",
"err":"",
"out":"",
"result":"pass",
"returncode":0,
"subtests":{
"__type__":"Subtests"
},
"time":{
"start":1533032151.0282168,
"end":1533032151.1608405,
"__type__":"TimeAttribute"
},
"exception":null,
"traceback":null,
"dmesg":"",
"pid":[
29258
__________________________________________________________________________
>Linux and-Vostro-15-3568 4.15.0-29-generic
"spec@arb_provoking_vertex@arb-provoking-vertex-clipped-geometry-flatshading":{
"__type__":"TestResult",
"command":"/usr/local/lib/piglit/bin/arb-provoking-vertex-clipped-geometry-flatshading -auto",
"environment":"PIGLIT_SOURCE_DIR=\"/usr/local/lib/piglit\" PIGLIT_PLATFORM=\"mixed_glx_egl\"",
"err":"",
"out":"Probe color at (158,79)\n Expected: 1.000000 0.000000 0.000000\n Observed: 0.000000 1.000000 0.000000\n",
"result":"fail",
"returncode":1,
"subtests":{
"__type__":"Subtests"
},
"time":{
"start":1533032579.1013076,
"end":1533032579.2158325,
"__type__":"TimeAttribute"
},
"exception":null,
"traceback":null,
"dmesg":"",
"pid":[
2069
]
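
If you want to compare the two runs without reading the JSON by eye, a short Python sketch like the one below pulls the expected and observed probe colors out of the "out" field. parse_probe is a hypothetical helper written for this note, and the string is copied from the failing 4.15 result above.

```python
import re


def parse_probe(out):
    """Extract the (expected, observed) RGB triples from a probe message.

    Returns None for a triple whose label is not present in the text.
    """
    def triple(label):
        match = re.search(
            label + r":\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)", out
        )
        if match is None:
            return None
        return tuple(float(value) for value in match.groups())

    return triple("Expected"), triple("Observed")


# "out" field of the failing run on 4.15.0-29-generic.
out = (
    "Probe color at (158,79)\n"
    " Expected: 1.000000 0.000000 0.000000\n"
    " Observed: 0.000000 1.000000 0.000000\n"
)
expected, observed = parse_probe(out)
print(expected, observed, expected == observed)
```

For the passing run the "out" field is empty, so both triples come back as None.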
Comment 19 Mark Janes 2018-07-31 13:17:25 UTC
I pushed the piglit test that provokes this issue.  Given the resolution, should we re-assign this bug to DRI and mark it resolved?
Comment 20 Dylan Baker 2018-08-15 21:45:51 UTC
I'm marking this fixed.
Comment 21 Mark Janes 2018-10-24 15:46:25 UTC
Mesa's i965 CI updated kernels to get this fix.  We found that it only works on a subset of SKL hardware:

fails:  HD520 (gt2)
passes: HD530 (gt2), HD580 (gt4e)

Apparently a hardware bug was not fixed until a very late stepping.
Comment 22 Kenneth Graunke 2018-10-24 17:43:54 UTC
We can do more digging to try and figure out if there's a better workaround, but I'm afraid that we're going to have to hack around this in shaders, at a performance penalty...

