Bug 13728

Summary: [G965] Some objects in Neverwinter Nights Linux version not displayed correctly
Product: Mesa
Reporter: Miroslav Machala <machala.m>
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: RESOLVED FIXED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: lowest
CC: apinheiro, cloessl+freedesktop, drmccoy, k00_fol, vasyl.demin
Version: git
Keywords: bisected, regression
Hardware: Other
OS: Linux (All)
Whiteboard: Workaround: INTEL_DEBUG=no16
i915 platform:
i915 features:
Attachments: Bug showing in Neverwinter Nights
Bug showing in TIS-100
Wireframe of the cupboard model in NWN
Bug showing on a chest in Neverwinter Nights

Description Miroslav Machala 2007-12-18 15:31:44 UTC
Overall NWN seems to be running OK; however, many objects, including characters, are rendered as a dithered, partially transparent mix of pixels. Affected objects include chests, tables, doors and all characters. Turning on the "environmental mapping on creatures" feature resolves the issue for characters, but the rest remains the same.
Comment 1 Eric Anholt 2007-12-18 15:52:04 UTC
I think we've got some glPolygonOffset issues, which have caused troubles for NWN on other drivers in the past.
Comment 2 Eric Anholt 2008-02-04 19:33:49 UTC
I fixed a glPolygonOffset bug while trying to fix this, and I don't think that's the issue any more.
Comment 3 Joel 2008-03-19 20:16:53 UTC
I can't help but notice that this happens almost* exclusively for objects that would be in a glowing mist when highlighted. Could this help to narrow the issue down?
(uneducated guess: shader related?)

*the exception being the targeting cross placed on the ground when walking using the mouse

Comment 4 Eric Anholt 2008-03-20 16:13:12 UTC
I'd noticed the same thing, but in scanning through the shaders used by nwn (running strings on the binary will get you a collection) I didn't notice any that looked special.
Comment 5 Joel 2008-03-23 11:30:51 UTC
I tried to modify those strings, but it didn't appear to change anything. Either the change is too subtle to be noticed by me, or they are not really used.

I was testing stuff in mesa/progs and came across 'mesa/progs/glsl/noise'
I'm not sure how it's supposed to be, but the same pattern can be seen on doors in neverwinter. It prints the following to stdout too:

Shader compiled OK
Shader compiled OK
Link success!
Uniform Scale location: 4
Uniform Bias location: 5
Uniform Slice location: 6
GL_RENDERER = Mesa DRI Intel(R) 965G 20061102
unsupport opcode 46 in fragment program
unsupport opcode 46 in fragment program
unsupport opcode 46 in fragment program
unsupport opcode 46 in fragment program
Comment 6 Gordon Jin 2008-03-26 19:40:40 UTC
(In reply to comment #5)
> I was testing stuff in mesa/progs and came across 'mesa/progs/glsl/noise'
> I'm not sure how it's supposed to be, but the same pattern can be seen on doors
> in neverwinter. It prints the following to stdout too:
> 
> Shader compiled OK
> Shader compiled OK
> Link success!
> Uniform Scale location: 4
> Uniform Bias location: 5
> Uniform Slice location: 6
> GL_RENDERER = Mesa DRI Intel(R) 965G 20061102
> unsupport opcode 46 in fragment program
> unsupport opcode 46 in fragment program
> unsupport opcode 46 in fragment program
> unsupport opcode 46 in fragment program
> 

This is reported at bug#15217
Comment 7 Joel 2008-11-08 21:18:42 UTC
I have tested this against Mesa 7.2 using both version 2.4.2 and 2.5.0(GEM) of xf86-video-intel. 

While there are still objects that are not displayed correctly, the "swarm of pixels" issue is no longer there as far as I have seen. I have checked some doors, chests and creatures that previously did not render well. They now look all right.

(The objects that do not display right are some heads that are not rendered at all, and the highlighted chat appears to be painted in some Schrödinger's color, but that is for another bug)
Comment 8 Joel 2008-11-21 08:38:03 UTC
Clarifying my last comment: Cannot reproduce with recent drivers.
(All the other stuff I mentioned belongs in another bug report)
Comment 9 Eric Anholt 2008-12-15 09:37:51 UTC
Original issue is fixed per comment #7.
Comment 10 Adam Jackson 2009-08-24 12:28:46 UTC
Mass version move, cvs -> git
Comment 11 Sven Hesse 2015-12-26 00:59:39 UTC
This bug seems to be back, at least on my Arch Linux laptop with extra/mesa-libgl 11.1.0-1 and extra/xf86-video-intel 1:2.99.917+519+g8229390-1.

I have the swarm of transparent pixels on nearly all objects, like chests, doors and tables in Neverwinter Nights.

Moreover, this bug also occurs on the programming game TIS-100, which uses Unity3D. The effect is transparent/black pixels all over the text characters, making them basically unreadable.
Comment 12 Sven Hesse 2015-12-26 01:13:06 UTC
Created attachment 120688 [details]
Bug showing in Neverwinter Nights
Comment 13 Sven Hesse 2015-12-26 01:14:38 UTC
Created attachment 120689 [details]
Bug showing in TIS-100
Comment 14 Alejandro Piñeiro (freenode IRC: apinheiro) 2016-02-16 19:31:34 UTC
(In reply to Sven Hesse from comment #11)
> This bug seems to be back, at least on my Arch Linux laptop with
> extra/mesa-libgl 11.1.0-1 and extra/xf86-video-intel
> 1:2.99.917+519+g8229390-1.

Taking into account that the original bug was created several years ago, when the hw specified was the norm, could you confirm that you are using the same hw (965GM)?
 
> I have the swarm of transparent pixels on nearly all objects, like chests,
> doors and tables in Neverwinter Nights.

FWIW, I tested Neverwinter Nights using both master and the 11-1 branch on a Haswell machine, and saw no artifacts in that case.
Comment 15 Sven Hesse 2016-02-16 20:05:57 UTC
> could you confirm that you are using the same hw (965GM)

Yes:

00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (secondary) (rev 0c)

The CPU is an Intel Core 2 Duo T7100, the laptop a Dell XPS M1330, model number PP25L.

Arch is now at mesa-libgl 11.1.2-1, and the issue is still present there.
Comment 16 Sven Hesse 2016-04-23 22:02:20 UTC
Okay, I managed to git bisect the issue.

The first bad commit is 797d606127c131a6ccff28150495d2b1f3f7e46
i965: Implement SIMD16 texturing on Gen4.
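
For readers unfamiliar with bisecting, the search behind the result above can be sketched as a binary search. This is a simplified illustration only: it assumes a linear history with a known-good oldest commit and a known-bad newest commit, whereas git bisect itself works on the full commit graph. The function name and the `is_bad` callback are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Return the first bad commit in a linear history.

    Assumptions (hypothetical helper, not part of git):
    - commits is ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - once a commit is bad, every later commit is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad]
```

Each step halves the range of candidate commits, so only about log2(N) builds and test runs of the game are needed for N commits.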
Comment 17 Sven Hesse 2016-04-24 19:06:54 UTC
Created attachment 123215 [details]
Wireframe of the cupboard model in NWN

Here's the wireframe of the cupboard model in NWN, the one that shows the transparent pixels ingame on i965.

Note that the transparent pixels seem to be near the polygon boundaries.
Comment 18 Sven Hesse 2016-05-15 14:16:52 UTC
Created attachment 123758 [details]
Bug showing on a chest in Neverwinter Nights

Added a screenshot of the bug showing on a chest placeable in Neverwinter Nights.

An API trace of the bug showing on that same chest is here: https://drmccoy.de/zeugs/nwmain.trace (58MB, so over the attachment file size limit, unfortunately).
Comment 19 Vasyl Demin 2016-10-31 20:03:49 UTC
Sven, great work!

I opened another bug report for this issue several years ago:
https://bugs.freedesktop.org/show_bug.cgi?id=59930

Feel free to close it as duplicate.

BTW, I can't even run the game on mesa 13.0.0rc2. On older versions (8.0.4-12.0.3) NWN worked but had this bug.
Comment 20 Sven Hesse 2016-10-31 20:41:37 UTC
> I opened another bug report for this issue several years ago:
> https://bugs.freedesktop.org/show_bug.cgi?id=59930

Oh, I hadn't seen that. My bad.

> BTW, I can't even run the game on mesa 13.0.0rc2.

Okay, I can *not* confirm that. On my Arch laptop with extra/mesa-libgl 13.0.0rc2-2 and multilib/lib32-mesa-libgl 13.0.0rc2-1, I can still run Neverwinter Nights just fine. The bug is also still there, though.
Comment 21 Matt Turner 2016-11-03 02:52:14 UTC
*** Bug 59930 has been marked as a duplicate of this bug. ***
Comment 22 Matt Turner 2016-11-03 02:54:01 UTC
(In reply to Sven Hesse from comment #16)
> Okay, I managed to git bisect the issue.
> 
> The first bad commit is 797d606127c131a6ccff28150495d2b1f3f7e46
> i965: Implement SIMD16 texturing on Gen4.

To confirm, everything seemed to be working well before that commit?
Comment 23 Sven Hesse 2016-11-03 02:59:57 UTC
> To confirm, everything seemed to be working well before that commit?

Yes, exactly. Everything seemed to be working well before that commit and the texture glitch appeared with that commit.
Comment 24 Elizabeth 2017-10-26 22:36:49 UTC
Hello Sven, this case is quite old. Is it still valid with the latest Mesa and gfx driver? Thank you.
Comment 25 Vasyl Demin 2017-10-27 13:04:35 UTC
The issue still persists with mesa 17.2.3 and xf86-video-intel 2.99.917+796+g04b4f3b7.
Comment 26 Tapani Pälli 2017-10-31 06:18:19 UTC
Not sure if anyone will work on this, but you could try setting the environment variable INTEL_DEBUG=no16 to work around this bug.
Comment 27 Vasyl Demin 2017-11-01 21:13:51 UTC
Thanks, Tapani! INTEL_DEBUG=no16 works, I can't see artifacts any more.
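
A minimal sketch of applying the workaround from a launcher script, for anyone who prefers not to export the variable globally. The helper names and the `./nwn` path are assumptions for illustration. Note that this overwrites any INTEL_DEBUG value already set rather than merging with it.

```python
import os
import subprocess


def env_with_no16(base_env=None):
    """Return a copy of the environment with INTEL_DEBUG=no16 set.

    The original mapping is left untouched. Any existing INTEL_DEBUG
    value is replaced, not merged.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["INTEL_DEBUG"] = "no16"
    return env


def run_with_workaround(command):
    """Run command (a list of arguments) with the workaround applied."""
    return subprocess.run(command, env=env_with_no16(), check=False)


# Hypothetical usage; adjust the path to wherever the game launcher lives:
# run_with_workaround(["./nwn"])
```

Setting the variable only for the game's process keeps other applications running with the driver's default settings.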
Comment 28 Kenneth Graunke 2018-01-11 08:23:52 UTC
Today I spent some time looking at the TIS-100 bug, because it's a lot simpler to work with and possibly related.  I discovered that it's a shader with a discard that isn't working in SIMD16 mode.

The discard handling itself doesn't appear to be the problem.  I can disable all assembly code that even touches the pixel mask, and it still has the issue.

The render target message length appears to be 15, which is near the limit.

Specifically, these WM_IZ features are in play:

- Discard (kill pixels)
- Depth testing
- Stencil testing and stencil writes
- No depth writes
- No computed depth

Changing any of these appears to fix the issue:

- Eliminating discards at the NIR level (removing the WM_IZ bits and WM_STATE bits for killpix)
- Disabling stencil testing (brw->stencil_enabled = false, disabling the WM_IZ bits for stencil)
- Disabling depth testing

I'm pretty sure we're staring right at the problem, but I'm not seeing it yet.
Comment 29 Kenneth Graunke 2018-08-11 00:57:53 UTC
Hi Sven, Vasyl,

TIS-100 and other Gen4-5 misrendering with SIMD16 should be fixed with the following commit on master.  Sorry this took so horribly long. :(

commit 08a5c395abdafd0d7556060596f78c238b4a989f
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Thu Aug 2 15:02:18 2018 -0700

    intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.
    
    When the SIMD16 Gen4-5 fragment shader payload contains source depth
    (g2-3), destination stencil (g4), and destination depth (g5-6), the
    single register of stencil makes the destination depth unaligned.
    
    We were generating this instruction in the RT write payload setup:
    
       mov(16)   m14<1>F   g5<8,8,1>F   { align1 compr };
    
    which is illegal, instructions with a source region spanning more than
    one register need to be aligned to even registers.  This is because the
    hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
    compressed instruction into two mov(8)'s.
    
    I believe this would cause the hardware to load g5 twice, replicating
    subspan 0-1's destination depth to subspan 2-3.  This showed up as 2x2
    artifact blocks in both TIS-100 and Reicast.
    
    Normally, we rely on the register allocator to even-align our virtual
    GRFs.  But we don't control the payload, so we need to lower SIMD widths
    to make it work.  To fix this, we teach lower_simd_width about the
    restriction, and then call it again after lower_load_payload (which is
    what generates the offending MOV).
    
    Fixes: 8aee87fe4cce0a883867df3546db0e0a36908086 (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
    Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
    Tested-by: Diego Viola <diego.viola@gmail.com>
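
The register arithmetic described in the commit message can be illustrated with a small model. This is a sketch based only on the description above, not Mesa or hardware code; the function names are hypothetical, and the payload layout is the one quoted in the message (source depth in g2-3, destination stencil in g4, destination depth in g5-6).

```python
# Payload entries as (name, number of GRFs), starting at g2 as in the commit message.
PAYLOAD = [("source depth", 2), ("destination stencil", 1), ("destination depth", 2)]


def payload_start_registers(payload, first_reg=2):
    """Map each payload entry to the register number where it starts."""
    starts = {}
    reg = first_reg
    for name, size in payload:
        starts[name] = reg
        reg += size
    return starts


def intended_halves(nr):
    """Registers the two mov(8) halves are meant to read: nr and nr + 1."""
    return (nr, nr + 1)


def hardware_halves(nr):
    """Registers read per the commit message: nr and (nr | 1)."""
    return (nr, nr | 1)


starts = payload_start_registers(PAYLOAD)
for name, size in PAYLOAD:
    if size == 2:
        nr = starts[name]
        print(name, "intended", intended_halves(nr), "hardware", hardware_halves(nr))
```

For the even-aligned source depth at g2, both calculations give (2, 3). For destination depth at g5, the intended pair is (5, 6) but (nr | 1) yields (5, 5), which matches the description of g5 being loaded twice.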
Comment 30 Sven Hesse 2018-08-11 11:03:06 UTC
Hi Kenneth,

I can confirm that this fixes the bug in both Neverwinter Nights and TIS-100 for me. Thank you so much for taking the time to look into this! :)
Comment 31 Vasyl Demin 2018-08-16 21:47:44 UTC
Hi, Kenneth,

I can confirm the fix too. Thank you very much for solving this long-standing bug!
