Bug 107212 - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics
Summary: Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted g...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Kenneth Graunke
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-07-12 22:40 UTC by Diego Viola
Modified: 2018-08-11 02:24 UTC (History)
1 user (show)

See Also:
i915 platform: G45
i915 features:


Attachments
Reicast with bad graphics on g45 (474.13 KB, image/png)
2018-07-12 22:40 UTC, Diego Viola
Details
lspci (1.40 KB, text/plain)
2018-07-12 22:41 UTC, Diego Viola
Details
glxinfo (18.57 KB, text/plain)
2018-07-16 17:31 UTC, Diego Viola
Details
Reicast BIOS artifacts 1 (16.55 KB, image/png)
2018-07-17 01:59 UTC, Diego Viola
Details
Reicast BIOS artifacts 2 (48.71 KB, image/png)
2018-07-17 01:59 UTC, Diego Viola
Details
git bisect log (2.50 KB, text/plain)
2018-07-17 23:57 UTC, Diego Viola
Details
apitrace (87.82 MB, application/x-xz)
2018-07-19 23:11 UTC, Diego Viola
Details
INTEL_DEBUG=fs apitrace output (112.99 KB, text/plain)
2018-07-20 00:20 UTC, Diego Viola
Details
Game rendering without gl_FragDepth and with the vpos.z hack (283.51 KB, image/png)
2018-07-22 18:11 UTC, Diego Viola
Details
Rendering without gl_FragDepth and with the vpos.z hack (437.87 KB, image/png)
2018-07-22 18:12 UTC, Diego Viola
Details
git bisect between Mesa 10.0 and 11.0.0 (2.81 KB, text/plain)
2018-07-28 08:13 UTC, Diego Viola
Details
bisect with the correct entry (2.83 KB, text/plain)
2018-07-28 08:49 UTC, Diego Viola
Details
RA + INTEL_DEBUG=no16 MESA_GLSL_CACHE_DISABLE=1 MESA_GLSL=dump (350.34 KB, text/plain)
2018-08-02 17:41 UTC, Diego Viola
Details

Description Diego Viola 2018-07-12 22:40:51 UTC
Created attachment 140615 [details]
Reicast with bad graphics on g45

Running RetroArch with the Reicast core (Sega Dreamcast emulator) ends up displaying corrupted graphics (see the attached screenshot).

I am running this emulator with its OpenGL 2 context, it also has a GL3 one. The reason I'm using GL2 over 3 is that my GPU does not support GL3.

My GPU:

[diego@dualcore ~]$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
[diego@dualcore ~]$ 

I'm on Arch Linux, both RA and Reicast are compiled from git.

Thanks.
Comment 1 Diego Viola 2018-07-12 22:41:43 UTC
Created attachment 140616 [details]
lspci
Comment 2 Diego Viola 2018-07-12 22:44:21 UTC
I've tried reproducing this with the same setup on a T450 (broadwell graphics) and I cannot reproduce it there.

I've also used the same GL2 context on the T450.
Comment 3 Diego Viola 2018-07-12 23:17:45 UTC
Some details about my system:

xorg-server 1.20.0-9
mesa 18.1.3-1
linux 4.17.5-1

Using modesetting drivers.
Comment 4 Diego Viola 2018-07-14 00:22:53 UTC
I've tried compiling the latest mesa (current git master, commit 4ec8b39fcd8086ff73334dcb31491d907ac08e85) to see if the problem is still there, and I confirm I can still reproduce it.
Comment 5 Diego Viola 2018-07-16 16:21:37 UTC
Where I originally reported the problem: https://github.com/libretro/reicast-emulator/issues/153
Comment 6 Matt Turner 2018-07-16 16:59:42 UTC
Does the environment variable INTEL_DEBUG=no16 fix the corruption?
Comment 7 Diego Viola 2018-07-16 17:01:37 UTC
(In reply to Matt Turner from comment #6)
> Does the environment variable INTEL_DEBUG=no16 fix the corruption?

No, it doesn't.
Comment 8 Diego Viola 2018-07-16 17:31:19 UTC
Created attachment 140651 [details]
glxinfo
Comment 9 Diego Viola 2018-07-16 20:52:07 UTC
Please let me know if I can provide more information.
Comment 10 Diego Viola 2018-07-17 01:59:01 UTC
Created attachment 140655 [details]
Reicast BIOS artifacts 1

Please note there is also artifacts/corruption at the BIOS level, it's not just in the game.
Comment 11 Diego Viola 2018-07-17 01:59:20 UTC
Created attachment 140656 [details]
Reicast BIOS artifacts 2
Comment 12 Diego Viola 2018-07-17 19:09:29 UTC
I've compiled mesa-17.0.0 from git and I get good graphics (no corruption) when starting the game with INTEL_DEBUG=no16.

Without this env var I still get graphics corruption.

I suppose I can bisect from now.
Comment 13 Diego Viola 2018-07-17 19:56:43 UTC
The mesa-18.0.0 tag is also good with INTEL_DEBUG=no16.
Comment 14 Diego Viola 2018-07-17 20:20:05 UTC
mesa-18.1.0 is also good with INTEL_DEBUG=no16.
Comment 15 Diego Viola 2018-07-17 21:43:06 UTC
mesa-18.1.4 is good with INTEL_DEBUG=no16.
Comment 16 Diego Viola 2018-07-17 22:07:08 UTC
With the latest master the graphics look alright, the Dreamcast BIOS still look a bit distorted, but the in-game graphics look fine.
Comment 17 Diego Viola 2018-07-17 22:38:00 UTC
(In reply to Diego Viola from comment #7)
> (In reply to Matt Turner from comment #6)
> > Does the environment variable INTEL_DEBUG=no16 fix the corruption?
> 
> No, it doesn't.

Actually, I was wrong about that, it does help but there are a few revisions which the environment variable helps but there are others which it doesn't seem to make any effect.
Comment 18 Diego Viola 2018-07-17 23:55:58 UTC
[diego@dualcore mesa]$ git bisect good
bb5449cfeee1f6ec151343e673f29a3f69b750b2 is the first bad commit
commit bb5449cfeee1f6ec151343e673f29a3f69b750b2
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Tue Jul 17 14:51:16 2018 -0400

    r600: fix warnings when unref'ing pool->bo

:040000 040000 a0c76070724c635f14cfb69dd2dea48c0c6b45e0 128be7a40b56a69b39c9f259b3ed8c62f1997bdb M	src
[diego@dualcore mesa]$
Comment 19 Diego Viola 2018-07-17 23:57:03 UTC
Created attachment 140679 [details]
git bisect log
Comment 20 Diego Viola 2018-07-18 00:24:48 UTC
Bad bisect, I'll do it again.
Comment 21 Diego Viola 2018-07-18 01:31:22 UTC
For some reason I'm able to reproduce the graphics corruption with the archlinux mesa package (mesa 18.1.4-1), but when I compile the mesa-18.1.4 tag from git, I'm not able to reproduce the issue.
Comment 22 Diego Viola 2018-07-18 02:44:10 UTC
(In reply to Matt Turner from comment #6)
> Does the environment variable INTEL_DEBUG=no16 fix the corruption?

With the latest mesa code from master it works.
Comment 23 Diego Viola 2018-07-18 02:58:23 UTC
For some reason, current HEAD is broken for me again.
Comment 24 Diego Viola 2018-07-18 21:23:52 UTC
Any ideas about what could be the problem? I've ran out of ideas.
Comment 25 Diego Viola 2018-07-19 23:11:16 UTC
Created attachment 140720 [details]
apitrace

This apitrace helps reproducing the problem.

Please note that replaying this apitrace requires utilizing apitrace from git because of a bug that was fixed recently.

https://github.com/apitrace/apitrace/issues/537
Comment 26 Diego Viola 2018-07-20 00:20:55 UTC
Created attachment 140722 [details]
INTEL_DEBUG=fs apitrace output
Comment 27 Diego Viola 2018-07-22 04:56:01 UTC
Hi, after some investigation we have concluded that removing these two gl_FragDepth lines:

https://github.com/libretro/reicast-emulator/blob/2d411a166d8fa4a4cc5f0cb23c3f60537774333d/core/rend/gles/gles.cpp#L271

and:

https://github.com/libretro/reicast-emulator/blob/2d411a166d8fa4a4cc5f0cb23c3f60537774333d/core/rend/gles/gles.cpp#L291

And replacing this line:

https://github.com/libretro/reicast-emulator/blob/2d411a166d8fa4a4cc5f0cb23c3f60537774333d/core/rend/gles/gles.cpp#L107

    vpos.z = vpos.w; \n"

with:

    vpos.z = (1 - 0.002*vpos.w) * vpos.w; \n"

Gives me proper in-game graphics.
Comment 28 Diego Viola 2018-07-22 18:11:46 UTC
Created attachment 140771 [details]
Game rendering without gl_FragDepth and with the vpos.z hack

"Gives me proper in-game graphics."

Well, not quite. The game still has some artifacts like the sky being rendered as black.

But at least some of the corruption is gone.
Comment 29 Diego Viola 2018-07-22 18:12:52 UTC
Created attachment 140772 [details]
Rendering without gl_FragDepth and with the vpos.z hack
Comment 30 Diego Viola 2018-07-28 06:39:17 UTC
I tried almost all the mesa versions, from the current git (HEAD), to the current version: 18.1.4, mesa-17.0.0, mesa-13.0.0, mesa-12.0.0, mesa-11.0.0. And found that all of them are broken with gl_FragDepth.

The only one that is giving me good graphics is mesa-10.0. So I will bisect soon.

mesa-9.0 I couldn't compile anymore due to compilation errors, even after trying to downgrade GCC to gcc54.
Comment 31 Diego Viola 2018-07-28 08:09:57 UTC
[diego@dualcore mesa]$ git bisect good
797d606127c131a6ccff28150495d2b1f3f7e46e is the first bad commit
commit 797d606127c131a6ccff28150495d2b1f3f7e46e
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Feb 20 15:11:49 2015 -0800

    i965: Implement SIMD16 texturing on Gen4.
    
    This allows SIMD16 mode to work for a lot more programs.  Texturing is
    also more efficient in SIMD16 mode than SIMD8.  Several messages don't
    actually exist in SIMD8 mode, so we did SIMD16 messages and threw away
    half of the data.  Now we compute real data in both halves.
    
    Also, the SIMD16 "sample" message doesn't require all three coordinate
    components to exist (like the SIMD8 one), so we can shorten the message
    lengths, cutting register usage a bit.
    
    I chose to implement the visitor functionality in a separate function,
    since mixing true SIMD16 with SIMD8 code that uses SIMD16 fallbacks
    seemed like a mess.  The new code bails on a few cases where we'd
    have to do two SIMD8 messages - we just fall back to SIMD8 for now.
    
    Improves performance in "Shadowrun: Dragonfall - Director's Cut" by
    about 20% on GM45 (measured with LIBGL_SHOW_FPS=1 while standing around
    in the first mission).
    
    v2: Add ir_txf to the has_lod case (caught by Jordan Justen).
    
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

:040000 040000 45a6b6fb2fd15aab6b877277ecee33b41cad2ef4 325dfafa624c61855d7dc9e7ed94e540e991b28e M	src
[diego@dualcore mesa]$
Comment 32 Diego Viola 2018-07-28 08:13:17 UTC
Created attachment 140863 [details]
git bisect between Mesa 10.0 and 11.0.0

The skipped revisions were skipped because of compilation time errors, etc.

No INTEL_DEBUG=no16 was used when testing/bisecting.
Comment 33 Diego Viola 2018-07-28 08:30:51 UTC
I can confirm 797d606127c131a6ccff28150495d2b1f3f7e46e is bad.
I do still see *some* minor corruption with 8aee87fe4cce0a883867df3546db0e0a36908086
This is a good commit: 108b92b1e9f645e9d2ff33b24648f5d089cb89c9
Comment 34 Diego Viola 2018-07-28 08:31:51 UTC
So the problem actually started with: 8aee87fe4cce0a883867df3546db0e0a36908086 and got worse with: 797d606127c131a6ccff28150495d2b1f3f7e46e
Comment 35 Diego Viola 2018-07-28 08:33:31 UTC
# good: [8aee87fe4cce0a883867df3546db0e0a36908086] i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.

I should have marked this as bad, that was a mistake.
Comment 36 Diego Viola 2018-07-28 08:48:42 UTC
[diego@dualcore mesa]$ git bisect bad
8aee87fe4cce0a883867df3546db0e0a36908086 is the first bad commit
commit 8aee87fe4cce0a883867df3546db0e0a36908086
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Feb 20 14:09:17 2015 -0800

    i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.
    
    Gen5+ systems allow you to specify multiple shader programs - both SIMD8
    and SIMD16 - and the hardware will automatically dispatch to the most
    appropriate one, given the number of subspans to be processed.
    
    However, that is not the case on Gen4.  Instead, you program a single
    shader.  If you enable multiple dispatch modes (SIMD8 and SIMD16), the
    shader is supposed to contain a series of jump instructions at the
    beginning.  The hardware will launch the shader at a small offset,
    hitting one of the jumps.
    
    We've always thought that sounds like a pain, and weren't clear how it
    affected performance - is it worth having multiple shader types?  So,
    we never bothered with SIMD16 until now.
    
    This patch takes a simpler approach: try and compile a SIMD16 shader.
    If possible, set the no_8 flag, telling the hardware to just use the
    SIMD16 variant all the time.
    
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

:040000 040000 a0bac15a1f0b79b629c912cdcee2265f8a968d81 45a6b6fb2fd15aab6b877277ecee33b41cad2ef4 M	src
[diego@dualcore mesa]$
Comment 37 Diego Viola 2018-07-28 08:49:18 UTC
Created attachment 140864 [details]
bisect with the correct entry
Comment 38 Diego Viola 2018-08-01 19:59:34 UTC
The workaround INTEL_DEBUG=no16 and MESA_GLSL_CACHE_DISABLE=1 works fine.
Comment 39 Diego Viola 2018-08-02 17:41:10 UTC
Created attachment 140940 [details]
RA + INTEL_DEBUG=no16 MESA_GLSL_CACHE_DISABLE=1 MESA_GLSL=dump
Comment 40 Diego Viola 2018-08-10 20:01:18 UTC
Fixed with the latest mesa git master.

https://gitlab.freedesktop.org/mesa/mesa/commit/08a5c395abdafd0d7556060596f78c238b4a989f

Kenneth, thank you so much for your help, I appreciate it a lot.
Comment 41 Diego Viola 2018-08-11 02:16:30 UTC
commit 08a5c395abdafd0d7556060596f78c238b4a989f
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Thu Aug 2 15:02:18 2018 -0700

    intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.
    
    When the SIMD16 Gen4-5 fragment shader payload contains source depth
    (g2-3), destination stencil (g4), and destination depth (g5-6), the
    single register of stencil makes the destination depth unaligned.
    
    We were generating this instruction in the RT write payload setup:
    
       mov(16)   m14<1>F   g5<8,8,1>F   { align1 compr };
    
    which is illegal, instructions with a source region spanning more than
    one register need to be aligned to even registers.  This is because the
    hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
    compressed instruction into two mov(8)'s.
    
    I believe this would cause the hardware to load g5 twice, replicating
    subspan 0-1's destination depth to subspan 2-3.  This showed up as 2x2
    artifact blocks in both TIS-100 and Reicast.
    
    Normally, we rely on the register allocator to even-align our virtual
    GRFs.  But we don't control the payload, so we need to lower SIMD widths
    to make it work.  To fix this, we teach lower_simd_width about the
    restriction, and then call it again after lower_load_payload (which is
    what generates the offending MOV).
    
    Fixes: 8aee87fe4cce0a883867df3546db0e0a36908086 (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
    Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
    Tested-by: Diego Viola <diego.viola@gmail.com>


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.