Bug 104297 - [i965] Downward causes GPU hangs and misrendering on Haswell
Summary: [i965] Downward causes GPU hangs and misrendering on Haswell
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: vadym
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-16 23:18 UTC by Darius Spitznagel
Modified: 2019-09-25 19:06 UTC
CC List: 5 users

See Also:
i915 platform:
i915 features:


Attachments
DRM GPU crash dump (1.58 MB, text/plain)
2017-12-16 23:22 UTC, Darius Spitznagel
Broadwell replay log (144.26 KB, text/plain)
2017-12-19 16:39 UTC, Darius Spitznagel
Hasell replay log (137.94 KB, text/plain)
2017-12-19 16:39 UTC, Darius Spitznagel
GPU hang logs (80.36 KB, application/gzip)
2018-01-26 16:11 UTC, vadym
Downward apitrace replay log with INTEL_DEBUG=check_oob (9.30 MB, text/plain)
2018-01-31 10:55 UTC, Darius Spitznagel
Shader test to reporduce TCS vec4 issue (14.93 KB, text/plain)
2019-01-08 14:22 UTC, Danylo

Description Darius Spitznagel 2017-12-16 23:18:44 UTC
Hello Intel devs,

This game behaves very strangely.
It always starts fine and reaches the main menu.

But when you start to play, it gets weird.
Sometimes it causes GPU hangs and then crashes...

[  690.105553] [drm] GPU HANG: ecode 7:0:0x87d57d10, in Downward [2633], reason: Hang on rcs0, action: reset
[  690.105557] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  690.105558] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  690.105558] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  690.105559] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  690.105560] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  690.105606] i915 0000:00:02.0: Resetting chip after gpu hang
[  697.788068] i915 0000:00:02.0: Resetting chip after gpu hang
[  705.788069] i915 0000:00:02.0: Resetting chip after gpu hang
[  713.788064] i915 0000:00:02.0: Resetting chip after gpu hang

Sometimes it does not hang the GPU; you get into the game and see rendering issues instead.
The misrendering is different every time: sometimes heavy, sometimes less so.

Luckily, this game has a free demo on Steam, so you can try it yourself.
I managed to capture an apitrace after a few tries, but I could not see any relevant errors in it and deleted it. I can capture it again if needed.

Setting allow_glsl_cross_stage_interpolation_mismatch=true does not seem to help with this UE4 game.
Comment 1 Darius Spitznagel 2017-12-16 23:22:03 UTC
Created attachment 136233 [details]
DRM GPU crash dump
Comment 2 Darius Spitznagel 2017-12-16 23:29:07 UTC
I have tested this game with Mesa 17.2.7, 17.3.0 and master 35c3cbad3c30ad3d40a6811dd6ca2286e013bfc5.

None of them worked, but master had fewer GPU hangs.
Comment 3 Darius Spitznagel 2017-12-17 21:37:10 UTC
I have to correct myself.

Now I see errors in the apitrace replay.
They do not show up when I run "apitrace replay TRACE_FILE", but they do when I run "apitrace replay --sb TRACE_FILE".

Here is a snippet...
<<<<
118858: message: shader compiler issue 314: FS SIMD8 shader: 289 inst, 0 loops, 1531 cycles, 0:0 spills:fills, Promoted 9 constants, compacted 4624 to 3360 bytes.
118858: message: shader compiler issue 315: FS SIMD16 shader: 290 inst, 0 loops, 1848 cycles, 0:0 spills:fills, Promoted 9 constants, compacted 4640 to 3376 bytes.
118858: message: shader compiler issue 316: VS vec4 shader: 1095 inst, 0 loops, 9914 cycles, 0:0 spills:fills, compacted 17520 to 14256 bytes.
119938: message: major api error 96: GL_INVALID_OPERATION in glDrawBuffer(invalid buffer GL_BACK)
119938 @2 glDrawBuffer(mode = GL_BACK)
119938: warning: glGetError(glDrawBuffer) = GL_INVALID_OPERATION
122666: message: major api error 96: GL_INVALID_OPERATION in glDrawBuffer(invalid buffer GL_BACK)
122666 @2 glDrawBuffer(mode = GL_BACK)
122666: warning: glGetError(glDrawBuffer) = GL_INVALID_OPERATION
125513: message: major api error 96: GL_INVALID_OPERATION in glDrawBuffer(invalid buffer GL_BACK)
125513 @2 glDrawBuffer(mode = GL_BACK)
125513: warning: glGetError(glDrawBuffer) = GL_INVALID_OPERATION
128948: message: major api error 96: GL_INVALID_OPERATION in glDrawBuffer(invalid buffer GL_BACK)
128948 @2 glDrawBuffer(mode = GL_BACK)
128948: warning: glGetError(glDrawBuffer) = GL_INVALID_OPERATION
131605: message: major api error 96: GL_INVALID_OPERATION in glDrawBuffer(invalid buffer GL_BACK)
131605 @2 glDrawBuffer(mode = GL_BACK)
131605: warning: glGetError(glDrawBuffer) = GL_INVALID_OPERATION
134040: message: shader compiler issue 317: FS SIMD8 shader: 27 inst, 0 loops, 344 cycles, 0:0 spills:fills, Promoted 0 constants, compacted 432 to 304 bytes.
134040: message: shader compiler issue 318: FS SIMD16 shader: 28 inst, 0 loops, 392 cycles, 0:0 spills:fills, Promoted 0 constants, compacted 448 to 320 bytes.
>>>>>
Comment 4 Darius Spitznagel 2017-12-18 11:23:22 UTC
Today I tested the apitrace on a weak Broadwell...

OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) Broadwell GT1 
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.3.0 (git-49a612d158)
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.3.0 (git-49a612d158)
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 17.3.0 (git-49a612d158)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
OpenGL ES profile extensions:

There I had the same errors on replay, but no GPU hangs, and the trace was rendered correctly.

So it's definitely a Haswell problem.

I tested the trace with kernel 4.9.68 on both systems.
Comment 5 Darius Spitznagel 2017-12-18 22:08:12 UTC
When the replay crashes before it finishes, I always get...

i965: Failed to submit batchbuffer: Invalid argument

...as the last output.

This leads me to "src/mesa/drivers/dri/i965/intel_batchbuffer.c" and to submit_batch(), which has a comment about...

      /* The requirement for using I915_EXEC_NO_RELOC are:
       *
       *   The addresses written in the objects must match the corresponding
       *   reloc.gtt_offset which in turn must match the corresponding
       *   execobject.offset.
       *
       *   Any render targets written to in the batch must be flagged with
       *   EXEC_OBJECT_WRITE.
       *
       *   To avoid stalling, execobject.offset should match the current
       *   address of that object within the active context.
       */
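
The first two requirements in this comment can be read as a consistency check over the validation list. The following is only a toy Python model, not Mesa or kernel code: the `ExecObject` and `Reloc` classes, their field names, and the `EXEC_OBJECT_WRITE` bit value are assumptions made for illustration, loosely named after the terms used in the comment. The third requirement is about avoiding stalls and is not modelled.

```python
from dataclasses import dataclass

# Illustrative flag bit only; the real definition lives in the kernel's i915 uAPI header.
EXEC_OBJECT_WRITE = 1 << 2


@dataclass
class ExecObject:
    """Hypothetical stand-in for one entry of the validation list."""
    handle: int
    offset: int      # execobject.offset
    flags: int = 0


@dataclass
class Reloc:
    """Hypothetical stand-in for one relocation entry."""
    target_handle: int
    written_address: int   # address userspace wrote into the batch
    gtt_offset: int        # reloc.gtt_offset, as named in the comment
    writes_render_target: bool = False


def no_reloc_violations(objects, relocs):
    """Return a list of human-readable violations of the first two rules."""
    by_handle = {obj.handle: obj for obj in objects}
    problems = []
    for reloc in relocs:
        obj = by_handle[reloc.target_handle]
        if reloc.written_address != reloc.gtt_offset:
            problems.append(f"handle {obj.handle}: written address != reloc.gtt_offset")
        if reloc.gtt_offset != obj.offset:
            problems.append(f"handle {obj.handle}: reloc.gtt_offset != execobject.offset")
        if reloc.writes_render_target and not (obj.flags & EXEC_OBJECT_WRITE):
            problems.append(f"handle {obj.handle}: render target lacks EXEC_OBJECT_WRITE")
    return problems
```

An empty result means the modelled batch satisfies both rules; any entry points at the kind of mismatch the comment warns about.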

avoid stalling = GPU hang?
match current address...within active context = Invalid argument?!
Comment 6 Darius Spitznagel 2017-12-19 16:38:10 UTC
I think I have found the problem.

On the Haswell system the statebuffer is NOT growing.

I will attach both replay logs.
Comment 7 Darius Spitznagel 2017-12-19 16:39:02 UTC
Created attachment 136285 [details]
Broadwell replay log
Comment 8 Darius Spitznagel 2017-12-19 16:39:42 UTC
Created attachment 136286 [details]
Hasell replay log
Comment 9 Darius Spitznagel 2017-12-19 16:44:09 UTC
(In reply to Darius Spitznagel from comment #8)
> Created attachment 136286 [details]
> Hasell replay log

Oops, I meant Haswell replay log.
Comment 10 Darius Spitznagel 2017-12-19 16:50:32 UTC
If I'm not mistaken, Kenneth is working on growing the batch and state buffers.

Maybe he can shed some light on this?
Comment 11 Kenneth Graunke 2017-12-20 08:02:31 UTC
Interesting - nice find, Darius!  I did have one remaining patch that never landed but may fix GPU hangs related to statebuffer growing:

https://cgit.freedesktop.org/~kwg/mesa/commit/?h=growbo&id=3bfda7ae1935105940c7f3d22fd77b8b3b6e65cf

Could you try that patch, or the 'growbo' branch of my tree?
Comment 12 Darius Spitznagel 2017-12-20 12:39:02 UTC
OK, I tried both, and neither worked.
Still GPU hangs and "i965: Failed to submit batchbuffer: Invalid argument".

One thing I did not mention: replaying the trace sometimes also corrupts the desktop (MATE in my case) or even crashes the session (LightDM in my case).
A fence problem? I use libdrm 2.4.88, FWIW.
Comment 13 Darius Spitznagel 2017-12-20 13:50:35 UTC
Let me know if you need a download link for my trace file, in case you don't want to install the demo from Steam.
Comment 14 Darius Spitznagel 2017-12-21 10:47:11 UTC
Good news, Ken:

As of today, the GPU hangs are much less severe.
I think it's because of Brian Paul's recent GLSL patches, which landed in Mesa master.

In particular, I think it's this one...
glsl: disable vec3 packing/splitting in tfb separate mode (6e5b882339e9128348f0e7b828230f07338fce55)

The GPU hang error changed from...
[  690.105553] [drm] GPU HANG: ecode 7:0:0x87d57d10, in Downward [2633], reason: Hang on rcs0, action: reset

to...
[ 1415.920704] [drm] GPU HANG: ecode 7:0:0x85dffffc, in glretrace [1052], reason: Hang on render ring, action: reset
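
To compare hang signatures like the two above across many dmesg captures, the fields can be pulled out with a regular expression. This is a minimal sketch that assumes the line layout shown in this report; the `HANG_RE` pattern and the `parse_hang` helper are hypothetical names, and other kernel versions may format the message differently.

```python
import re

# Pattern derived from the two dmesg lines quoted in this comment (an assumption,
# not a format guaranteed by the kernel).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_hang(line):
    """Return the hang fields as a dict, or None if the line is not a hang report."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


old = parse_hang("[  690.105553] [drm] GPU HANG: ecode 7:0:0x87d57d10, in Downward [2633], "
                 "reason: Hang on rcs0, action: reset")
new = parse_hang("[ 1415.920704] [drm] GPU HANG: ecode 7:0:0x85dffffc, in glretrace [1052], "
                 "reason: Hang on render ring, action: reset")
```

Comparing `old["reason"]` with `new["reason"]` makes the change from "Hang on rcs0" to "Hang on render ring" explicit.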

The "Failed to submit batchbuffer: Invalid argument" error still happens sometimes, the misrendering is persistent, and there is still no statebuffer growing in the replay output.

I am currently replaying the trace in an endless loop to check whether the replay kills my desktop session, which it did a couple of times yesterday.

I will run this test for an hour and then also apply your patch from Comment 11 and report back.
Comment 15 Darius Spitznagel 2017-12-21 12:42:51 UTC
I'm back.
OK, desktop corruption still occurs but NO session crash!

In the meantime I checked your repos and saw your shiny new "i965: Reduce GL_MAX_*_SAMPLES on Gen7-7.5" patch.
Mesa is compiling with it right now...
Comment 16 Darius Spitznagel 2017-12-21 13:27:26 UTC
No VISIBLE changes with "i965: Reduce GL_MAX_*_SAMPLES on Gen7-7.5".

Summary so far:
Mesa master 9f54675dbe01518ec4b71e8fc9b4f6e777b27185

+ no desktop crash anymore (recent changes to GLSL)
- sometimes graphical corruption on the desktop during replay
- sometimes GPU hangs: changed from "Hang on rcs0" to "Hang on render ring"
- always misrendering in-game
- sometimes replay crash > i965: Failed to submit batchbuffer: Invalid argument
- statebuffer is not growing

Specs:
Xorg 1.19.5
Libdrm 2.4.88
Kernel 4.9.68
DDX modesetting
iGPU Haswell
Comment 17 Darius Spitznagel 2017-12-28 08:47:44 UTC
How can I dump the validation list with your growbo tree, Kenneth?
Comment 18 Darius Spitznagel 2018-01-11 10:12:39 UTC
Hello Kenneth,

I have uncommented the "dump_validation_list" block and changed "UNUSED static void" to "static void", but I don't see any "Validation list" output in the replay.

Please help.
Comment 19 vadym 2018-01-26 16:10:48 UTC
The issue is reproducible on my laptop:

OS: Ubuntu 16.04 LTS 64-bit
CPU: Intel® Core™ i5-4310M CPU @ 2.70GHz × 4
GPU: Intel® Haswell Mobile
mesa: OpenGL ES 3.2 Mesa 17.0.7 (17.3.1)
kernel: 4.13.0-26-generic

The GPU hang is 100% reproducible when a new game is started.
Tried with Mesa 17.0.7 and 17.3.1.

With the newest development Mesa 18.1.0-devel (git-e28233a527), the GPU hang is no longer reproducible. However, there are still some rendering issues (I'll provide attachments for this).
Comment 20 vadym 2018-01-26 16:11:56 UTC
Created attachment 136977 [details]
GPU hang logs
Comment 21 Darius Spitznagel 2018-01-27 15:21:02 UTC
> With the newest development mesa 18.1.0-devel (git-e28233a527) GPU hang in no longer reproducible. But there are some rendering issues presented (I'll provide attachments for this).

Are you sure? How often did you start Downward?
As mentioned in Comment 14, the GPU hangs have occurred less often since 21 December 2017.

I will try this on Monday, when I have access to a Haswell system; right now I only have a Broadwell one.
Comment 22 Darius Spitznagel 2018-01-29 08:38:26 UTC
With today's master I still get GPU hangs...

[Mo Jan 29 09:18:59 2018] [drm] GPU HANG: ecode 7:0:0x85dffed2, in glretrace [32025], reason: Hang on render ring, action: reset
[Mo Jan 29 09:18:59 2018] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[Mo Jan 29 09:18:59 2018] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[Mo Jan 29 09:18:59 2018] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[Mo Jan 29 09:18:59 2018] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[Mo Jan 29 09:18:59 2018] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[Mo Jan 29 09:18:59 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:19:09 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:19:19 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:19:28 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:19:38 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:19:46 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:19:54 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:20:04 2018] drm/i915: Resetting chip after gpu hang
[Mo Jan 29 09:28:50 2018] drm/i915: Resetting chip after gpu hang

This test was made with an endless loop replaying my apitrace of Downward.
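
A replay loop of this kind could look roughly like the sketch below. It assumes that a crashed replay exits with a nonzero status, which may not hold for every failure mode (a GPU hang that the kernel recovers from might still let the replay finish). The `replay_until_failure` helper is a hypothetical name; only the `apitrace replay` invocation is taken from this report.

```python
import subprocess
import sys


def replay_until_failure(cmd, max_runs):
    """Run cmd up to max_runs times.

    Returns (run_number, returncode, last_stderr_line) for the first run that
    exits with a nonzero status, or None if every run succeeded.
    """
    for run in range(1, max_runs + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True
        )
        if result.returncode != 0:
            lines = result.stderr.strip().splitlines()
            return run, result.returncode, (lines[-1] if lines else "")
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python replay_loop.py Downward.trace
    print(replay_until_failure(["apitrace", "replay", sys.argv[1]], max_runs=100))
```

The last stderr line is kept because the "i965: Failed to submit batchbuffer" message is reported in this thread as the final output of a crashed replay.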
Comment 23 vadym 2018-01-29 11:53:01 UTC
Hi Darius,

You are right! The issue is still reproducible. It crashes when a new game is started. On my laptop this happens in ~25% of cases.

I also noticed that steamclient.so receives a segmentation fault signal. I'm not sure whether it is the root cause of the issue, but the log is below:

i965: Failed to submit batchbuffer: Input/output error

Thread 10 "CFileWriterThre" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeed1d700 (LWP 29056)]
0x00007ffff04776ec in ?? () from /home/vadym/.local/share/Steam/linux64/steamclient.so
(gdb) bt
#0  0x00007ffff04776ec in ?? () from /home/vadym/.local/share/Steam/linux64/steamclient.so
#1  0x00007ffff0477e75 in ?? () from /home/vadym/.local/share/Steam/linux64/steamclient.so
#2  0x00007ffff0479fd2 in ?? () from /home/vadym/.local/share/Steam/linux64/steamclient.so
#3  0x00007ffff794a6ba in start_thread (arg=0x7fffeed1d700) at pthread_create.c:333
#4  0x00007ffff6b0141d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb)
Comment 24 Mark Janes 2018-01-29 16:15:44 UTC
Could you share the apitrace with the i965 mesa team so we can reproduce this?
Comment 25 Darius Spitznagel 2018-01-29 17:02:25 UTC
Here you can get it...
http://www.goodbytez.de/mesa/Downward.trace.tar.bz2

FWIW, the trace was captured in windowed mode with all effects set to low/off to keep the trace small.
Comment 26 vadym 2018-01-30 17:21:53 UTC
It looks like the issue was fixed by the following commit:

i965: Emit PIPE_CONTROL with ISP bit on older platforms.

Darius,

Can you please check it on your side?
Comment 27 Darius Spitznagel 2018-01-31 08:24:32 UTC
Sadly, no. Almost nothing changed.

Still...
- sometimes graphical corruption on the desktop during replay
- sometimes GPU hangs
- always misrendering in the gameplay scene
- sometimes replay crash > "i965: Failed to submit batchbuffer: Invalid argument" changed to "i965: Failed to submit batchbuffer: Input/output error"
- statebuffer is not growing as it does on Broadwell
Comment 28 Darius Spitznagel 2018-01-31 10:53:15 UTC
Maybe this helps.

I've applied Kevin's "GEM BO padding to find OOB buffer writes" patch series to master ef272b161e05e8216f2d1f4df5023f3aed0ae4fa from here...
https://patchwork.freedesktop.org/series/35080/

Replayed my trace file in a loop for >30 minutes with...

INTEL_DEBUG=check_oob apitrace replay Downward.trace >> downward_debug.log 2>&1

The patch detected lots of "out-of-bounds write from brw_bo".
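
To see which reports repeat and how often, the log can also be tallied in a few lines of Python. This sketch assumes, as the grep in this comment does, that every report line contains the word "Detected"; the exact line format of the patch series is not reproduced here, and `count_detected` is a hypothetical helper name.

```python
from collections import Counter


def count_detected(lines):
    """Count identical log lines that contain the word 'Detected'."""
    return Counter(line.strip() for line in lines if "Detected" in line)


# Assumed usage with the log file produced by the command above:
#   with open("downward_debug.log", errors="replace") as log:
#       for text, hits in count_detected(log).most_common(10):
#           print(hits, text)
```

Grouping identical lines helps tell one buffer being overwritten repeatedly apart from many different buffers being hit once each.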

"cat downward_debug.log | grep Detected" shows them all.
Comment 29 Darius Spitznagel 2018-01-31 10:55:30 UTC
Created attachment 137084 [details]
Downward apitrace replay log with INTEL_DEBUG=check_oob
Comment 30 vadym 2018-02-15 10:39:51 UTC
The issue is still reproducible on recent Mesa 18.1.0-devel (git-c6694793e1) with one of the latest kernels, 4.15.3-041503-generic.
Comment 31 Mark Janes 2018-02-16 00:26:15 UTC
vadym: can you please bisect this in mesa?
Comment 32 vadym 2018-02-16 12:01:48 UTC
Hi Mark,

I'm not sure we can easily bisect to a bad commit. When the OpenGL version is lower than 4.3, Downward complains and does not start at all. Below is the earliest commit in Mesa that I could use for testing:

commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Jan 13 22:53:34 2017 -0800

    i965: Enable OpenGL 4.5 on Haswell.
    
    Everything is in place and the test results look solid.
    
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>

With older Mesa commits, Downward does not run. I also tried an even older commit:

commit 75968a668e44b3fd7c9b9277937c005366fca116
Author: Juan A. Suarez Romero <jasuarez@igalia.com>
Date:   Tue Oct 11 15:05:36 2016 +0000

    i965/gen7: expose OpenGL 4.2 on Haswell when supported
    
    GL_ARB_vertex_attrib_64bit was the last piece missing.
    
    v2: update docs (Jordan)
    
    Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

and manually changed the OpenGL version to 4.5 in the Mesa sources to make Downward run. The issue is still reproducible there as well.
Comment 33 Mark Janes 2018-02-16 16:40:31 UTC
Thanks for the information.  This bug can't be considered a regression in our current release candidate if it was triggered by enabling new features.
Comment 34 Vladimir Los 2018-03-16 10:59:13 UTC
Platform: HP Z220 SFF Workstation
SKU: ASJ45AV
CPU: i5-3470 @ 3.1 GHz x 4, stepping: 000306A9 00000019, Ivy Bridge
RAM: DDR3 1600MHz (used several options in 2 channels and 4 dimms (1 or 2 memunits x 2Gb))
OS:  Ubuntu 16.04(.4) LTS 64-bit 
Mesa: 17.2.8

apitrace can't replay the trace because it is unable to create an OpenGL 4.3+ context on this hardware (Ivy Bridge, maximum supported version is 4.2).
Comment 35 Darius Spitznagel 2018-03-16 13:21:27 UTC
(In reply to Vladimir Los from comment #34)

> apitrace doesn’t allow to replay the trace due to unability to create OpenGL
> 4.3+ context because of used HW (IvyBridge, allowed max version is 4.2).

You need at least Haswell which has OpenGL 4.5 support to replay the apitrace.
Comment 36 Samuel Sieb 2018-04-15 02:07:19 UTC
I just had my Gnome desktop crash and the only info in the log was:
i965: Failed to submit batchbuffer: Bad address

This is on Fedora 27, kernel 4.15.9, mesa 17.3.6.
Possibly related to Bug 104778.
Comment 37 Julien Isorce 2018-07-29 20:21:42 UTC
Hi, I can get "i965: Failed to submit batchbuffer: Invalid argument" when two GL contexts are shared (shareList) but were created with different "Display* dpy" instances, with one thread using Xlib and the other using libxcb (the latter also calling XSetEventQueueOwner(XCBOwnsEventQueue)). The solution for me was to create both contexts from the same display and port all Xlib usage to libxcb.
Comment 38 Danylo 2019-01-08 14:21:23 UTC
Hello,

I'm not sure that there are still hangs; I tested the demo version of Downward and didn't experience any.

However, I have found at least two issues with the tessellation control shader (TCS):

1) Unmatched outputs in the TCS are being reduced to local variables. I made a fix, see https://gitlab.freedesktop.org/mesa/mesa/merge_requests/59 - it will probably be merged soon.

2) The second is more interesting:
  In short: something is broken in the vec4 TCS path when registers are spilled.
  
  Long version: 
    Consider the following part of a TCS:
    -----------
    layout(location=0) out vec4 out1[3];
    layout(location=1) out vec4 out2[3];

    void main() {
        out1[gl_InvocationID] = gl_in[gl_InvocationID].gl_Position;
        out2[gl_InvocationID] = gl_in[gl_InvocationID].gl_Position;

        barrier();

        if (out1[0] != out2[0] || out1[1] != out2[1] || out1[2] != out2[2]) {
            atomicCounterIncrement(mismatches);
        } else {
            atomicCounterIncrement(matches);
        }
    -----------

    'out1' should always match 'out2', but that is not the case when running on any platform with:
> INTEL_DEBUG=spill_vec4 INTEL_SCALAR_TCS=0
    or simply when running on Haswell.
    If you add
> hs.SingleProgramFlow = true;
    to '3DSTATE_HS', 'matches' will be zero most of the time, but not always.
    When the arrays mismatch, the incorrect values are not garbage but copies of nearby values (I saw this in one iteration of testing in RenderDoc).
Comment 39 Danylo 2019-01-08 14:22:14 UTC
Created attachment 143012 [details]
Shader test to reporduce TCS vec4 issue
Comment 40 Kenneth Graunke 2019-05-17 17:12:13 UTC
(In reply to Danylo from comment #39)
> Created attachment 143012 [details]
> Shader test to reporduce TCS vec4 issue

For what it's worth, hacking the DEBUG_SPILL_VEC4 code to do:

         if (no_spill[i] || i != 42)                            
            continue;

will still reproduce the problem on Kabylake with INTEL_SCALAR_TCS=0 INTEL_DEBUG=spill_vec4, but it only spills and fills a single register.  That spill/fill appears to be working in the simulator, too - the value read back matches the value written.
Comment 41 Kenneth Graunke 2019-05-17 17:31:06 UTC
Check out the resulting assembly:

mov(8)          g22<1>UD        0x00000000UD                    { align1 WE_all 1Q compacted };
mov(1)          g22.5<1>UD      0x0000ff00UD                    { align1 WE_all 1N };
mov(2)          g22<1>UD        g0<0,1,0>UD                     { align1 WE_all 1N };
mov(2)          g22.3<1>UD      g15<4,1,0>UD                    { align1 WE_all 1N };
mov(8)          g125<1>UD       g0<4>UD                         { align16 WE_all 1Q };
mov(1)          g126<1>UD       0x00000000UD                    { align1 WE_all 1N compacted };
mov(1)          g126.4<1>UD     0x00000001UD                    { align1 WE_all 1N compacted };
mov(8)          g127<1>D        g22<4>D                         { align16 1Q };
send(8)         null<1>UW       g125<0>.xF      0x060a80fd
                            data MsgDesc: ( DC OWORD dual block write, 253, 0) mlen 3 rlen 0 { align16 1Q };

g22 is a message header produced by a series of WE_all instructions.  It's then copied to g127 without WE_all set, which will only copy some of the channels.  It's then partially spilled and later partially reloaded.   Guessing that's your bug.  I'll leave it to you.
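
The suspicion described here, a copy without WE_all touching only the enabled channels, can be illustrated with a toy model. This Python sketch is not how the hardware or the i965 backend is implemented: the `mov` helper, the register values, and the execution mask are all made up for illustration, and align16 swizzles and regioning are ignored. Comment 42 later reports that WE_all turned out not to be the culprit, so this only shows the effect being hypothesised.

```python
def mov(dst, src, exec_mask, we_all=False):
    """Toy model of an 8-channel mov.

    Channels whose exec_mask entry is False keep their old dst value
    unless we_all (write-enable all) is set.
    """
    return [s if (we_all or on) else d for d, s, on in zip(dst, src, exec_mask)]


header = [0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17]  # made-up header dwords
stale = [0xDEAD] * 8                                        # previous destination contents
mask = [True, True, True, False, False, False, False, False]  # hypothetical: 3 live channels

full_copy = mov(stale, header, mask, we_all=True)   # every dword of the header arrives
partial_copy = mov(stale, header, mask)             # disabled channels keep stale data
```

In this model `partial_copy` carries the first three header dwords followed by stale values, which is the kind of partially copied header the comment describes.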
Comment 42 Danylo 2019-05-20 14:27:00 UTC
Tested with 'WE_all' and it is not the culprit here. However, at least the issue is localised, so I'm investigating further.
Comment 43 Danylo 2019-05-24 09:39:25 UTC
Again, I'm out of ideas. Is there anything else?
Comment 44 GitLab Migration User 2019-09-25 19:06:41 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1665.

