Bug 92100 - [ILK] GPU hang (and multiple fails) in piglit shader tests if spilling is forced in the vec4 backend
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Reported: 2015-09-24 08:52 UTC by Iago Toral
Modified: 2019-09-25 18:54 UTC (History)

Attachments
Native code producing the GPU hang (92.03 KB, text/plain)
2015-09-24 12:37 UTC, Iago Toral
Minimal shader test program reproducing the problem (456 bytes, text/plain)
2015-10-06 08:25 UTC, Iago Toral
Minimal native code reproducing the problem (3.30 KB, text/plain)
2015-10-06 08:28 UTC, Iago Toral

Description Iago Toral 2015-09-24 08:52:04 UTC
Running piglit's shader tests (~14000 tests) while forcing spilling on the vec4 backend (INTEL_DEBUG=spill_vec4) produces at least one GPU hang (for me this happens consistently around test 10400). Also, I see a lot of tests that stop passing, even before the GPU hang, which points at something seriously broken. The FS backend is not affected, even if INTEL_DEBUG=spill_fs is used to force spilling.

All this is with current master (1614c39a8fc205)

To reproduce:

INTEL_DEBUG=spill_vec4 ./piglit-run.py tests/shader results/shader

I'll see if I can add more info, at least the specific test triggering the hang.
Comment 1 Iago Toral 2015-09-24 12:37:36 UTC
Created attachment 118430
Native code producing the GPU hang

The test triggering the GPU hang is 'vs-loop-bounds-unrolled.shader_test'. The attachment includes the native VS code dump when spilling is forced. I'll try to look into it tomorrow.
Comment 2 Iago Toral 2015-09-25 13:09:27 UTC
Some additional information:

The hang happens because scratch reads/writes are not working properly for some reason. In this particular test, the GPU hang is the result of an infinite loop: the variables involved in the loop condition always read back as 0 when unspilled.

I have been reviewing the ILK docs and the implementation but I have not found anything wrong with it yet. I'll attach a minimal shader_test program reproducing the problem soon.
Comment 3 Iago Toral 2015-10-06 08:25:41 UTC
Created attachment 118701
Minimal shader test program reproducing the problem
Comment 4 Iago Toral 2015-10-06 08:28:25 UTC
Created attachment 118702
Minimal native code reproducing the problem

This is the native code for the minimal example I just attached, forcing spilling only for the variable "sum". Since that variable is not part of the loop condition, the program won't hang, but we still get an incorrect result (sum is 0). Notice that in this case the problem is limited to just one spill/unspill operation that does not work well for some reason.
Comment 5 Rhys Kidd 2016-06-05 15:32:06 UTC
I attempted to recreate this with spilling on ILK with a more recent Mesa. As Iago outlined with the minimal reproducer test, the shader_test didn't hang but it did fail with INTEL_DEBUG=spill_vec4.

$ INTEL_DEBUG=spill_vec4 mesa-debug bin/shader_runner vs-ilk-spill-fail.shader_test -auto 
Probe color at (0,0)
  Expected: 0 255 0 255
  Observed: 0 0 0 255
PIGLIT: {"result": "fail" }

$ mesa-debug bin/shader_runner vs-ilk-spill-fail.shader_test -auto
PIGLIT: {"result": "pass" }
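
For anyone re-checking this on other hardware or a newer Mesa, the two runs above can be wrapped in a small script that runs the same test with and without forced vec4 spilling and compares the reported results. This is only a sketch: the function names piglit_result and compare_spill are hypothetical helpers, and it assumes the runner prints a line of the form PIGLIT: {"result": "..." } as in the output above and that env supports -u (GNU coreutils).

```shell
#!/bin/bash
# Sketch: compare a piglit test result with and without INTEL_DEBUG=spill_vec4.
# piglit_result and compare_spill are illustrative names, not piglit tools.

# piglit_result CMD...: run CMD and print the value of the last
# PIGLIT "result" field it reports (e.g. pass, fail).
piglit_result() {
    "$@" 2>/dev/null \
        | sed -n 's/^PIGLIT: *{"result": *"\([a-z]*\)".*/\1/p' \
        | tail -n 1
}

# compare_spill CMD...: run CMD once with INTEL_DEBUG unset and once with
# INTEL_DEBUG=spill_vec4, print both results, and return 0 only if a
# baseline result was found and both results match.
compare_spill() {
    local baseline spilled
    baseline=$(piglit_result env -u INTEL_DEBUG "$@")
    spilled=$(piglit_result env INTEL_DEBUG=spill_vec4 "$@")
    echo "baseline=${baseline:-unknown} spill_vec4=${spilled:-unknown}"
    [ -n "$baseline" ] && [ "$baseline" = "$spilled" ]
}

# Example invocation, using the command from the runs above
# (adjust the paths to your piglit build):
# compare_spill bin/shader_runner vs-ilk-spill-fail.shader_test -auto
```

A non-zero exit status from compare_spill means the result changed when spilling was forced, which is the symptom described in this report.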

$ mesa-debug report-bug-info 
Software versions:
    4.4.0-22-generic
    OpenGL version string: 2.1 Mesa 12.1.0-devel (git-f016e4b)

GPU hardware:
    OpenGL renderer string: Mesa DRI Intel(R) Ironlake Mobile 
    00:02.0 VGA compatible controller [0300]: Intel Corporation Core Processor Integrated Graphics Controller [8086:0046] (rev 02)

CPU hardware:
    x86_64
    Intel(R) Core(TM) i5 CPU       M 580  @ 2.67GHz
Comment 6 GitLab Migration User 2019-09-25 18:54:37 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1493.