Bug 93506 - [SKL] GPU HANG: ecode 9:0:0x87f97cf9, in Never_Alone.x64 [13425], reason: Ring hung, action: reset
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Ian Romanick
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-25 22:20 UTC by Martin Schrodt
Modified: 2018-07-26 06:11 UTC

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
Output of cat /sys/class/drm/card0/error (534.32 KB, text/plain)
2015-12-25 22:20 UTC, Martin Schrodt

Description Martin Schrodt 2015-12-25 22:20:24 UTC
Created attachment 120687
Output of cat /sys/class/drm/card0/error

I bought the game "Never Alone" on Steam and started playing. It worked for a minute and then hung, eventually closing the game.

Looking through dmesg, I found a notice that suggested reporting this here.

CPU is i7-6700HQ, kernel 4.4.0-rc6, Xorg driver xf86-video-intel 1:2.99.917+519+g8229390-1, Arch Linux.

If you need more information, I'd be happy to come up with it :)
Comment 1 yann 2016-09-16 08:12:06 UTC
Workarounds for SKL and other improvements have been pushed to the kernel and Mesa that should benefit your system, so please re-test with the latest kernel and Mesa to see if this issue still occurs (you may also collect and attach logs captured with apitrace: http://apitrace.github.io/)

Kernel: 4.4.0-rc6-mainline
Platform: SKL - i7-6700h
Mesa: [Please confirm your mesa version]

In the meantime, assigning to Mesa product.

From this error dump, the hang is happening in the render ring batch with the active head at 0xf8026324, with 0x78260000 (3DSTATE_BINDING_TABLE_POINTERS_VS) as IPEHR.

Batch extract (around 0xf8026324):

0xf8026318:      0x00000000: MI_NOOP
0xf802631c:      0x78260000: 3DSTATE_BINDING_TABLE_POINTERS_VS
0xf8026320:      0x00000000:    dword 1
0xf8026324:      0x782a0000: 3DSTATE_BINDING_TABLE_POINTERS_PS
0xf8026328:      0x00007a60:    dword 1
0xf802632c:      0x782f0000: 3DSTATE_SAMPLER_STATE_POINTERS_PS
0xf8026330:      0x00007a40:    dword 1
Bad length 9 in 3DSTATE_VS, expected 6-6
0xf8026334:      0x78100007: 3DSTATE_VS
0xf8026338:      0x00099c00:    kernel pointer
0xf802633c:      0x00000000:    SPF=0, VME=0, Sampler Count 0, Binding table count 0
0xf8026340:      0x00010000:    scratch offset
0xf8026344:      0x00000000:    Dispatch GRF start 0, VUE read length 0, VUE read offset 0
0xf8026348:      0x00000000:    Max Threads 1, Vertex Cache enable, VS func disable
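
For readers following the batch extract above, here is a minimal sketch of how the command header dwords break down. It assumes the usual Intel 3D (GFXPIPE) header layout: command type in bits 31:29, subtype in 28:27, opcode in 26:24, sub-opcode in 23:16, and a DWord length field in bits 7:0 that is biased by 2. The function name decode_3d_header is hypothetical and not part of any Mesa or kernel tool.

```python
def decode_3d_header(dword):
    """Split a 3D command header dword into its bit fields.

    Assumed layout (not taken from this bug report):
    31:29 command type, 28:27 subtype, 26:24 opcode,
    23:16 sub-opcode, 7:0 DWord length (total length minus 2).
    """
    return {
        "command_type": (dword >> 29) & 0x7,
        "subtype": (dword >> 27) & 0x3,
        "opcode": (dword >> 24) & 0x7,
        "sub_opcode": (dword >> 16) & 0xFF,
        "total_dwords": (dword & 0xFF) + 2,
    }


if __name__ == "__main__":
    # Header dwords copied from the batch extract in this comment.
    for header in (0x78260000, 0x782A0000, 0x782F0000, 0x78100007):
        print(hex(header), decode_3d_header(header))
```

Under that layout, 0x78100007 decodes to a total length of 9 dwords, which matches the "Bad length 9 in 3DSTATE_VS" line. Whether that warning reflects a real problem in the batch or only the decoder's expected length for this command is not established by this report.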
Comment 2 yann 2016-11-04 15:38:58 UTC
Please test a new version of Mesa (12 or 13) and mark as REOPENED if you can reproduce and RESOLVED/* if you cannot reproduce.

If you can reproduce, please capture and upload an apitrace (https://github.com/apitrace/apitrace) so that we can easily reproduce it as well.
Comment 3 Martin Schrodt 2016-11-07 08:12:00 UTC
The GPU hangs still happen as of today.

➜  ~  uname -a                           
Linux revolution 4.8.6-1-ARCH #1 SMP PREEMPT Mon Oct 31 18:51:30 CET 2016 x86_64 GNU/Linux

➜  ~  pacman -Qi mesa | grep Version
Version         : 13.0.0-1

I made an apitrace. It's a mere 2GB, however; I will try to upload it as an attachment now ;-)
Comment 4 Martin Schrodt 2016-11-07 08:17:04 UTC
Upload is too big...

You can download the trace here:

http://schrodt.org/Never_Alone.x64.trace.bz2 (2.1GB)

Uploading now, ETA 30min.
Comment 5 Anuj Phogat 2016-11-18 18:56:02 UTC
Reproduced this hang with mesa master commit 3ff9f8c on SKL, kernel 4.7.10-100.fc23.x86_64. Attached apitrace is too big. I'll try to trim it down to a reasonable size where I can do an aubdump.
Comment 6 Denis 2018-06-27 13:49:49 UTC
Hello. Thanks for the apitrace. I found that I can reproduce this issue on:
Manjaro
OpenGL version string: 3.0 Mesa 13.0.0 (git-df1b0a5a86)
SKL
_______
[864073.199929] [drm] GPU HANG: ecode 9:0:0x84df7cfc, in glretrace [13985], reason: Hang on rcs0, action: reset
[864073.199938] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[864077.242807] asynchronous wait on fence i915:Xorg[1782]/0:481d9 timed out
[864081.296258] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[864089.189584] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[864097.082928] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[864100.282801] asynchronous wait on fence i915:Xorg[1782]/0:481e1 timed out
[864105.189590] i915 0000:00:02.0: Resetting rcs0 after gpu hang
______________________
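
When comparing hangs across kernels (the original report and the log above use different reason strings), it can help to pull the fields out of the "GPU HANG" line mechanically. The following is a minimal sketch that parses only the line format visible in this report. The group names gen and engine are my reading of the first two ecode fields and should be treated as assumptions; parse_hang is a hypothetical helper.

```python
import re

# Matches the "GPU HANG" line format as it appears in this bug report.
# The names "gen" and "engine" are assumed meanings of the ecode fields.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>[0-9a-fA-F]+):"
    r"(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_hang(line):
    """Return a dict of fields from a GPU HANG line, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


if __name__ == "__main__":
    sample = (
        "[864073.199929] [drm] GPU HANG: ecode 9:0:0x84df7cfc, "
        "in glretrace [13985], reason: Hang on rcs0, action: reset"
    )
    print(parse_hang(sample))
```

Lines that do not contain a hang record, such as the "Resetting rcs0 after gpu hang" messages, return None and can simply be skipped.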

But the issue does not reproduce on Mesa 18.1.0 (from the Manjaro repo).

So the issue can be closed as fixed somewhere in between.
Comment 7 Tapani Pälli 2018-07-26 06:11:36 UTC
resolving, see comment #6

