Bug 94022 - [SKL] GPU HANG: ecode 9:0:0x87f9bffb, in chrome [1748], reason: Ring hung, action: reset
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Ian Romanick
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 94864
Depends on:
Blocks:
 
Reported: 2016-02-06 12:31 UTC by Fredrik Lindner
Modified: 2017-02-10 22:39 UTC (History)
CC List: 2 users

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (82.74 KB, text/plain)
2016-02-06 12:31 UTC, Fredrik Lindner
dmesg (72.47 KB, text/plain)
2016-02-06 12:31 UTC, Fredrik Lindner

Description Fredrik Lindner 2016-02-06 12:31:27 UTC
Created attachment 121547
/sys/class/drm/card0/error

As others have reported, I get these GPU hang errors whenever hardware acceleration is in use (Chrome, Firefox, WebGL, etc.).
It happens on all kernel versions I've tried so far (4.2, 4.3, 4.4, 4.5-rc1, 4.5 nightly as of 2016-02-06).

The hardware is an Intel NUC6i5SYH (Intel Skylake i5 6260u, Intel Iris 540).
Attached is the error log from /sys/class/drm/card0/error and dmesg.
Comment 1 Fredrik Lindner 2016-02-06 12:31:55 UTC
Created attachment 121548
dmesg
Comment 2 Fredrik Lindner 2016-02-07 08:38:11 UTC
Some more digging around:
This does not seem to happen on kernel 4.2.0-16-generic #19-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux.

This is with i915.preliminary_hw_support=1 in the boot params. Having the same option turned on in newer kernels still causes crashes/freezes (I guess it's being ignored anyway).
Comment 3 Yuval Adam 2016-02-07 10:48:12 UTC
This is likely a duplicate of ticket 94002 but it's good to get confirmation this is a widespread problem with Skylake graphics + hardware acceleration.

FYI, i915.preliminary_hw_support=1 is useless in kernel >= 4.3.x since Skylake is no longer considered preliminary in those versions.
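
A quick way to check whether the parameter is still set on a kernel where it no longer matters is sketched below. This is only an illustration: the 4.3 cut-off is taken from the remark above rather than verified against kernel sources, and the helper names (param_value, kernel_major_minor, param_is_redundant) are hypothetical. On a live system the two inputs would typically come from /proc/cmdline and platform.release().

```python
import re

PARAM = "i915.preliminary_hw_support"

def param_value(cmdline):
    """Return the value given to PARAM on a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == PARAM:
            return value
    return None

def kernel_major_minor(release):
    """Extract (major, minor) from a release string such as '4.5.0-994-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def param_is_redundant(cmdline, release):
    """True if PARAM=1 is set on a kernel >= 4.3 (cut-off assumed from this thread)."""
    return param_value(cmdline) == "1" and kernel_major_minor(release) >= (4, 3)
```

For example, param_is_redundant("ro i915.preliminary_hw_support=1", "4.5.0-994-generic") reports that the option can be dropped, while the same command line on 4.2.0-16-generic does not.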
Comment 4 yann 2016-05-18 15:21:58 UTC
*** Bug 94864 has been marked as a duplicate of this bug. ***
Comment 5 yann 2016-09-13 10:28:38 UTC
(In reply to Yuval Adam from comment #3)
> This is likely a duplicate of ticket 94002 but it's good to get confirmation
> this is a widespread problem with Skylake graphics + hardware acceleration.
> 
> FYI, i915.preliminary_hw_support=1 is useless in kernel >= 4.3.x since
> Skylake is no longer considered preliminary in those versions.

Right, i915.preliminary_hw_support=1 shouldn't be used for SKL. However, although both 94022 and 94002 hung in the render ring batch, the hangs occur at different instructions and in different sequences.
Comment 6 yann 2016-09-13 10:37:42 UTC
SKL workarounds have been pushed to the kernel since the version you are using (4.5.0-994-generic), so please re-test with the latest kernel, without i915.preliminary_hw_support=1, to see whether they help.

In the meantime, I am assigning this to the Mesa product (please let me know if I am mistaken about this GPU hang).

According to this error dump, the hang is happening in the render ring batch, with the active head at 0xfd47559c and 0x78260000 (3DSTATE_BINDING_TABLE_POINTERS_VS) as IPEHR.

Batch extract (around 0xfd47559c):

0xfd475568:      0x78170009: 3D UNKNOWN: 3d_965 opcode = 0x7817
0xfd47556c:      0x00000000: MI_NOOP
0xfd475570:      0x00000001: MI_NOOP
0xfd475574:      0x00000000: MI_NOOP
0xfd475578:      0x00000000: MI_NOOP
0xfd47557c:      0x00000000: MI_NOOP
0xfd475580:      0x00000000: MI_NOOP
0xfd475584:      0xfd476960: UNKNOWN
0xfd475588:      0x00000000: MI_NOOP
0xfd47558c:      0x00000000: MI_NOOP
0xfd475590:      0x00000000: MI_NOOP
0xfd475594:      0x78260000: 3DSTATE_BINDING_TABLE_POINTERS_VS
0xfd475598:      0x000048e0:    dword 1
0xfd47559c:      0x782a0000: 3DSTATE_BINDING_TABLE_POINTERS_PS
0xfd4755a0:      0x000048c0:    dword 1
0xfd4755a4:      0x784a0000: 3D UNKNOWN: 3d_965 opcode = 0x784a
0xfd4755a8:      0x00000000: MI_NOOP
0xfd4755ac:      0x78080003: 3DSTATE_VERTEX_BUFFERS
0xfd4755b0:      0x00044018:    buffer 0: sequential, pitch 24b
0xfd4755b4:      0xfdd22000:    buffer address
0xfd4755b8:      0x00000000:    max index
0xfd4755bc:      0x00001000:    mbz
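
The header dwords in this extract can also be split into fields by hand. The sketch below is a minimal illustration, not the decoder that produced the listing. It assumes the usual Gen 3D command header layout (bits 31:29 command type, 28:27 subtype, 26:24 opcode, 23:16 sub-opcode, 7:0 length, with total dwords = length + 2). The names decode_3d_header and command_starts are hypothetical, and the values are copied from the extract above.

```python
def decode_3d_header(dword):
    """Split a 3D command header dword into fields (layout assumed, see text above)."""
    return {
        "type": (dword >> 29) & 0x7,
        "subtype": (dword >> 27) & 0x3,
        "opcode": (dword >> 24) & 0x7,
        "subopcode": (dword >> 16) & 0xFF,
        "total_dwords": (dword & 0xFF) + 2,
    }

# Dwords copied from the batch extract, starting at BASE (4 bytes per dword).
BASE = 0xFD475568
VALUES = [
    0x78170009, 0x00000000, 0x00000001, 0x00000000, 0x00000000, 0x00000000,
    0x00000000, 0xFD476960, 0x00000000, 0x00000000, 0x00000000,
    0x78260000, 0x000048E0,
    0x782A0000, 0x000048C0,
    0x784A0000, 0x00000000,
    0x78080003, 0x00044018, 0xFDD22000, 0x00000000, 0x00001000,
]

def command_starts(base, values):
    """Walk the dwords using each header's length field; return header addresses.

    Assumes every command in the span is a 3D command, which holds for this
    extract but not for batches in general.
    """
    starts = []
    i = 0
    while i < len(values):
        starts.append(base + 4 * i)
        i += decode_3d_header(values[i])["total_dwords"]
    return starts
```

Under these assumptions the walk lands on 0xfd475594 (the header equal to IPEHR) and then on 0xfd47559c (the active head), so in this extract the IPEHR value is the command immediately before the one the head points at.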
Comment 7 yann 2016-11-04 15:32:15 UTC
Please test a new version of Mesa (12 or 13) and mark as REOPENED
if you can reproduce and RESOLVED/* if you cannot reproduce.
Comment 8 Annie 2017-02-10 22:39:08 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

