Bug 106002 - GPU HANG: ecode 9:2:0xa8dfbffd
Summary: GPU HANG: ecode 9:2:0xa8dfbffd
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2018-04-12 12:02 UTC by Jiri Slaby
Modified: 2018-09-10 13:14 UTC

i915 platform: SKL
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (28.73 KB, text/plain)
2018-04-12 12:02 UTC, Jiri Slaby
dmesg (688.67 KB, text/plain)
2018-04-12 12:02 UTC, Jiri Slaby

Description Jiri Slaby 2018-04-12 12:02:06 UTC
Created attachment 138782
/sys/class/drm/card0/error

[15146.471169] [drm] GPU HANG: ecode 9:2:0xa8dfbffd, in mpv/vo [18693], reason: Hang on vcs0, action: reset
[15146.471173] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[15146.471176] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[15146.471178] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[15146.471180] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[15146.471182] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[15146.471566] i915 0000:00:02.0: Resetting vcs0 after gpu hang
[15159.422798] i915 0000:00:02.0: Resetting vcs0 after gpu hang
[15168.414948] i915 0000:00:02.0: Resetting vcs0 after gpu hang
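For triage, the fields of the hang line above can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the function name `parse_gpu_hang` and the group names are hypothetical, and reading the three ecode fields as "generation : engine : code" is an assumption based on this log (SKL is gen9, and the hang is on vcs0), not something the report states.

```python
import re

# Pattern modeled on the single "GPU HANG" line in this report.
# The meaning assigned to the three ecode fields is an assumption.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>[0-9a-fA-F]+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<comm>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return a dict of fields from a 'GPU HANG' dmesg line, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["gen"] = int(fields["gen"])
    fields["pid"] = int(fields["pid"])
    fields["code"] = int(fields["code"], 16)
    return fields


if __name__ == "__main__":
    sample = (
        "[15146.471169] [drm] GPU HANG: ecode 9:2:0xa8dfbffd, in mpv/vo [18693], "
        "reason: Hang on vcs0, action: reset"
    )
    print(parse_gpu_hang(sample))
```

Running the same function over every line of the attached dmesg would list each hang together with the process that was on the engine at the time.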
Comment 1 Jiri Slaby 2018-04-12 12:02:57 UTC
Created attachment 138783
dmesg
Comment 2 Chris Wilson 2018-04-12 12:40:30 UTC
Sigh, libva left the bugs.fd.o collective.
Comment 3 Jani Saarinen 2018-04-25 11:56:24 UTC
Chris, is this a valid issue for i915?
Comment 4 Lionel Landwerlin 2018-05-04 13:41:25 UTC
It's unfortunate that the batchbuffers are not flagged by userspace so that we could look at what caused the hang.

The instruction on which the command streamer hung (MI_FLUSH_DW) doesn't appear to be something that intel-vaapi-driver would emit (the set bits differ).
It looks more like something from gen6_bsd_ring_flush().
That is really confusing, though, because I wouldn't expect anybody to run with legacy submission on SKL & 4.16.
Comment 5 Jani Saarinen 2018-05-17 09:58:08 UTC
Jiri, is this still valid on the latest drm-tip?
Comment 6 Lakshmi 2018-09-10 12:26:01 UTC
Jiri, ping?
Comment 7 Jiri Slaby 2018-09-10 12:48:12 UTC
I don't think I saw it recently.
Comment 8 Lakshmi 2018-09-10 13:14:36 UTC
I assume this issue has been fixed.
Closing now. Feel free to reopen if you still see the issue with the latest drm-tip (https://cgit.freedesktop.org/drm-tip).

If the problem persists, attach the full dmesg from boot with the kernel parameters drm.debug=0x1e log_buf_len=4M.
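For readers wondering what drm.debug=0x1e enables, the value is a bitmask of DRM debug categories. The sketch below decodes it; the bit table is an assumption taken from the category definitions in the kernel's drm_print.h around the 4.16 era and may differ in other kernel versions, and `decode_drm_debug` is a hypothetical helper name.

```python
# DRM debug category bits (assumed from the kernel's drm_print.h;
# later kernels define additional bits not listed here).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    print(decode_drm_debug(0x1E))
```

Under that table, 0x1e selects the DRIVER, KMS, PRIME and ATOMIC categories and leaves the noisier CORE and VBL messages off, while log_buf_len=4M enlarges the kernel log buffer so the extra output from boot is not overwritten.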

