Bug 105305

Summary: [skl] GPU HANG: ecode 9:0:0x85dfbeff, in Xorg
Product: Mesa
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86 (IA32)
OS: Linux (All)
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Reporter: Sjoerd Mullender <sjoerd>
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
CC: intel-gfx-bugs
Whiteboard:
i915 platform:
i915 features:
Attachments: GPU crash dump from /sys/class/drm/card0/error

Description Sjoerd Mullender 2018-03-01 09:08:46 UTC
Created attachment 137714: GPU crash dump from /sys/class/drm/card0/error

Some pertinent lines from journalctl:

Mar 01 09:49:30 HOSTNAME kernel: [drm] GPU HANG: ecode 9:0:0x85dfbeff, in Xorg [882], reason: Hang on rcs0, action: reset
Mar 01 09:49:30 HOSTNAME kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Mar 01 09:49:30 HOSTNAME kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Mar 01 09:49:30 HOSTNAME kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Mar 01 09:49:30 HOSTNAME kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Mar 01 09:49:30 HOSTNAME kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Mar 01 09:49:30 HOSTNAME kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Mar 01 09:49:38 HOSTNAME kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Mar 01 09:49:46 HOSTNAME kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Mar 01 09:49:54 HOSTNAME kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Mar 01 09:50:02 HOSTNAME kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Mar 01 09:50:02 HOSTNAME at-spi-bus-launcher[1594]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Mar 01 09:50:02 HOSTNAME at-spi-bus-launcher[1594]:       after 110868 requests (110868 known processed) with 0 events remaining.
Mar 01 09:50:02 HOSTNAME unknown[1567]: xfce4-notifyd: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.

After this, several more programs reported failures and crashes; the screen went black and came back with the login screen.  The last journal entry before these lines was about half an hour earlier.

Comment 1 Chris Wilson 2018-03-01 09:13:26 UTC
Please try mesa-17.3.6 as it should fix this issue.

Comment 2 Sjoerd Mullender 2018-03-01 09:29:32 UTC
Thanks for the quick response.  I'll give it a whirl.

Comment 3 Sjoerd Mullender 2018-03-06 15:33:23 UTC
I ran with the updated mesa package for some time and haven't experienced any more crashes.  It seems mesa-17.3.6 did indeed fix the problem.
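
For anyone triaging similar reports, here is a minimal sketch of how the hang and reset lines in the journalctl excerpt from the description could be pulled out of a longer log. The function name `summarize_hangs` and both regular expressions are illustrative assumptions based only on the message format quoted in this report; other kernel versions may word these messages differently.

```python
import re

# Assumed format, taken from the log lines quoted in this report:
#   [drm] GPU HANG: ecode 9:0:0x85dfbeff, in Xorg [882], reason: ...
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fA-F]+:[0-9a-fA-F]+:0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\]"
)
# Assumed format: i915 0000:00:02.0: Resetting rcs0 after gpu hang
RESET_RE = re.compile(r"Resetting (?P<engine>\S+) after gpu hang")


def summarize_hangs(log_text):
    """Return (hangs, resets) found in a block of journal text.

    hangs  is a list of dicts with the ecode string, process name and pid.
    resets maps an engine name (e.g. "rcs0") to the number of reset lines seen.
    """
    hangs = []
    resets = {}
    for line in log_text.splitlines():
        hang = HANG_RE.search(line)
        if hang:
            hangs.append({
                "ecode": hang.group("ecode"),
                "process": hang.group("process"),
                "pid": int(hang.group("pid")),
            })
            continue
        reset = RESET_RE.search(line)
        if reset:
            engine = reset.group("engine")
            resets[engine] = resets.get(engine, 0) + 1
    return hangs, resets
```

The text passed in could be, for example, the saved output of `journalctl -k`. The ecode is kept as an opaque string because this report does not say how its fields should be interpreted.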
