105135 – [skl] GPU hanging in Xorg (ecode 9:0:0x85dffffb)

Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/DRI/i965
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium normal
Assignee:	Intel 3D Bugs Mailing List
QA Contact:	Intel 3D Bugs Mailing List

Reported:	2018-02-16 17:25 UTC by Augusto Caringi
Modified:	2018-03-14 23:07 UTC
CC List:	1 user

Attachments
GPU crash dump (61.00 KB, text/plain) 2018-02-16 17:25 UTC, Augusto Caringi
/sys/class/drm/card0/error (44.52 KB, text/plain) 2018-02-26 16:24 UTC, Nathan Sidwell

Description Augusto Caringi 2018-02-16 17:25:47 UTC

Created attachment 137401
GPU crash dump

Hi,

    My system runs an up-to-date Fedora 27, and Xorg sometimes crashes:

[264950.669478] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [1372], reason: Hang on rcs0, action: reset
[264950.669482] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[264950.669484] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[264950.669486] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[264950.669487] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[264950.669490] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[264950.669504] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[264962.616880] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[264978.616745] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[264992.632632] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[265008.632540] i915 0000:00:02.0: Resetting rcs0 after gpu hang

$ lspci|grep VGA
00:02.0 VGA compatible controller: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)


    I'm attaching the crash dump from /sys/class/drm/card0/error
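
For anyone triaging similar reports, the "GPU HANG" line above carries the fields that matter for matching duplicates: the ecode, the process, the engine and the action. The following is only a rough sketch for pulling those fields out of a log line. The regular expression is modeled on the lines quoted in this report and may not match other kernel versions; `parse_gpu_hang` is a made-up helper name, not part of any existing tool.

```python
import re

# Pattern modeled on the "[drm] GPU HANG" lines quoted in this report.
# Assumption: other kernels may word this line differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])  # the PID is the number in square brackets
    return info


if __name__ == "__main__":
    sample = (
        "[264950.669478] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [1372], "
        "reason: Hang on rcs0, action: reset"
    )
    print(parse_gpu_hang(sample))
```

Because the pattern is searched for anywhere in the line, it should work both on raw dmesg output and on journal lines that carry a timestamp and host name prefix.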

Comment 1 Kenneth Graunke 2018-02-17 21:09:07 UTC

What version of Mesa are you using?  I believe you'll experience hangs unless you update to 17.3.4 or the latest 18rc.

Comment 2 Augusto Caringi 2018-02-19 09:26:27 UTC

(In reply to Kenneth Graunke from comment #1)
> What version of Mesa are you using?  I believe you'll experience hangs
> unless you update to 17.3.4 or the latest 18rc.

Thanks for the information...

I was using mesa 17.3.3...

$ glxinfo | grep "OpenGL version"
OpenGL version string: 3.0 Mesa 17.3.3

But today my system (Fedora 27) received a mesa update to version 17.3.4...

Let's see if the problem goes away... :)

Thanks again!

Comment 3 Nathan Sidwell 2018-02-26 16:24:03 UTC

Created attachment 137610
/sys/class/drm/card0/error

Comment 4 Nathan Sidwell 2018-02-26 16:24:37 UTC

I'm observing this today on a freshly updated Fedora 27 system.

nathans@lyta:1>lspci|grep VGA
00:02.0 VGA compatible controller: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)
00:13.0 Non-VGA unclassified device: Intel Corporation Sunrise Point-LP Integrated Sensor Hub (rev 21)

nathans@lyta:1>glxinfo | grep "OpenGL version"
OpenGL version string: 3.0 Mesa 17.3.5

Feb 26 10:20:52 lyta kernel: usb 2-1.3.3: reset SuperSpeed USB device number 10 using xhci_hcd
Feb 26 10:22:37 lyta kernel: [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [844], reason: Hang on rcs0, action: reset
Feb 26 10:22:37 lyta kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb 26 10:22:37 lyta kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb 26 10:22:37 lyta kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb 26 10:22:37 lyta kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb 26 10:22:37 lyta kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Feb 26 10:22:37 lyta kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 26 10:22:45 lyta kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 26 10:22:53 lyta kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 26 10:23:01 lyta kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 26 10:23:09 lyta kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang

Comment 5 Peter F. Patel-Schneider 2018-02-28 17:45:39 UTC

I'm seeing the same symptoms on a Lenovo Yoga 920 with Intel Corporation UHD Graphics 620 (rev 07) running Fedora 27 XFCE with OpenGL version 3.0 Mesa 17.3.5.   The problem started recently, under kernel 4.15, but persisted when I backed off to kernel 4.14.18.

The trigger is a certain kind of action in Emacs.  I looked at the Emacs code in question, but I don't see what could be causing problems there.  I do see
/sys/devices/pci0000:00/0000:00:02.0/drm/card0/error but it is zero length.

https://bugzilla.redhat.com/show_bug.cgi?id=1549219 has my report on the crash as well as a recipe for triggering it.
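
Since the crash dump can turn out to be empty, as it was here, it may be worth checking it before attaching. The following is a minimal sketch that reads the file and treats empty or unreadable content as "no dump". It reads the content instead of trusting the reported file size, on the assumption that sysfs sizes are not reliable. `read_crash_dump` is a made-up helper name, and the default path is the one the kernel message in this report points to.

```python
from pathlib import Path


def read_crash_dump(path="/sys/class/drm/card0/error"):
    """Return the dump text, or None if it is missing, unreadable or empty."""
    try:
        data = Path(path).read_text(errors="replace")
    except OSError:
        # Missing file or insufficient permissions; root may be required.
        return None
    # Whitespace-only content is treated the same as an empty file.
    return data if data.strip() else None


if __name__ == "__main__":
    dump = read_crash_dump()
    if dump is None:
        print("no usable crash dump found")
    else:
        print(f"crash dump available ({len(dump)} characters)")
```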

Comment 6 Mark Janes 2018-02-28 21:55:22 UTC

This is fixed in mesa 17.3.6, which was just released to address this specific bug.
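
To check whether a given system already has the fix, the `glxinfo | grep "OpenGL version"` output used earlier in this report can be compared against 17.3.6. The following is a rough sketch only. It assumes the version string has the "Mesa X.Y.Z" form shown in the comments above, it takes 17.3.6 as the threshold from this comment, and it treats anything newer (including 18.x) as fixed, which this report does not confirm for every release. `mesa_version` and `has_fix` are made-up helper names.

```python
import re

# Threshold taken from this comment; assumed to apply to later releases too.
FIXED_IN = (17, 3, 6)


def mesa_version(version_line):
    """Extract (major, minor, patch) from an 'OpenGL version string' line."""
    match = re.search(r"Mesa (\d+)\.(\d+)\.(\d+)", version_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_fix(version_line):
    """Return True/False for a parsed version, or None if it cannot be parsed."""
    version = mesa_version(version_line)
    if version is None:
        return None
    return version >= FIXED_IN


if __name__ == "__main__":
    for line in (
        "OpenGL version string: 3.0 Mesa 17.3.3",
        "OpenGL version string: 3.0 Mesa 17.3.5",
        "OpenGL version string: 3.0 Mesa 17.3.6",
    ):
        print(line, "->", has_fix(line))
```

With the version strings quoted in this report, 17.3.3 and 17.3.5 come out as not fixed and 17.3.6 as fixed, which agrees with the comments here.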

Comment 7 Peter F. Patel-Schneider 2018-03-01 04:16:08 UTC

Upgrading to mesa 17.3.6 fixed the problem for me.

Comment 8 Elizabeth 2018-03-14 23:07:43 UTC

Thanks for your time, closing this issue then.
