Bug 104578

Summary: [skl] GPU hang in Xorg, probably due to emacs editor.
Product: Mesa
Reporter: Vitaly <vitalyo>
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: RESOLVED MOVED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: critical
Priority: highest
CC: intel-gfx-bugs
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
dump /sys/class/drm/card0/error
/sys/class/drm/card0/error

Description Vitaly 2018-01-11 06:10:40 UTC
Created attachment 136659
dump /sys/class/drm/card0/error

My laptop is a ThinkPad T570 (Type 20JW, 20JX) (20JW0004US).

The X server restarts periodically.

From kern.log:

Jan 11 10:31:10 osipov kernel: [  716.920056] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [517], reason: Hang on rcs0, action: reset
Jan 11 10:31:10 osipov kernel: [  716.920057] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 11 10:31:10 osipov kernel: [  716.920058] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 11 10:31:10 osipov kernel: [  716.920058] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 11 10:31:10 osipov kernel: [  716.920058] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jan 11 10:31:10 osipov kernel: [  716.920059] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jan 11 10:31:10 osipov kernel: [  716.920063] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 10:31:18 osipov kernel: [  724.909575] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 10:31:26 osipov kernel: [  732.909530] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 10:31:34 osipov kernel: [  740.909505] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 10:31:42 osipov kernel: [  748.909481] i915 0000:00:02.0: Resetting rcs0 after gpu hang
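
For triage, the fields of the GPU HANG line above (error code, process, PID, engine, action) can be pulled out mechanically. This is a minimal sketch, assuming the message layout seen in this log; the regular expression and the function name `parse_gpu_hang` are illustrative and not part of any i915 tooling, and other kernel versions may word the line differently.

```python
import re

# Pattern assumes the i915 message layout seen in this report's kern.log.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    return fields


# Sample copied from the log above.
sample = (
    "Jan 11 10:31:10 osipov kernel: [  716.920056] [drm] GPU HANG: "
    "ecode 9:0:0x85dffffb, in Xorg [517], reason: Hang on rcs0, action: reset"
)
print(parse_gpu_hang(sample))
```

Lines that do not contain a hang message, such as the repeated "Resetting rcs0" entries, simply return None.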
Comment 1 Vitaly 2018-01-11 10:24:01 UTC
kern.log

Jan 11 15:20:21 osipov kernel: [15200.920749] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 15:20:29 osipov kernel: [15208.952658] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 15:20:37 osipov kernel: [15216.952616] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 15:20:45 osipov kernel: [15224.952634] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 11 15:20:53 osipov kernel: [15232.952586] i915 0000:00:02.0: Resetting rcs0 after gpu hang
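
The spacing between the resets above can be read off the bracketed kernel timestamps. This is a rough sketch, assuming the "Resetting ... after gpu hang" wording shown in this log; `reset_intervals` is a hypothetical helper name.

```python
import re

# Captures the seconds-since-boot stamp on i915 reset lines as they appear
# in this report; the wording may differ on other kernel versions.
RESET_RE = re.compile(r"\[\s*(\d+\.\d+)\] i915 \S+ Resetting \w+ after gpu hang")


def reset_intervals(lines):
    """Return the gaps, in seconds, between consecutive engine resets."""
    stamps = []
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            stamps.append(float(match.group(1)))
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


# Lines copied from the log above.
log = [
    "Jan 11 15:20:21 osipov kernel: [15200.920749] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 11 15:20:29 osipov kernel: [15208.952658] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 11 15:20:37 osipov kernel: [15216.952616] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 11 15:20:45 osipov kernel: [15224.952634] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 11 15:20:53 osipov kernel: [15232.952586] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
]
print(reset_intervals(log))
```

For these five lines the gaps all come out at roughly 8 seconds.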
Comment 2 Elizabeth 2018-01-11 17:06:16 UTC
Hello Vitaly,
Is there a way to trigger this easily? Which distro and desktop environment are you using?
Thank you.
Comment 3 Vitaly 2018-01-12 03:04:53 UTC
Hello!

My distro is Debian (unstable): Linux 4.14.0-3-amd64 #1 SMP Debian 4.14.12-2 (2018-01-06) x86_64 GNU/Linux
WM:
awesome v4.2 (Human after all)
 # Compiled against Lua 5.3.3 (running with Lua 5.3)
 # D-Bus support: #
 # execinfo support: #
 # xcb-randr version: 1.5
 # LGI version: 0.9.2

I noticed that this happens when typing quickly on the keyboard. Since I work in emacs, it often happens while I am working in it.
Comment 4 Elizabeth 2018-01-23 22:17:48 UTC
Hello again Vitaly, Could you try to find a way to reliably reproduce this, for example with a heavily text-filled emacs document or the like? Thanks.
Comment 5 Vitaly 2018-01-24 09:50:21 UTC
Hello Elizabeth!

It does not depend on the file size: the files are small, less than 1 KB, and the text is simple program code.
After updating the kernel to:
Linux 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux
everything was fine for a couple of days, but then it happened again 3-4 times.
Now I try not to press the keys quickly :(
I'm using emacs25 with GTK+.
Thanks.
Comment 6 Vitaly 2018-01-24 11:01:10 UTC
Replaced emacs25 with GTK+ by emacs25-lucid.
So far so good, if there are problems, I'll let you know.
Thanks.
Comment 7 Vitaly 2018-01-24 11:17:55 UTC
(In reply to Vitaly from comment #6)
> Replaced emacs25 with GTK+ by emacs25-lucid.
> So far so good, if there are problems, I'll let you know.
> Thanks.

The error is still present :(
Comment 8 Mark Janes 2018-01-24 14:54:51 UTC
Mesa devs have resolved a class of GPU hangs with this series:

https://patchwork.freedesktop.org/series/37023/

If you can compile mesa with patches, please test the series for your use case.  Otherwise we can wait until the patch is merged.
Comment 9 Kenneth Graunke 2018-01-25 08:29:13 UTC
Those patches have been merged.  Can you try Mesa master?  You'll need to be sure that X picks up the new Mesa, which usually means replacing your system mesa and restarting X.
Comment 10 Vitaly 2018-01-25 08:54:32 UTC
What is a Mesa master
Comment 11 Elizabeth 2018-01-25 16:19:00 UTC
(In reply to Vitaly from comment #10)
> What is a Mesa master
https://cgit.freedesktop.org/mesa/mesa/ master branch.
Comment 12 Mark Janes 2018-01-25 16:39:24 UTC
If you are not familiar with building Mesa for your system, it's probably best to wait for the Mesa 18.0 release and test with that.
Comment 13 Vitaly 2018-01-26 04:51:32 UTC
OK
I will wait for version 18.
Currently I have version 17.3.3.
Comment 14 Vitaly 2018-02-08 06:31:56 UTC
Created attachment 137228
/sys/class/drm/card0/error

I installed these packages from Debian (experimental):
libgl1-mesa-dri_18.0.0_rc2-1_amd64.deb
libglapi-mesa_18.0.0_rc2-1_amd64.deb
libglx-mesa0_18.0.0_rc2-1_amd64.deb

glxinfo:
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.0.0-rc2
OpenGL version string: 3.0 Mesa 18.0.0-rc2
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 18.0.0-rc2

The dump from /sys/class/drm/card0/error is attached.
Comment 15 Vitaly 2018-02-08 06:36:43 UTC
Feb  8 11:12:31 osipov kernel: [37267.631978] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [561], reason: Hang on rcs0, action: reset
Feb  8 11:12:31 osipov kernel: [37267.631981] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb  8 11:12:31 osipov kernel: [37267.631983] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb  8 11:12:31 osipov kernel: [37267.631984] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb  8 11:12:31 osipov kernel: [37267.631985] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb  8 11:12:31 osipov kernel: [37267.631987] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Comment 16 Vitaly 2018-02-08 07:02:06 UTC
My card:
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 520 (rev 07)
Comment 17 Mark Janes 2018-02-08 21:35:05 UTC
The known hang is fixed in mesa 18.0rc3

Unfortunately, this is not available in debian experimental yet :(
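
Whether a running Mesa is at or past the release candidate named above can be checked from a glxinfo version string. This is a minimal sketch, assuming the "Mesa X.Y.Z-rcN" format seen in the glxinfo output earlier in this report; the names `mesa_version_key`, `FIXED_IN`, and `has_known_hang_fix` are hypothetical, and the threshold is taken from this comment only.

```python
import re

# Matches strings such as "Mesa 18.0.0-rc2" or "Mesa 17.3.3" in glxinfo output.
VERSION_RE = re.compile(r"Mesa (\d+)\.(\d+)\.(\d+)(?:-rc(\d+))?")

# Assumption: the fix first appears in 18.0rc3, as stated in this comment.
FIXED_IN = (18, 0, 0, 3)


def mesa_version_key(text):
    """Return a sortable tuple; a final release sorts after its release candidates."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError("no Mesa version found in: %r" % text)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    rc = int(match.group(4)) if match.group(4) else float("inf")
    return (major, minor, patch, rc)


def has_known_hang_fix(text):
    """True if the reported Mesa version is 18.0rc3 or newer."""
    return mesa_version_key(text) >= FIXED_IN


print(has_known_hang_fix("OpenGL version string: 3.0 Mesa 18.0.0-rc2"))
```

The check only reads the version string, so it cannot tell whether a distribution has backported the patches to an older package.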
Comment 18 Vitaly 2018-02-22 13:00:40 UTC
On Monday I installed these packages from Debian (experimental):
libdrm-intel1_2.4.89-1_amd64.deb
libegl-mesa0_18.0.0_rc4-1_amd64.deb
libgbm1_18.0.0_rc4-1_amd64.deb
libgl1-mesa-dri_18.0.0_rc4-1_amd64.deb
libglapi-mesa_18.0.0_rc4-1_amd64.deb
libglx-mesa0_18.0.0_rc4-1_amd64.deb
libxcb-dri2-0_1.12-1_amd64.deb
libxcb-glx0_1.12-1_amd64.deb
mesa-va-drivers_18.0.0_rc4-1_amd64.deb
mesa-vdpau-drivers_18.0.0_rc4-1_amd64.deb


So far there are no errors.
Comment 19 GitLab Migration User 2019-09-25 19:07:16 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1672.
