111854 – i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0

Status:	RESOLVED MOVED

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	XOrg git
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium major
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

Reported:	2019-09-29 18:44 UTC by mailinglists35
Modified:	2019-11-29 19:35 UTC
CC List:	4 users

i915 platform:	BXT
i915 features:	GPU hang

Attachments
/sys/class/drm/card0/error (15.74 KB, text/plain) 2019-09-29 18:44 UTC, mailinglists35
dmesg (74.22 KB, text/plain) 2019-09-29 18:45 UTC, mailinglists35
/sys/class/drm/card0/error (16.51 KB, text/plain) 2019-10-11 10:17 UTC, Andrew Mayorov

Description mailinglists35 2019-09-29 18:44:01 UTC

Created attachment 145579
/sys/class/drm/card0/error

sep 28 20:55:04 e403na kernel: i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0
sep 28 20:55:04 e403na kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
sep 28 20:55:04 e403na kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
sep 28 20:55:04 e403na kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
sep 28 20:55:04 e403na kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
sep 28 20:55:04 e403na kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
sep 28 20:55:04 e403na kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0



sudo cat /sys/class/drm/card0/error > dump.txt

dmesg > dmesg.txt
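
For anyone who wants to script the dump collection, the `cat` step above could be sketched in Python roughly as follows. The function name `save_error_state` and the `NO_DUMP_MARKER` text are assumptions made for illustration, not something stated in this report; the marker is the placeholder string the driver is assumed to return when no dump is stored, and it may differ between kernel versions.

```python
from pathlib import Path

# Path named in the kernel log above.
ERROR_STATE = Path("/sys/class/drm/card0/error")

# Assumption: placeholder text returned when no dump is stored.
# Adjust this if your kernel words it differently.
NO_DUMP_MARKER = "No error state collected"


def save_error_state(src: Path = ERROR_STATE, dst: Path = Path("dump.txt")) -> bool:
    """Copy the crash dump at src to dst; return False if there is nothing to save."""
    # errors="replace" keeps the copy going even if the dump has odd bytes.
    text = src.read_text(errors="replace")
    if not text.strip() or text.startswith(NO_DUMP_MARKER):
        return False
    dst.write_text(text)
    return True
```

Calling `save_error_state()` with no arguments reads the sysfs path directly, which usually needs root, just like the `sudo cat` command above.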

Comment 1 mailinglists35 2019-09-29 18:45:08 UTC

Created attachment 145580
dmesg

Comment 2 mailinglists35 2019-09-29 18:46:27 UTC

Fedora 31 (amd64) on an ASUS VivoBook E403NA

Comment 3 mailinglists35 2019-09-29 18:52:26 UTC

> Other attachments if relevant:
> screenshot or photo (a picture is worth a thousand words);
> output of "xrandr --verbose" for display mode issue;
> intel_reg_dumper output (see the guide) and VBIOS dump (see the guide) for
> display issues;
> for GPU hang, get the last batch buffer (see the guide);
> for suspend/resume problems, refer to the guide.

I have no idea what is relevant. I did not experience any visible issue; I just noticed this message in my dmesg output. My usage pattern is an alternating series of suspend/resume cycles (plus reboots when new kernels are installed), with no external monitor.

Comment 4 Francesco Balestrieri 2019-10-10 06:16:12 UTC

Thanks for reporting. Given that there is no visible issue, I'm setting the severity to minor. Please update if you notice any other symptoms.

Comment 5 Andrew Mayorov 2019-10-11 10:17:38 UTC

Created attachment 145707
/sys/class/drm/card0/error

This issue affects me as well. I'm attaching the crash dump that the kernel produced on my system.

The system is a Lenovo ThinkPad T480 20L5000BRT.

❯ uname -a
Linux kenfawks 5.3.4-arch1-1-ARCH #1 SMP PREEMPT Sat Oct 5 13:44:11 UTC 2019 x86_64 GNU/Linux

❯ cat /proc/cpuinfo | head -n7
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 142
model name	: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
stepping	: 10
microcode	: 0xb4

Comment 6 Andrew Mayorov 2019-10-11 10:21:26 UTC

I have experienced this at least three times already as of writing this comment.

The system becomes totally unresponsive for 5 seconds or so, then the kernel writes out the aforementioned log snippet and everything continues to function as before. No userspace crashes.

Comment 7 Nikita Bobko 2019-11-04 13:30:34 UTC

The same happened to me. It's not a minor bug, because the system becomes completely unresponsive for a few seconds.

Comment 8 Nikita Bobko 2019-11-04 19:25:08 UTC

It happened to me twice in a row today. Logs from `journalctl`:
```
Nov 04 22:18:53 ux330ca kernel: i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0
Nov 04 22:18:53 ux330ca kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Nov 04 22:18:53 ux330ca kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Nov 04 22:18:53 ux330ca kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Nov 04 22:18:53 ux330ca kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Nov 04 22:18:53 ux330ca kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
```
gpu crash dump: https://gist.github.com/nikitabobko/2de203963d2fd11f1745edb48c76fd26
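
To check how often the hang recurs, one option is to scan saved kernel log text for the hang message. The sketch below is only an illustration: the names `Hang`, `parse_hang` and `find_hangs` are hypothetical, and the pattern is modeled on the log lines quoted in this report, so other kernel versions may need a different expression. The three-part `ecode` value is kept as a raw string, since its fields are not explained here.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Pattern modeled on the log lines quoted in this report; other kernel
# versions may word the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fA-Fx:]+), hang on (?P<engine>\w+)"
)


class Hang(NamedTuple):
    prefix: str  # everything before the message (timestamp, host, device)
    ecode: str   # raw "a:b:0x..." string, left uninterpreted
    engine: str  # engine name as logged, e.g. "rcs0"


def parse_hang(line: str) -> Optional[Hang]:
    """Return a Hang for a 'GPU HANG' log line, or None for any other line."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return Hang(line[: m.start()].rstrip(), m["ecode"], m["engine"])


def find_hangs(lines: Iterable[str]) -> List[Hang]:
    """Collect every hang event found in an iterable of log lines."""
    return [h for h in map(parse_hang, lines) if h is not None]
```

The input can be any iterable of lines, for example the `dmesg.txt` file saved earlier in this report or the output of `journalctl -k` written to a file.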

Comment 9 Nikita Bobko 2019-11-04 21:01:50 UTC

It seems that WebGL is broken. I don't know whether these two issues are related, but I remember that WebGL used to work on my computer. I noticed that WebGL doesn't work only after I noticed these GPU glitches and GPU crash dumps. For example, this site, which runs on top of WebGL: https://alteredqualia.com/three/examples/webgl_pasta.html shows me:
```
Your graphics card does not seem to support WebGL.
Find out how to get it here.
```
On the same machine running Windows, this site works perfectly. If you also have this GPU crash issue, please confirm or refute that WebGL is broken for you. Thanks!

Comment 10 Nikita Bobko 2019-11-04 21:41:03 UTC

Ah, OK, the WebGL problem seems to be a Chrome bug. I don't have this problem in Firefox. Sorry for the false alarm.

Comment 11 Francesco Balestrieri 2019-11-05 14:54:15 UTC

> The system becomes totally unresponsive for 5 seconds or so

Raising to major severity

Comment 12 Martin Peres 2019-11-29 19:35:57 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/462.