Bug 112116 - GPU HANG: ecode 9:0:0x00000000, hang on rcs0

Summary: GPU HANG: ecode 9:0:0x00000000, hang on rcs0

Status:	RESOLVED MOVED

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	low normal
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

URL:
Whiteboard:	Triaged, ReadyForDev
Keywords:

Depends on:
Blocks:

Reported:	2019-10-23 20:11 UTC by H. Lekin
Modified:	2019-11-29 19:43 UTC
CC List:	2 users

See Also:
i915 platform:	KBL
i915 features:	GPU hang

Attachments
/sys/class/drm/card0/error (9.91 KB, text/plain) 2019-10-23 20:11 UTC, H. Lekin
call trace (5.05 KB, text/plain) 2019-10-23 20:13 UTC, H. Lekin
dmesg incl. drm.debug (3.42 MB, text/plain) 2019-10-24 19:59 UTC, H. Lekin
GPU crash dump (16.45 KB, text/plain) 2019-11-27 07:47 UTC, varnie29a

Description H. Lekin 2019-10-23 20:11:42 UTC

Created attachment 145802 [details]
/sys/class/drm/card0/error

Hello there,

I found this GPU HANG message in dmesg, along with the request to file a report.

The journal shows that it occurred once before:
Sep 28 06:36:47 <hostname> kernel: i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0

I am attaching the crash dump (/sys/class/drm/card0/error), as well as a call trace that has appeared about once a week since the end of September.

Also, Atomic update failures like the following are logged about 5-7 times a day:
[Di Okt 22 07:53:58 2019] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=3023337 end=3023338) time 126 us, min 1073, max 1079, scanline start 1072, end 1080
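
For tallying these messages, the fields of the line above can be pulled out with a regular expression. The following is a minimal Python sketch; the pattern is inferred only from the single sample line in this report, and the name `parse_atomic_failure` and the group names are hypothetical, not part of any i915 tooling.

```python
import re

# Pattern inferred from the one "Atomic update failure" line quoted in this
# report; other kernel versions may word the message differently.
ATOMIC_RE = re.compile(
    r"Atomic update failure on pipe (?P<pipe>\w+) "
    r"\(start=(?P<start>\d+) end=(?P<end>\d+)\) "
    r"time (?P<time_us>\d+) us, min (?P<vmin>\d+), max (?P<vmax>\d+), "
    r"scanline start (?P<scan_start>\d+), end (?P<scan_end>\d+)"
)


def parse_atomic_failure(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = ATOMIC_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    pipe = fields.pop("pipe")
    # Every remaining field is a decimal integer in the sample line.
    parsed = {name: int(value) for name, value in fields.items()}
    parsed["pipe"] = pipe
    return parsed


# The sample line from this report, copied verbatim.
sample = (
    "[Di Okt 22 07:53:58 2019] [drm:intel_pipe_update_end [i915]] *ERROR* "
    "Atomic update failure on pipe A (start=3023337 end=3023338) time 126 us, "
    "min 1073, max 1079, scanline start 1072, end 1080"
)
print(parse_atomic_failure(sample))
```

Feeding each journal line through the function and counting the non-None results per day would give the daily rate mentioned above.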

I will provide any other information you might need.

By the way, thanks for all the work you are doing.

Greets,
H.L

Comment 1 H. Lekin 2019-10-23 20:13:12 UTC

Created attachment 145803 [details]
call trace

Comment 2 Lakshmi 2019-10-24 07:59:03 UTC

(In reply to H. Lekin from comment #1)
> Created attachment 145803 [details]
> call trace

Can you please attach the full dmesg log, captured with the kernel parameters drm.debug=0x1e log_buf_len=4M? What is the reproduction rate of the GPU hang? Is there any impact other than the GPU hang messages in dmesg?
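
Before capturing the log, it can help to confirm that both parameters actually reached the running kernel. Below is a minimal Python sketch that checks a kernel command line string for them; the helper names `missing_kernel_params` and `read_cmdline` and the example command lines are hypothetical, and only `/proc/cmdline` (the standard Linux location of the boot command line) is assumed to exist.

```python
def missing_kernel_params(cmdline, required=("drm.debug=0x1e", "log_buf_len=4M")):
    """Return the required parameters that are absent from the command line."""
    # Parameters are separated by whitespace; compare whole tokens only.
    present = set(cmdline.split())
    return [param for param in required if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Hypothetical command lines, for illustration only.
print(missing_kernel_params("root=/dev/sda1 quiet drm.debug=0x1e"))
print(missing_kernel_params("root=/dev/sda1 drm.debug=0x1e log_buf_len=4M"))
```

On the affected machine, `missing_kernel_params(read_cmdline())` returning an empty list would indicate that both parameters are in effect.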

Comment 3 H. Lekin 2019-10-24 19:59:24 UTC

Created attachment 145809 [details]
dmesg incl. drm.debug

Comment 4 H. Lekin 2019-10-24 20:01:33 UTC

Full dmesg log with kernel parameters drm.debug=0x1e
log_buf_len=4M: Attached.

Reproduction rate of GPU hang: 28/09/2019, 23/10/2019

No visible impact other than GPU hang messages in dmesg.

Comment 5 Lakshmi 2019-10-25 07:35:34 UTC

(In reply to H. Lekin from comment #4)
> Full dmesg log with kernel parameters drm.debug=0x1e
> log_buf_len=4M: Attached.
> 
> Reproduction rate of GPU hang: 28/09/2019, 23/10/2019
> 
> No visible impact other than GPU hang messages in dmesg.

Thanks for the info.

Setting the priority and severity based on the impact.

Comment 6 Jani Saarinen 2019-11-26 15:53:13 UTC

You are the reporter of this issue, which currently has low priority. Do you still see the issue? If so, please specify clearly what the impact is for you.

Comment 7 varnie29a 2019-11-27 07:42:13 UTC

Hi. I experience the same bug.

Comment 8 varnie29a 2019-11-27 07:47:41 UTC

Created attachment 146029 [details]
GPU crash dump

I hit this bug while running Manjaro Linux on a Lenovo ThinkPad T480 (BIOS version 1.28).
Here's a relevant part of dmesg:
""
[ 1531.905682] i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0
[ 1531.905684] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1531.905684] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1531.905685] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1531.905685] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1531.905686] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1531.906696] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
""
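
To check how often the hang recurs, the `GPU HANG` lines can be extracted from a saved dmesg or journal dump. The following Python sketch assumes the message format shown in the excerpt above (other kernel versions may print it differently); the name `find_gpu_hangs` is hypothetical.

```python
import re

# Matches lines such as:
#   i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0
HANG_RE = re.compile(
    r"i915 (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), hang on (?P<engine>\w+)"
)


def find_gpu_hangs(log_text):
    """Return one dict (pci, ecode, engine) per GPU HANG line in the text."""
    return [match.groupdict() for match in HANG_RE.finditer(log_text)]


# Two lines copied from the excerpt above; only the first reports a hang.
excerpt = (
    "[ 1531.905682] i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0\n"
    "[ 1531.906696] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0\n"
)
for hang in find_gpu_hangs(excerpt):
    print(hang["pci"], hang["ecode"], hang["engine"])
```

The "Resetting rcs0" line is deliberately not counted, since it reports the recovery rather than a separate hang.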

The GPU crash dump is attached (named GPU_crash_dump_27Nov2019).

My specs:
uname -a
""Linux heimdal 5.3.12-1-MANJARO #1 SMP PREEMPT Thu Nov 21 10:55:53 UTC 2019 x86_64 GNU/Linux""

CPU: Intel i5-8250U (8) @ 3.400GHz 
GPU: Intel UHD Graphics 620 

Please let me know what information you need and I'll provide it.
Thank you.

Comment 9 Lakshmi 2019-11-28 12:06:37 UTC

(In reply to varnie29a from comment #8)
Can you try to reproduce this issue using drm-tip (https://cgit.freedesktop.org/drm-tip)?

Comment 10 Martin Peres 2019-11-29 19:43:53 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/548.