Bug 97252

Summary:	[HSW] GPU HANG: ecode 7:0:0x85dffffc, in chrome [3388], reason: Ring hung, action: reset
Product:	Mesa	Reporter:	Zbigniew Czapiga <bzb>
Component:	Drivers/DRI/i965	Assignee:	Zbigniew Czapiga <bzb>
Status:	RESOLVED MOVED	QA Contact:
Severity:	normal
Priority:	medium	CC:	intel-gfx-bugs
Version:	12.0
Hardware:	x86-64 (AMD64)
OS:	Linux (All)
Whiteboard:
i915 platform:	HSW	i915 features:	GPU hang
Attachments:	system log and gpu crash dump; Mesa 13 crashlog and system log; crash log after switching to modesetting driver

Description Zbigniew Czapiga 2016-08-09 08:30:17 UTC

Created attachment 125625 [details]
system log and gpu crash dump

From time to time when I start some WebGL application in chrome, I get this error:

[64414.495102] [drm] stuck on render ring
[64414.496855] [drm] GPU HANG: ecode 7:0:0x85dffffc, in chrome [3388], reason: Ring hung, action: reset
[64414.496859] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[64414.496862] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[64414.496864] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[64414.496866] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[64414.496869] [drm] GPU crash dump saved to /sys/class/drm/card0/error
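When comparing hangs across reports, it can help to pull the fields out of the "GPU HANG" line mechanically. The following is a minimal Python sketch, not part of the original report. The function name parse_gpu_hang and the group names are hypothetical, and labelling the three ecode fields as generation, ring, and code is an assumption based on how the line reads here (7 for an HSW part, 0 for the render ring), not something confirmed in this thread.

```python
import re

# Pattern for the i915 "GPU HANG" line as it appears in this report.
# The group names (gen, ring, code) are assumed labels for the ecode fields.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<ring>\d+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\S+)"
)


def parse_gpu_hang(line):
    """Return a dict of fields from a GPU HANG line, or None if it does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; keep the hex code as text.
    for key in ("gen", "ring", "pid"):
        fields[key] = int(fields[key])
    return fields


line = ("[64414.496855] [drm] GPU HANG: ecode 7:0:0x85dffffc, "
        "in chrome [3388], reason: Ring hung, action: reset")
info = parse_gpu_hang(line)
print(info)
```

The same function applies to dmesg output with human-readable timestamps, since the pattern ignores everything before "GPU HANG".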


I get this error on both Mesa 11.2 (Ubuntu 16.04) and Mesa 12.1 (updated from https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers).

GPU crash dump is attached.
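For anyone who needs to capture the dump for attachment, here is a minimal Python sketch that copies the error state to a regular file. The source path is the one printed in the kernel log above; the function name save_error_state and the destination file name are hypothetical, and reading the sysfs file is assumed to require root on most systems.

```python
import shutil
from pathlib import Path


def save_error_state(dest, src="/sys/class/drm/card0/error"):
    """Copy the i915 error state from src to dest and return the bytes written.

    The file is streamed until EOF instead of relying on its reported size,
    because sysfs entries generally do not report a meaningful size.
    """
    dest_path = Path(dest)
    with Path(src).open("rb") as fin, dest_path.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest_path.stat().st_size
```

A call such as save_error_state("gpu-crash-dump.txt") would then produce a file suitable for attaching to the bug.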

Comment 1 yann 2016-08-30 12:14:10 UTC

Assigning to Mesa product (please let me know if I am mistaken with this GPU Hang).

From this error dump, the hang is happening in the render ring batch with the active head
at 0x0eb81010 and 0x7a000003 (PIPE_CONTROL) as IPEHR (previously there was a 3DPRIMITIVE followed by another PIPE_CONTROL).

Batch extract (around 0x0eb81010):

0x0eb80fe0:      0x7b000005: 3DPRIMITIVE:
0x0eb80fe4:      0x0000000f:    rect list sequential
0x0eb80fe8:      0x00000003:    vertex count
0x0eb80fec:      0x00000000:    start vertex
0x0eb80ff0:      0x00000001:    instance count
0x0eb80ff4:      0x00000000:    start instance
0x0eb80ff8:      0x00000000:    index bias
0x0eb80ffc:      0x7a000003: PIPE_CONTROL
0x0eb81000:      0x00101001:    no write, cs stall, render target cache flush, depth cache flush,
0x0eb81004:      0x00000000:    destination address
0x0eb81008:      0x00000000:    immediate dword low
0x0eb8100c:      0x00000000:    immediate dword high
0x0eb81010:      0x7a000003: PIPE_CONTROL
0x0eb81014:      0x00000c10:    no write, instruction cache invalidate, texture cache invalidate, vf fetch invalidate,
0x0eb81018:      0x00000000:    destination address
0x0eb8101c:      0x00000000:    immediate dword low
0x0eb81020:      0x00000000:    immediate dword high
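The flag annotations in the extract can be reproduced from the raw dwords. Below is a minimal Python sketch, assuming the bit positions listed in the table. They are inferred from the two annotated flag dwords in this dump (0x00101001 and 0x00000c10) and cover only the flags seen here, not the full PIPE_CONTROL layout. Treating bits 14-15 as the post-sync write mode is likewise an assumption, and the function name decode_pipe_control_flags is hypothetical.

```python
# Bit positions inferred from the annotated dump; deliberately incomplete.
PIPE_CONTROL_FLAGS = [
    (20, "cs stall"),
    (12, "render target cache flush"),
    (11, "instruction cache invalidate"),
    (10, "texture cache invalidate"),
    (4, "vf fetch invalidate"),
    (0, "depth cache flush"),
]


def decode_pipe_control_flags(dword):
    """Return the names of the known flags set in a PIPE_CONTROL flags dword."""
    # Assumed: bits 14-15 select the post-sync write; 0 means no write.
    write_mode = (dword >> 14) & 0x3
    names = ["no write" if write_mode == 0 else f"write mode {write_mode}"]
    names += [name for bit, name in PIPE_CONTROL_FLAGS if dword & (1 << bit)]
    return names


for value in (0x00101001, 0x00000C10):
    print(f"0x{value:08x}: {', '.join(decode_pipe_control_flags(value))}")
```

Read this way, the PIPE_CONTROL at 0x0eb80ffc is a stalling flush of the render and depth caches, and the one at the active head (0x0eb81010) is a cache invalidation.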

Comment 2 yann 2016-11-04 15:01:59 UTC

Please test a new version of Mesa (13) and mark as REOPENED
if you can reproduce and RESOLVED/* if you cannot reproduce.

Comment 3 Zbigniew Czapiga 2016-11-07 14:21:04 UTC

Created attachment 127815 [details]
Mesa 13 crashlog and system log

Comment 4 Zbigniew Czapiga 2016-11-07 14:22:49 UTC

Unfortunately, Mesa 13 crashes the same way as the previous version.

Comment 5 Mark Janes 2016-12-07 17:51:43 UTC

Zbigniew, are you using xf86-video-intel?

If so, please verify that you can reproduce after switching X to use modesetting:

https://bbs.archlinux.org/viewtopic.php?id=211792

Comment 6 Zbigniew Czapiga 2016-12-09 15:51:27 UTC

Created attachment 128392 [details]
crash log after switching to modesetting driver

Comment 7 Zbigniew Czapiga 2016-12-09 15:52:28 UTC

Hi Mark,

Yes, I was using xf86-video-intel, but I tried modesetting as well:

1. I added modesetting to /etc/X11/xorg.conf:
Section "Device"
    Identifier  "Intel Graphics"
    Driver      "modesetting"
EndSection

2. xrandr --listproviders confirmed that I was using modesetting:
Providers: number : 1
Provider 0: id: 0x45 cap: 0x9, Source Output, Sink Offload crtcs: 3 outputs: 3 associated providers: 0 name:modesetting

3. glxinfo showed DRI3:
LIBGL_DEBUG=verbose glxinfo 2>&1 | grep -i libgl
libGL: Using DRI3 for screen 0
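The provider check in step 2 can also be scripted. This is a minimal Python sketch that extracts the provider names from a listing in the format quoted above. The function name provider_names is hypothetical, and the parsing assumes each provider line ends with a "name:" field as it does in this output.

```python
import re


def provider_names(listing):
    """Return the provider names found in an xrandr provider listing."""
    return re.findall(r"name:\s*(\S+)", listing)


listing = (
    "Providers: number : 1\n"
    "Provider 0: id: 0x45 cap: 0x9, Source Output, Sink Offload "
    "crtcs: 3 outputs: 3 associated providers: 0 name:modesetting\n"
)
print(provider_names(listing))
```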

Unfortunately, after some time the driver crashed:

[Fri Dec  9 15:35:54 2016] [drm] stuck on render ring
[Fri Dec  9 15:35:54 2016] [drm] GPU HANG: ecode 7:0:0x85dffcfc, in chrome [1877], reason: Ring hung, action: reset
[Fri Dec  9 15:35:54 2016] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[Fri Dec  9 15:35:54 2016] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[Fri Dec  9 15:35:54 2016] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[Fri Dec  9 15:35:54 2016] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[Fri Dec  9 15:35:54 2016] [drm] GPU crash dump saved to /sys/class/drm/card0/error

Comment 8 Elizabeth 2018-03-21 22:45:49 UTC

Hello Zbigniew, were you able to find the fix for this issue (for example, updating/upgrading/manually building a specific driver version, etc)? Thank you.

Comment 9 GitLab Migration User 2019-09-25 18:57:34 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new issue on our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1532.
