92688 – [BDW] GPU HANG: ecode 8:0:0x86dffffd, in Xorg [2425], reason: Ring hung, action: reset

Status:	RESOLVED INVALID

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/DRI/i965
Version:	12.0
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium normal
Assignee:	Samuel Thibault
QA Contact:	Intel 3D Bugs Mailing List

URL:
Whiteboard:
Keywords:

Duplicates (1):	95019
Depends on:
Blocks:

Reported:	2015-10-27 09:49 UTC by Samuel Thibault
Modified:	2017-02-12 01:04 UTC
CC List:	2 users

See Also:
i915 platform:	BDW
i915 features:	GPU hang

Attachments
dmesg (67.49 KB, text/plain) 2015-10-27 09:49 UTC, Samuel Thibault
lspci output (5.32 KB, text/plain) 2015-10-27 09:50 UTC, Samuel Thibault
gpu crash dump from /sys/class/drm/card0/error (209.52 KB, text/plain) 2015-10-27 09:51 UTC, Samuel Thibault
Xorg log of the second crash (42.68 KB, text/plain) 2015-10-27 09:59 UTC, Samuel Thibault
new dmesg (1.19 KB, text/plain) 2016-11-06 16:36 UTC, Samuel Thibault
new gpu crash dump (274.06 KB, text/plain) 2016-11-06 16:37 UTC, Samuel Thibault

Description Samuel Thibault 2015-10-27 09:49:47 UTC

Created attachment 119217
dmesg

Hello,

On my new HP EliteBook 820 laptop, I keep getting GPU hangs (about 2-4 times a day; I got one again while filing this very bug report...). With various kernel versions (4.1.0, 4.2.[0-4]) I see various effects (frozen normal display, frozen black display, black display flickering and then returning to an unfrozen normal display, Xorg crash). Here is various information about the box and the crash.

Comment 1 Samuel Thibault 2015-10-27 09:50:03 UTC

Created attachment 119218
lspci output

Comment 2 Samuel Thibault 2015-10-27 09:51:14 UTC

Created attachment 119219
gpu crash dump from /sys/class/drm/card0/error

Comment 3 Samuel Thibault 2015-10-27 09:52:30 UTC

In dmesg, time 544-554 corresponds to the first crash of the day, and time 1382-1400 corresponds to the second crash.

Comment 4 Samuel Thibault 2015-10-27 09:59:51 UTC

Created attachment 119220
Xorg log of the second crash

Comment 5 Samuel Thibault 2015-10-27 10:04:09 UTC

And here are the Debian package versions I'm using (sorry for providing information bit by bit like this, but now I'm afraid of getting yet another crash while reporting...):

xserver-xorg-video-intel                 2:2.99.917-2
xserver-xorg-core                   2:1.17.2-3

Xorg is running as normal user.

I'm installing the debugging version of the driver for better backtrace in Xorg.log.

Comment 6 yann 2016-05-20 08:41:47 UTC

*** Bug 95019 has been marked as a duplicate of this bug. ***

Comment 7 yann 2016-09-21 14:48:31 UTC

Improvements have been pushed to the kernel and Mesa that should benefit your system, so please re-test with the latest kernel and Mesa to see whether this issue still occurs.

In the meantime, assigning to Mesa product.

Kernel: 4.2.4
Platform: Broadwell-U (pci id: 0x1616)
Mesa: [Please confirm your mesa version]

From this error dump, the hang is happening in the render ring batch, with the active head at 0x0a786b2c and 0x79000002 (3DSTATE_DRAWING_RECTANGLE) as IPEHR.

Batch extract (around 0x0a786b2c):

0x0a786b0c:      0x7b000005: 3DPRIMITIVE: fail sequential
0x0a786b10:      0x00000000:    vertex count
0x0a786b14:      0x0000011d:    start vertex
0x0a786b18:      0x00001946:    instance count
0x0a786b1c:      0x00000001:    start instance
0x0a786b20:      0x00000000:    index bias
0x0a786b24:      0x00000000: MI_NOOP
0x0a786b28:      0x79000002: 3DSTATE_DRAWING_RECTANGLE
0x0a786b2c:      0x00000000:    top left: 0,0
0x0a786b30:      0x039f04fd:    bottom right: 1277,927
0x0a786b34:      0x00000000:    origin: 0,0
Bad count in PIPE_CONTROL
0x0a786b38:      0x7a000004: PIPE_CONTROL: no write, no depth stall, no RC write flush, no inst flush
0x0a786b3c:      0x00101400:    destination address
0x0a786b40:      0x00000000:    immediate dword low
0x0a786b44:      0x00000000:    immediate dword high
0x0a786b50:      0x784d0000: 3D UNKNOWN: 3d_965 opcode = 0x784d
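
The packed corner value in the batch extract above can be unpacked by hand. The following is a minimal sketch, assuming the layout implied by the decoder output itself (x in the low 16 bits, y in the high 16 bits); that layout is inferred from the dump, not taken from hardware documentation, and the helper name `decode_xy` is hypothetical.

```python
def decode_xy(dword):
    """Split a packed 32-bit value into (x, y).

    Assumption (inferred from the dump above, not from hardware docs):
    x lives in bits 15:0 and y in bits 31:16.
    """
    x = dword & 0xFFFF
    y = (dword >> 16) & 0xFFFF
    return x, y


# The two corner dwords that follow the 3DSTATE_DRAWING_RECTANGLE header.
top_left = decode_xy(0x00000000)
bottom_right = decode_xy(0x039F04FD)

print("top left:", top_left)
print("bottom right:", bottom_right)
```

Under that assumption, 0x039f04fd splits into 0x04fd (1277) and 0x039f (927), which matches the "bottom right: 1277,927" line printed by the decoder.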

Comment 8 Matt Turner 2016-11-04 00:45:27 UTC

Please test a new version of Mesa (12 or 13) and mark as REOPENED
if you can reproduce and RESOLVED/* if you cannot reproduce.

Comment 9 Samuel Thibault 2016-11-06 16:36:27 UTC

Created attachment 127799
new dmesg

Comment 10 Samuel Thibault 2016-11-06 16:37:33 UTC

Created attachment 127800
new gpu crash dump

Comment 11 Samuel Thibault 2016-11-06 16:38:56 UTC

Yes, I'm still experiencing the crash with:

- MESA 12.0.3-3
- 2:1.18.4-2
- linux 4.8.0

Comment 12 Mark Janes 2016-12-07 17:57:42 UTC

Samuel, this is probably caused by xf86-video-intel. Can you reproduce it with the modesetting DDX?

/etc/X11/xorg.conf.d/20-modesetting.conf

Section "Device"
    Identifier  "Intel Graphics"
    Driver      "modesetting"
    Option      "AccelMethod"    "glamor"
    Option      "DRI"            "3"
EndSection

Comment 13 Samuel Thibault 2016-12-07 18:18:23 UTC

Oh, I hadn't noticed that it was using the intel driver, that's odd. Now it's really using modesetting. I have re-enabled the default options, which enables glamor acceleration, and I will test for some time and report back.
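
One way to confirm which DDX driver Xorg ended up with is to scan the Xorg log for module load and unload messages. The following is a minimal sketch, assuming the usual `LoadModule: "name"` and `UnloadModule: "name"` log lines; the function name `active_ddx_drivers`, the candidate driver set, and the sample log text are all hypothetical, and a module that stays loaded is only a strong hint, not proof, that it drives the screen.

```python
import re

# Matches lines such as: (II) LoadModule: "intel"  /  (II) UnloadModule: "modesetting"
MODULE_RE = re.compile(r'\b(Load|Unload)Module:\s+"([^"]+)"')

# Only the two DDX drivers discussed in this report are considered (an assumption).
CANDIDATE_DDX = ("intel", "modesetting")


def active_ddx_drivers(log_text):
    """Return candidate DDX drivers that were loaded and not unloaded afterwards."""
    active = []
    for line in log_text.splitlines():
        match = MODULE_RE.search(line)
        if not match:
            continue
        action, name = match.groups()
        if name not in CANDIDATE_DDX:
            continue
        if action == "Load" and name not in active:
            active.append(name)
        elif action == "Unload" and name in active:
            active.remove(name)
    return active


# Hypothetical log excerpt, for illustration only.
sample_log = """\
[    10.100] (II) LoadModule: "glx"
[    10.200] (II) LoadModule: "intel"
[    10.300] (II) LoadModule: "modesetting"
[    10.400] (II) UnloadModule: "modesetting"
"""

print(active_ddx_drivers(sample_log))
```

To use it on a real system, read the contents of the Xorg.0.log file into a string and pass it to `active_ddx_drivers`; the log location depends on whether Xorg runs as root or as a normal user.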

Comment 14 Annie 2017-02-10 22:32:44 UTC

Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

Comment 15 Samuel Thibault 2017-02-12 01:04:51 UTC

Well, yes, I know...

The problem is that for the past two months, whenever I try to re-enable glamor acceleration, I get a hard freeze within a few hours of work, with no information in dmesg or Xorg.0.log.

At this point I have pretty much given up on enabling acceleration.
