Bug 94002 - [SKL] GPU HANG: ecode 9:0:0x85dfbfff, in firefox [881], reason: Ring hung, action: reset
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 11.1
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Ian Romanick
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-04 12:46 UTC by Yuval Adam
Modified: 2017-02-10 22:39 UTC
CC List: 6 users

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
bzip'd /sys/class/drm/card0/error (46.51 KB, application/x-bzip2)
2016-02-04 12:46 UTC, Yuval Adam

Description Yuval Adam 2016-02-04 12:46:40 UTC
Created attachment 121518
bzip'd /sys/class/drm/card0/error

On a Skylake system (i5-6260U, Intel NUC6i5SYH) I'm getting repeated GPU hangs, up to the point of a complete system freeze. This reproduces easily by opening a browser with WebGL-intensive content, and the hang occurs within 10-20 seconds.

Running x86_64 Arch Linux, kernel 4.4.1. Also tested with 4.5-rc2, and the same problem occurs there.

Error log attached.
Comment 1 Yuval Adam 2016-02-04 12:47:51 UTC
Currently I'm using this workaround, which disables most of the hardware acceleration:

$ cat /etc/X11/xorg.conf.d/20-intel.conf 
Section "Device"
	Identifier  "Intel Graphics"
	Driver      "intel"
	Option	    "DRI"	"false"
EndSection
Comment 2 Yuval Adam 2016-02-08 22:28:34 UTC
Booting with the kernel parameter i915.enable_rc6=0 seems to work around this bug.
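To confirm that the parameter actually reached the kernel, the boot command line can be checked programmatically. The following is a minimal Python sketch, not part of the original report; the function name `rc6_disabled_on_cmdline` is hypothetical, and treating a later occurrence of the parameter as overriding an earlier one is an assumption.

```python
def rc6_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if the command line sets i915.enable_rc6 to 0.

    Assumption: if the parameter appears more than once, the last
    occurrence is the one that takes effect.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == "i915.enable_rc6":
            value = val
    return value == "0"


# Example with a placeholder command line string.
print(rc6_disabled_on_cmdline("initrd=xxx root=xxx i915.enable_rc6=0"))
```

On a running Linux system the string would typically be read from /proc/cmdline instead of being hard-coded.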
Comment 3 Yuval Adam 2016-02-14 16:43:04 UTC
Some relevant details from my current Xorg.0.log (with RC6 DISABLED! - I'm currently booted with the workaround mentioned earlier):

$ cat ~/.local/share/xorg/Xorg.0.log
X.Org X Server 1.18.0
Release Date: 2015-11-09
[    19.987] X Protocol Version 11, Revision 0
[    19.987] Build Operating System: Linux 4.2.5-1-ARCH x86_64 
[    19.987] Current Operating System: Linux hostname 4.4.1-2-ARCH #1 SMP PREEMPT Wed Feb 3 13:12:33 UTC 2016 x86_64
[    19.987] Kernel command line: initrd=xxx root=xxx i915.enable_rc6=0
[    19.988] Build Date: 08 January 2016  05:56:16PM
[    19.988]  
[    19.988] Current version of pixman: 0.34.0
[    19.988]    Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
[    19.988] Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[    19.988] (==) Log file: "~/.local/share/xorg/Xorg.0.log", Time: Mon Feb  8 23:11:38 2016
[    19.989] (==) Using config directory: "/etc/X11/xorg.conf.d"
[    19.989] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[    19.990] (==) No Layout section.  Using the first Screen section.
[    19.990] (==) No screen section available. Using defaults.
[    19.991] (**) |-->Screen "Default Screen Section" (0)
[    19.991] (**) |   |-->Monitor "<default monitor>"
[    19.991] (==) No device specified for screen "Default Screen Section".
        Using the first device section listed.
[    19.991] (**) |   |-->Device "Intel Graphics"
[    19.991] (==) No monitor specified for screen "Default Screen Section".
        Using a default monitor configuration.
[    19.991] (==) Automatically adding devices
[    19.991] (==) Automatically enabling devices
[    19.991] (==) Automatically adding GPU devices
[    19.991] (==) Max clients allowed: 256, resource mask: 0x1fffff
[    19.993] (==) FontPath set to:
        /usr/share/fonts/misc/,
        /usr/share/fonts/TTF/,
        /usr/share/fonts/OTF/,
        /usr/share/fonts/Type1/,
        /usr/share/fonts/100dpi/,
        /usr/share/fonts/75dpi/
[    19.993] (==) ModulePath set to "/usr/lib/xorg/modules"
[    19.993] (II) The server relies on udev to provide the list of input devices.
        If no devices become available, reconfigure udev or disable AutoAddDevices.
[    19.993] (II) Loader magic: 0x819d40
[    19.993] (II) Module ABI versions:
[    19.993]    X.Org ANSI C Emulation: 0.4
[    19.993]    X.Org Video Driver: 20.0
[    19.993]    X.Org XInput driver : 22.1
[    19.993]    X.Org Server Extension : 9.0
[    19.994] (++) using VT number 1

[    19.994] (--) controlling tty is VT number 1, auto-enabling KeepTty
[    19.995] (II) systemd-logind: took control of session /org/freedesktop/login1/session/c1
[    19.995] (II) xfree86: Adding drm device (/dev/dri/card0)
[    19.996] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 8 paused 0
[    20.218] (--) PCI:*(0:0:2:0) 8086:1926:8086:2063 rev 10, Mem @ 0xde000000/16777216, 0xc0000000/268435456, I/O @ 0x0000f000/64
[    20.218] (WW) Open ACPI failed (/var/run/acpid.socket) (No such file or directory)
[    20.218] (II) LoadModule: "glx"
[    20.220] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[    20.237] (II) Module glx: vendor="X.Org Foundation"
[    20.238]    compiled for 1.18.0, module version = 1.0.0
[    20.238]    ABI class: X.Org Server Extension, version 9.0
[    20.238] (==) AIGLX enabled
[    20.238] (II) LoadModule: "intel"
[    20.238] (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
[    20.242] (II) Module intel: vendor="X.Org Foundation"
[    20.242]    compiled for 1.18.0, module version = 2.99.917
[    20.242]    Module class: X.Org Video Driver
[    20.242]    ABI class: X.Org Video Driver, version 20.0
[    20.242] (II) intel: Driver for Intel(R) Integrated Graphics Chipsets:
        i810, i810-dc100, i810e, i815, i830M, 845G, 854, 852GM/855GM, 865G,
        915G, E7221 (i915), 915GM, 945G, 945GM, 945GME, Pineview GM,
        Pineview G, 965G, G35, 965Q, 946GZ, 965GM, 965GME/GLE, G33, Q35, Q33,
        GM45, 4 Series, G45/G43, Q45/Q43, G41, B43
[    20.243] (II) intel: Driver for Intel(R) HD Graphics: 2000-6000
[    20.243] (II) intel: Driver for Intel(R) Iris(TM) Graphics: 5100, 6100
[    20.243] (II) intel: Driver for Intel(R) Iris(TM) Pro Graphics: 5200, 6200, P6300
[    20.243] xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)
[    20.244] (II) intel(0): Using Kernel Mode Setting driver: i915, version 1.6.0 20151010
[    20.244] (II) intel(0): SNA compiled from 2.99.917-519-g8229390
[    20.246] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[    20.247] (--) intel(0): gen9 engineering sample
[    20.247] (--) intel(0): CPU: x86-64, sse2, sse3, ssse3, sse4.1, sse4.2, avx, avx2; using a maximum of 2 threads
[    20.247] (II) intel(0): Creating default Display subsection in Screen section
        "Default Screen Section" for depth/fbbpp 24/32
[    20.247] (==) intel(0): Depth 24, (--) framebuffer bpp 32
[    20.247] (==) intel(0): RGB weight 888
[    20.248] (==) intel(0): Default visual is TrueColor
[    20.248] (**) intel(0): Option "TearFree" "true"
[    20.249] (II) intel(0): Output HDMI1 has no monitor section
[    20.249] (II) intel(0): Enabled output HDMI1
[    20.249] (II) intel(0): Output DP1 has no monitor section
[    20.249] (II) intel(0): Enabled output DP1
[    20.249] (II) intel(0): Output HDMI2 has no monitor section
[    20.249] (II) intel(0): Enabled output HDMI2
[    20.249] (--) intel(0): Using a maximum size of 256x256 for hardware cursors
[    20.249] (II) intel(0): Output VIRTUAL1 has no monitor section
[    20.249] (II) intel(0): Enabled output VIRTUAL1
[    20.249] (--) intel(0): Output HDMI1 using initial mode 2560x1440 on pipe 0
[    20.249] (**) intel(0): TearFree enabled
[    20.249] (==) intel(0): DPI set to (96, 96)
[    20.250] (II) Loading sub module "dri2"
[    20.250] (II) LoadModule: "dri2"
[    20.250] (II) Module "dri2" already built-in
[    20.250] (II) Loading sub module "present"
[    20.250] (II) LoadModule: "present"
[    20.250] (II) Module "present" already built-in
[    20.250] (==) Depth 24 pixmap format is 32 bpp
[    20.253] (II) intel(0): SNA initialized with generic backend
[    20.253] (==) intel(0): Backing store enabled
[    20.253] (==) intel(0): Silken mouse enabled
[    20.253] (II) intel(0): HW Cursor enabled
[    20.253] (II) intel(0): RandR 1.2 enabled, ignore the following RandR disabled message.
[    20.255] (==) intel(0): DPMS enabled
[    20.255] (==) intel(0): Display hotplug detection enabled
[    20.255] (II) intel(0): Textured video not supported on this hardware or backend
[    20.256] (II) intel(0): [DRI2] Setup complete
[    20.256] (II) intel(0): [DRI2]   DRI driver: i965
[    20.256] (II) intel(0): [DRI2]   VDPAU driver: va_gl
[    20.256] (II) intel(0): direct rendering: DRI2 enabled
[    20.256] (II) intel(0): hardware support for Present enabled
[    20.256] (--) RandR disabled
[    20.330] (II) AIGLX: enabled GLX_MESA_copy_sub_buffer
[    20.330] (II) AIGLX: enabled GLX_ARB_create_context
[    20.330] (II) AIGLX: enabled GLX_ARB_create_context_profile
[    20.330] (II) AIGLX: enabled GLX_EXT_create_context_es2_profile
[    20.330] (II) AIGLX: enabled GLX_INTEL_swap_event
[    20.330] (II) AIGLX: enabled GLX_SGI_swap_control and GLX_MESA_swap_control
[    20.330] (II) AIGLX: enabled GLX_EXT_framebuffer_sRGB
[    20.330] (II) AIGLX: enabled GLX_ARB_fbconfig_float
[    20.330] (II) AIGLX: GLX_EXT_texture_from_pixmap backed by buffer objects
[    20.330] (II) AIGLX: enabled GLX_ARB_create_context_robustness
[    20.331] (II) AIGLX: Loaded and initialized i965
[    20.331] (II) GLX: Initialized DRI2 GL provider for screen 0
[    20.338] (II) intel(0): switch to mode 2560x1440@60.0 on HDMI1 using pipe 0, position (0, 0), rotation normal, reflection none
[    20.338] (II) intel(0): Setting screen physical size to 677 x 381
[    20.378] (II) config/udev: Adding input device Power Button (/dev/input/event2)
[    20.379] (**) Power Button: Applying InputClass "evdev keyboard catchall"
[    20.379] (**) Power Button: Applying InputClass "libinput keyboard catcha
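For a log of this size it can help to filter by the marker legend printed near the top of the log, for example to list only the (WW) and (EE) lines. The following is a rough Python sketch, assuming each tagged line has the form `[ timestamp] (XX) message`; the function names are hypothetical, and untagged or continuation lines are simply skipped.

```python
import re

# Marker legend as printed near the top of Xorg.0.log.
MARKERS = {
    "--": "probed", "**": "from config file", "==": "default setting",
    "++": "from command line", "!!": "notice", "II": "informational",
    "WW": "warning", "EE": "error", "NI": "not implemented", "??": "unknown",
}

# "[    20.218] (WW) message": timestamp, two-character marker, message.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\((..)\)\s+(.*)$")


def parse_xorg_line(line: str):
    """Return (timestamp, marker, message), or None for untagged lines."""
    m = LINE_RE.match(line)
    if not m or m.group(2) not in MARKERS:
        return None
    return float(m.group(1)), m.group(2), m.group(3)


def warnings_and_errors(lines):
    """Keep only the parsed lines tagged as warning (WW) or error (EE)."""
    parsed = (parse_xorg_line(line) for line in lines)
    return [p for p in parsed if p and p[1] in ("WW", "EE")]
```

Applied to the excerpt above, this would keep the "Open ACPI failed" and "VGA arbiter" warnings and drop the informational lines.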
Comment 4 Yuval Adam 2016-02-14 16:55:16 UTC
Also attaching my current mesa version:

$ pacman -Q mesa
mesa 11.1.2-1
Comment 5 Ben Widawsky 2016-02-14 17:16:05 UTC
INSTDONE says VS is hung, but the EUs are all done. Would be nice if we had that patch I wrote to dump all the INSTDONE bits for multislice parts, as this is a GT3.... Imre?

The batch contents leading up to the hang look okay to me. It's not even doing anything complicated, just a single rectangle (rectlist).
Comment 6 Fabrice Crohas 2016-02-27 17:20:22 UTC
Same issue for me, with exactly the same device, on Linux kernel 4.4.1.

The workaround works well: adding i915.enable_rc6=0 to the kernel parameters does the job.

The xorg.conf change is not needed.
Comment 7 ggg 2016-05-11 03:57:30 UTC
I'm seeing the same thing when trying to run Chrome (with "HW acceleration enabled") and openarena:
[265964.933851] [drm] stuck on render ring
[265964.934358] [drm] GPU HANG: ecode 9:0:0x85dfbfff, in ioquake3 [13145], reason: Ring hung, action: reset
[265964.936862] drm/i915: Resetting chip after gpu hang
[265966.933810] [drm] RC6 on
[265978.932434] [drm] stuck on render ring
[265978.933048] [drm] GPU HANG: ecode 9:0:0x85dfbfff, in ioquake3 [13145], reason: Ring hung, action: reset
[265978.935527] drm/i915: Resetting chip after gpu hang
[265980.920236] [drm] RC6 on
[265986.931547] [drm] stuck on render ring
[265986.932296] [drm] GPU HANG: ecode 9:0:0x85dfffff, in ioquake3 [13145], reason: Ring hung, action: reset
[265986.934802] drm/i915: Resetting chip after gpu hang
[265988.919528] [drm] RC6 on

I have tried with RC6 off, but ISTR that also failing (though it could have been a slightly different failure). I will report back with the next update.
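When comparing hang reports across comments, it can be useful to pull the fields out of the "GPU HANG" lines and count how often each error code appears. Below is a minimal Python sketch; the function names are hypothetical, and reading the three ecode fields as hardware generation, ring id, and error hash is an assumption rather than something stated in this thread.

```python
import re
from collections import Counter

HANG_RE = re.compile(
    r"GPU HANG: ecode (\d+):(\d+):(0x[0-9a-fA-F]+), "
    r"in (.+?) \[(\d+)\], reason: (.+?), action: (\w+)"
)


def parse_gpu_hang(line: str):
    """Extract the fields of an i915 'GPU HANG' line, or None if absent."""
    m = HANG_RE.search(line)
    if not m:
        return None
    gen, ring, code, process, pid, reason, action = m.groups()
    return {
        "gen": int(gen),        # assumed: hardware generation
        "ring": int(ring),      # assumed: ring/engine id
        "ecode": code.lower(),  # error hash as printed
        "process": process,
        "pid": int(pid),
        "reason": reason,
        "action": action,
    }


def count_ecodes(lines):
    """Count occurrences of each ecode hash among the hang lines."""
    hangs = (parse_gpu_hang(line) for line in lines)
    return Counter(h["ecode"] for h in hangs if h)
```

For the three hang lines in this comment, `count_ecodes` would report 0x85dfbfff twice and 0x85dfffff once, which makes it easier to spot that the third hang carries a different code.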
Comment 8 monochromec 2016-05-25 19:25:36 UTC
Also happens on a stock Ubuntu 16.04 running E17:

kernel: [drm] GPU HANG: ecode 6:0:0xb6c81bfc, in enlightenment [2513], reason: Ring hung, action: reset

Steps to reproduce:

1) Suspend machine
2) Wake up again
3) X11 crashes and takes E17 with it

For further details cf. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1585765
Comment 9 yann 2016-09-13 10:41:45 UTC
Several SKL workarounds have been pushed to the kernel since this was reported, so please re-test with the latest kernel to see whether it improves things.

In parallel, assigning to Mesa product (please let me know if I am mistaken with this GPU Hang).

From this error dump, the hang is happening in the render ring batch, with the active head at 0xfff5d9cc and 0x7a000004 (PIPE_CONTROL) as IPEHR.

Batch extract (around 0xfff5d9cc):

Bad length 7 in (null), expected 6-6
0xfff5d980:      0x7b000005: 3DPRIMITIVE: fail sequential
0xfff5d984:      0x0000000f:    vertex count
0xfff5d988:      0x00000003:    start vertex
0xfff5d98c:      0x00000000:    instance count
0xfff5d990:      0x00000001:    start instance
0xfff5d994:      0x00000000:    index bias
0xfff5d998:      0x00000000: MI_NOOP
Bad count in PIPE_CONTROL
0xfff5d99c:      0x7a000004: PIPE_CONTROL: no write, no depth stall, no RC write flush, no inst flush
0xfff5d9a0:      0x00000000:    destination address
0xfff5d9a4:      0x00000000:    immediate dword low
0xfff5d9a8:      0x00000000:    immediate dword high
Bad count in PIPE_CONTROL
0xfff5d9b4:      0x7a000004: PIPE_CONTROL: no write, no depth stall, no RC write flush, no inst flush
0xfff5d9b8:      0x00101c11:    destination address
0xfff5d9bc:      0x00000000:    immediate dword low
0xfff5d9c0:      0x00000000:    immediate dword high
Bad count in PIPE_CONTROL
0xfff5d9cc:      0x7a000004: PIPE_CONTROL: no write, no depth stall, no RC write flush, no inst flush
0xfff5d9d0:      0x00000000:    destination address
0xfff5d9d4:      0x00000000:    immediate dword low
0xfff5d9d8:      0x00000000:    immediate dword high
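The header dwords in this extract can be split into fields to sanity-check the command lengths. The following Python sketch assumes the usual 3D-pipeline command header layout (command type in bits 31:29, subtype in 28:27, opcode in 26:24, sub-opcode in 23:16, and a DWord length in bits 7:0 that excludes the first two dwords). That layout is an assumption here, not something confirmed in this thread, and the names are hypothetical.

```python
from typing import NamedTuple


class CmdHeader(NamedTuple):
    cmd_type: int
    subtype: int
    opcode: int
    sub_opcode: int
    dword_length: int

    @property
    def total_dwords(self) -> int:
        # Assumed bias: the length field excludes the first two dwords.
        return self.dword_length + 2


def decode_header(dword: int) -> CmdHeader:
    """Split a command header dword into its assumed bit fields."""
    return CmdHeader(
        cmd_type=(dword >> 29) & 0x7,
        subtype=(dword >> 27) & 0x3,
        opcode=(dword >> 24) & 0x7,
        sub_opcode=(dword >> 16) & 0xFF,
        dword_length=dword & 0xFF,
    )


def next_command_address(addr: int, dword: int) -> int:
    """Address of the command following the one that starts at addr."""
    return addr + 4 * decode_header(dword).total_dwords


# PIPE_CONTROL at 0xfff5d99c: where would the next command start?
print(hex(next_command_address(0xFFF5D99C, 0x7A000004)))
```

Under these assumptions 0x7a000004 decodes to a 6-dword command and 0x7b000005 to a 7-dword command, and the resulting strides (0xfff5d980 to 0xfff5d99c to 0xfff5d9b4 to 0xfff5d9cc) line up with the addresses in the extract. That would suggest the "Bad length" and "Bad count" messages come from the decoding tool expecting older command lengths rather than from a malformed batch, though this is only an inference.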
Comment 10 yann 2016-11-04 15:33:00 UTC
Please test a new version of Mesa (12 or 13) and mark as REOPENED
if you can reproduce and RESOLVED/* if you cannot reproduce.
Comment 11 Annie 2017-02-10 22:39:01 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

