Bug 80199 - [hsw] GPU HANG: ecode 0:0x85dffffd stuck on render ring
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-06-18 18:03 UTC by webstrand
Modified: 2017-07-24 22:53 UTC
CC List: 3 users

i915 platform: HSW
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (2.67 MB, text/plain)
2014-06-18 18:03 UTC, webstrand
/sys/class/drm/card0/error #2 (65.01 KB, application/octet-stream)
2014-07-30 13:34 UTC, Martin Andersen
/sys/class/drm/card0/error.gz (418.26 KB, text/plain)
2015-01-14 19:50 UTC, webstrand

Description webstrand 2014-06-18 18:03:45 UTC
Created attachment 101318 [details]
/sys/class/drm/card0/error

[   85.435367] [drm] stuck on render ring
[   85.436107] [drm] GPU HANG: ecode 0:0x85dffffd, in X [593], reason: Ring hung, action: reset
[   85.436111] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   85.436113] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   85.436114] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   85.436116] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   85.436118] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   87.437121] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[   91.367204] [drm] stuck on render ring
[   91.368537] [drm] GPU HANG: ecode 0:0x85dffffd, in X [593], reason: Ring hung, action: reset
[   91.368699] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!
[   93.369002] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[  230.516279] [drm] stuck on render ring
[  230.517059] [drm] GPU HANG: ecode 0:0x85dffffd, in glxspheres64 [3870], reason: Ring hung, action: reset
[  232.517981] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[  236.528072] [drm] stuck on render ring
[  236.529124] [drm] GPU HANG: ecode 0:0x85dfbffb, in glxspheres64 [3870], reason: Ring hung, action: reset
[  238.529891] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off


My distribution is Arch Linux with the stock kernel 3.15.1-1. I do not see any graphical corruption; the display simply freezes for a few seconds.

The last pre-compiled kernel that does not exhibit this hang is 3.12.9-2.
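
For anyone comparing hangs across reports, the "GPU HANG" lines in the log above can be pulled apart mechanically. The following is only a minimal sketch: the names `HANG_RE`, `HangEvent`, `parse_hang` and `collect_hangs` are made up for illustration, and the regular expression is an assumption based solely on the line format seen in this report, not on any documented kernel log format.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed line shape, taken from the dmesg excerpts in this report:
#   [drm] GPU HANG: ecode 0:0x85dffffd, in X [593], reason: Ring hung, action: reset
# Newer kernels in this thread print a three-part ecode such as 7:0:0x85dffffd.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fx:]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


class HangEvent(NamedTuple):
    ecode: str
    process: str
    pid: int
    reason: str
    action: str


def parse_hang(line: str) -> Optional[HangEvent]:
    """Return the fields of a GPU HANG line, or None for any other line."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return HangEvent(m["ecode"], m["process"], int(m["pid"]), m["reason"], m["action"])


def collect_hangs(lines: Iterable[str]) -> List[HangEvent]:
    """Keep only the lines that describe a GPU hang, in their original order."""
    return [event for event in map(parse_hang, lines) if event is not None]
```

Feeding the output of dmesg through `collect_hangs` gives one record per hang, which makes it easier to see whether the ecode or the hanging process changes between occurrences.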
Comment 1 Chris Wilson 2014-06-18 18:18:18 UTC
That's different. It died after the second CA glyph pass. The batch is free of the most recent errors and doesn't have any corruption. Is this a regression (from 3.14)? Would you be able to bisect?
Comment 2 webstrand 2014-06-18 23:38:07 UTC
(In reply to comment #1)
> That's different. It died after the second CA glyph pass. The batch is free
> of the most recent errors and doesn't have any corruption. Is this a
> regression (from 3.14)? Would you be able to bisect?

I don't know whether this is a recent regression. I just updated my kernel from 3.12.9 to 3.15.1 because I encountered similar crashes with the intermediate kernels.

I can bisect, but I do not know what range I should bisect over.
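
For reference, the range implied by this report would be 3.12.9 (last known good) to 3.15.1 (known bad). In a kernel git tree that roughly corresponds to `git bisect start <bad> <good>` with the matching release tags. The idea behind bisection can be sketched as below; `first_bad` and the `is_bad` callback are hypothetical names, and the sketch assumes that the versions are ordered oldest to newest and that every version after the first bad one also hangs.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_bad() reports a hang.

    Assumes versions[0] is known good and versions[-1] is known bad,
    so neither endpoint is tested again.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return versions[bad]
```

Each call to `is_bad` stands for building and booting one kernel and checking for the hang, so the number of test boots grows only with the logarithm of the number of candidates.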
Comment 3 Kees Bakker 2014-07-13 19:29:52 UTC
Is it OK to attach my card0/error here too? Or should I open a new bug report?
I have been seeing this error for a few weeks now, probably since I installed Ubuntu 14.04
(which has kernel 3.10.x). In the meantime I installed 3.14 and 3.15, but the
bug keeps popping up. Here is a syslog from a few moments ago.

When this happens, some windows show odd characters and other
artefacts, and the system is unresponsive for a few seconds.

Jul 13 21:15:40 rapper kernel: [258170.239604] [drm] stuck on render ring
Jul 13 21:15:40 rapper kernel: [258170.240267] [drm] GPU HANG: ecode 0:0xf5fffffe, in Xorg [1854], reason: Ring hung, action: reset
Jul 13 21:15:40 rapper kernel: [258170.240269] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jul 13 21:15:40 rapper kernel: [258170.240270] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jul 13 21:15:40 rapper kernel: [258170.240270] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jul 13 21:15:40 rapper kernel: [258170.240271] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jul 13 21:15:40 rapper kernel: [258170.240272] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jul 13 21:15:42 rapper kernel: [258172.242543] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off
Comment 4 Martin Andersen 2014-07-30 13:30:06 UTC
I don't mean to hijack this bug report in any way, but I wanted to mention that I am also seeing the same or similar behaviour on my Ivy Bridge / HD4000 (i3-3225) system, using Ubuntu's mainline kernel 3.15.6-031506-generic.

Posting here as I didn't want to create a duplicate bug report. 
(let me know if I actually should do just that.)

[Mon Jul 28 21:36:47 2014] [drm] stuck on render ring
[Mon Jul 28 21:36:47 2014] [drm] GPU HANG: ecode 0:0xf5f7fffe, in cinnamon [2586], reason: Ring hung, action: reset
[Mon Jul 28 21:36:47 2014] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[Mon Jul 28 21:36:47 2014] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[Mon Jul 28 21:36:47 2014] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[Mon Jul 28 21:36:47 2014] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[Mon Jul 28 21:36:47 2014] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[Mon Jul 28 21:36:49 2014] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off
[Tue Jul 29 00:32:59 2014] [drm] stuck on render ring
[Tue Jul 29 00:32:59 2014] [drm] GPU HANG: ecode 0:0xf5f7fffe, in cinnamon [2586], reason: Ring hung, action: reset
[Tue Jul 29 00:33:01 2014] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off
[Tue Jul 29 07:53:25 2014] [drm] stuck on render ring
[Tue Jul 29 07:53:25 2014] [drm] GPU HANG: ecode 0:0xf5f7fffe, in cinnamon [2586], reason: Ring hung, action: reset
[Tue Jul 29 07:53:27 2014] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off
[Tue Jul 29 09:21:32 2014] [drm] stuck on render ring
[Tue Jul 29 09:21:32 2014] [drm] GPU HANG: ecode 0:0xf5f7fffe, in cinnamon [2586], reason: Ring hung, action: reset
[Tue Jul 29 09:21:34 2014] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off

[...]
[Tue Jul 29 19:38:10 2014] [drm] stuck on render ring
[Tue Jul 29 19:38:10 2014] [drm] GPU HANG: ecode 0:0xf5f7fffe, in cinnamon [2586], reason: Ring hung, action: reset
[Tue Jul 29 19:38:12 2014] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off
[Tue Jul 29 21:06:16 2014] [drm] stuck on render ring
[Tue Jul 29 21:06:16 2014] [drm] GPU HANG: ecode 0:0xf5f7fffe, in cinnamon [2586], reason: Ring hung, action: reset
[Tue Jul 29 21:06:18 2014] [drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off
Comment 5 Martin Andersen 2014-07-30 13:34:55 UTC
Created attachment 103684 [details]
/sys/class/drm/card0/error #2

GPU crash dump associated with the previous dmesg entries
Comment 6 Chris Wilson 2014-07-30 19:53:07 UTC
(In reply to comment #4)
> Don't mean to hijack this bugreport in any way, but I wanted to mention that
> I am also seeing the same/similar behaviour on my Ivy Bridge / HD4000
> (i3-3225) system, using Ubuntu's mainline kernel 3.15.6-031506-generic
> 
> Posting here as I didn't want to create a duplicate bug report. 
> (let me know if I actually should do just that.)

In future, please do create new bug reports as deciding whether it is a duplicate is fairly tricky. In this case, you actually have bug 77104.
Comment 7 Chris Wilson 2014-12-21 20:05:12 UTC
Is there any chance you can retest with the latest kernel or bisect the change that introduced the hangs between 3.12 and 3.15?
Comment 8 webstrand 2015-01-14 19:48:50 UTC
On the current mainline kernel, 3.19-rc4 as of 2015-01-11, I can still trigger the GPU hang in a few different ways:

[   48.398180] [drm] stuck on render ring
[   48.398984] [drm] GPU HANG: ecode 7:0:0x85dffffd, in Xorg.bin [354], reason: Ring hung, action: reset
[   48.398988] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   48.398989] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   48.398991] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   48.398993] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   48.398994] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   48.399059] [drm:intel_pipe_set_base [i915]] *ERROR* pin & fence failed
[   48.404834] drm/i915: Resetting chip after gpu hang
[   54.389676] [drm] stuck on render ring
[   54.390431] [drm] GPU HANG: ecode 7:0:0x85dffffd, in Xorg.bin [354], reason: Ring hung, action: reset
[   54.390500] [drm:intel_pipe_set_base [i915]] *ERROR* pin & fence failed
[   54.390513] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
[   54.396341] drm/i915: Resetting chip after gpu hang

After a reboot, I can trigger this error exactly once by running the command:

xrandr --output HDMI1 --off --output VIRTUAL1 --off --output eDP1 --mode 1920x1080 --pos 0x0 --rotate normal --output VGA1 --off

Starting chromium without the flag --disable-gpu always produces the error:

[  561.799135] [drm] stuck on render ring
[  561.799939] [drm] GPU HANG: ecode 7:0:0x85dfbff8, in chromium [917], reason: Ring hung, action: reset
[  561.805793] drm/i915: Resetting chip after gpu hang
Comment 9 webstrand 2015-01-14 19:50:54 UTC
Created attachment 112251 [details]
/sys/class/drm/card0/error.gz

I had to gzip the file because it was over the file size limit.
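
For others hitting the same size limit, compressing the dump can be sketched as follows. This is an illustration only: the function name `compress_error_state` and the default output name are made up, and it assumes the dump is readable at the path the kernel message reports (reading it may require root).

```python
import gzip
import shutil
from pathlib import Path


def compress_error_state(src="/sys/class/drm/card0/error", dst="card0-error.gz"):
    """Copy the GPU crash dump at src into a gzip file at dst.

    The source path is the one printed in the kernel log; the destination
    name is an arbitrary choice for this sketch.
    """
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        # Stream in chunks so a multi-megabyte dump is never held in memory.
        shutil.copyfileobj(fin, fout)
    return Path(dst)
```

The dump is plain text, so it usually compresses well; the first attachment on this bug was 2.67 MB uncompressed.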
Comment 10 Jani Nikula 2015-10-23 09:51:57 UTC
Timeout, closing. Please reopen if the problem persists with latest kernels.

