72089 – [865G] [SNA] GPU Hang: freeze: stuck on render ring

Summary: [865G] [SNA] GPU Hang: freeze: stuck on render ring

Status:	CLOSED FIXED

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	unspecified
Hardware:	x86 (IA32) Linux (All)

Importance:	medium normal
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2013-11-27 23:03 UTC by Götz
Modified:	2017-07-24 22:56 UTC
CC List:	2 users

See Also:
i915 platform:	I865G
i915 features:	GPU hang

Attachments
i915_error_state (716.42 KB, text/plain) 2013-11-27 23:03 UTC, Götz	no flags
Xorg.0.log (17.82 KB, text/plain) 2013-11-27 23:04 UTC, Götz	no flags
intel_reg_dumper.txt (12.41 KB, text/plain) 2013-11-27 23:06 UTC, Götz	no flags
GPU crash report from /sys/class/drm/card0/error (2.23 MB, text/plain) 2013-12-21 16:19 UTC, onorua	no flags
full dmesg (20.67 KB, text/plain) 2013-12-21 16:20 UTC, onorua	no flags
/sys/class/drm/card0/error after error (2.17 MB, text/plain) 2014-01-11 19:10 UTC, Oleksandr Natalenko	no flags
dmesg (43.50 KB, text/plain) 2014-11-21 15:00 UTC, Eugene	no flags
Xorg.0.log (19.60 KB, text/plain) 2014-11-21 15:01 UTC, Eugene	no flags
/sys/class/drm/card0/error (676.23 KB, text/plain) 2014-11-21 15:04 UTC, Eugene	no flags
/sys/class/drm/card0/error kernel 4.4.0 (696.20 KB, text/plain) 2016-01-19 23:14 UTC, Götz	no flags

Description Götz 2013-11-27 23:03:19 UTC

Created attachment 89922 [details]
i915_error_state

[865G] [SNA] freeze: stuck on render ring

Freeze when typing text in Firefox:

[ 3074.989708] [drm] stuck on render ring
[ 3074.989728] [drm] capturing error event; look for more information in /sys/class/drm/card0/error
[ 3074.994497] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x432000 ctx 0) at 0x432630
[ 3080.989708] [drm] stuck on render ring
[ 3080.989978] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x432000 ctx 0) at 0x432630

After some time (a couple of seconds after moving the mouse and connecting through SSH), some corruption appeared in the center of the screen (window decoration buttons). Some logs are attached, but I forgot to look at /sys/class/drm/card0/error, which was suggested in dmesg.
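
The error state that dmesg points to is held by the kernel and does not survive a reboot, so it is worth copying it to a regular file right after a hang. A minimal Python sketch, assuming the sysfs path from the dmesg message above; the function name `save_error_state` and the destination file name are made up for illustration, and reading the file may require root:

```python
import shutil
from pathlib import Path

# Path taken from the dmesg message; adjust "card0" if the GPU is another card.
ERROR_STATE = Path("/sys/class/drm/card0/error")

def save_error_state(dest, src=ERROR_STATE):
    """Copy the error state from src to dest; return the number of bytes written."""
    src, dest = Path(src), Path(dest)
    # Stream the content instead of trusting the file size reported by sysfs.
    with src.open("rb") as fin, dest.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest.stat().st_size

# Example (hypothetical destination name):
#   save_error_state("i915_error_state.txt")
```

The resulting file is what the developers ask for as an attachment.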

Using SNA acceleration.

System environment:
-- chipset: 865G
-- system architecture: 32-bit
-- xf86-video-intel: 2.21.15
-- xserver: 1.14.4
-- mesa: 9.2.3
-- libdrm: 2.4.49
-- kernel: 3.12.1-1-ARCH
-- Linux distribution: Arch Linux

Comment 1 Götz 2013-11-27 23:04:03 UTC

Created attachment 89923 [details]
Xorg.0.log

Comment 2 Götz 2013-11-27 23:06:23 UTC

Created attachment 89924 [details]
intel_reg_dumper.txt

Comment 3 Chris Wilson 2013-11-27 23:48:22 UTC

There is a bogus command 0x494003a6 at batch offset 0x60c which is due to a misplaced word from 0x644. This has the hallmarks of either a very strange bug in pwrite or a hardware issue.
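
For readers following along with the dmesg output in the description: the "hung inside bo" line carries both the buffer object base and the address where the ring was stuck, and their difference is an offset into the batch. The Python sketch below is only an illustration and assumes the message format quoted in this report; `hang_offset` is a hypothetical helper, not part of any existing tool, and it is not how the analysis above was done.

```python
import re

# Matches the kernel message format quoted in this report (an assumption;
# other kernel versions word the message differently).
HANG_RE = re.compile(
    r"(?P<ring>\w+) ring hung inside bo "
    r"\(0x(?P<bo>[0-9a-f]+) ctx (?P<ctx>\d+)\) at 0x(?P<addr>[0-9a-f]+)"
)

def hang_offset(line):
    """Return (ring, ctx, offset of the hang address within the bo), or None."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    bo = int(m.group("bo"), 16)
    addr = int(m.group("addr"), 16)
    return m.group("ring"), int(m.group("ctx")), addr - bo

line = ("[ 3074.994497] [drm:i915_set_reset_status] *ERROR* render ring "
        "hung inside bo (0x432000 ctx 0) at 0x432630")
ring, ctx, offset = hang_offset(line)
print(ring, ctx, hex(offset))  # prints: render 0 0x630
```

For the first report this gives 0x630, which lies between the batch offsets 0x60c and 0x644 mentioned above.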

Comment 4 Götz 2013-12-03 19:04:41 UTC

If it's a hardware issue, I will wait to see whether another freeze happens, or whether someone else has a similar problem.

Greetings

Comment 5 Chris Wilson 2013-12-03 21:19:47 UTC

Let's wait and see if this reoccurs.

Comment 6 onorua 2013-12-21 16:17:03 UTC

I've run into a similar issue, and it is becoming more and more annoying. Here is the message from dmesg:
=====
[drm] stuck on render ring
[drm] GPU crash dump saved to /sys/class/drm/card0/error
[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x7728000 ctx 5) at 0x7728220
=====
I'll attach the suggested error dump in a couple of minutes. For the past month I have been facing a problem where everything on my screen gets completely stuck. At first I thought it was something related to my window manager, but during a debugging session I found the messages pasted above.
I have updated to the 3.13.0-rc4 kernel and limited the number of GPUs in the system to 1 in the kernel.
This occurs at least once per day.
Please let me know what kind of info is needed and I'll get it for you.

Comment 7 onorua 2013-12-21 16:19:09 UTC

Created attachment 91093 [details]
GPU crash report from /sys/class/drm/card0/error

Comment 8 onorua 2013-12-21 16:20:00 UTC

Created attachment 91094 [details]
full dmesg

Comment 9 Chris Wilson 2013-12-22 09:52:36 UTC

(In reply to comment #6)
> I've run into a similar issue, and it is becoming more and more annoying.
> Here is the message from dmesg:

That's a completely different and known bug in mesa; try upgrading.

Comment 10 Oleksandr Natalenko 2014-01-11 19:09:29 UTC

Faced this crap too. Here is my dmesg:

===
[73578.463903] [drm] stuck on render ring
[73578.463910] [drm] capturing error event; look for more information in /sys/class/drm/card0/error
[73578.468259] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x3d3b9000 ctx 1) at 0x3d3bcb7c
===

lspci | grep VGA:

===
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
===

uname -a:

===
Linux spock 3.12.4-pf #1 SMP PREEMPT Fri Jan 10 01:18:40 EET 2014 x86_64 GNU/Linux
===

Arch Linux, mesa 10.0.1, xf86-video-intel 2.21.15.

Will attach /sys/class/drm/card0/error as well.

Comment 11 Oleksandr Natalenko 2014-01-11 19:10:10 UTC

Created attachment 91871 [details]
/sys/class/drm/card0/error after error

Comment 12 Chris Wilson 2014-01-11 21:03:37 UTC

(In reply to comment #11)
> Created attachment 91871 [details]
> /sys/class/drm/card0/error after error

You have a completely different GPU and a (completely different) known issue in mesa. Please always file a new bug report and let us sort out the duplicate issues.

Comment 13 Rodrigo Vivi 2014-09-24 20:28:04 UTC

Is this still happening on latest drm-intel-nightly?

What happens with i915.enable_rc6=0 ?

Comment 14 Eugene 2014-11-21 14:56:26 UTC

It seems I have the same issue on several machines with similar GPUs. After the system starts, everything is OK. But if I leave a machine and wait until the display turns off, trying to wake it afterwards does nothing; the display stays black. Only switching to another VT and then back to VT7 wakes the display and makes the image appear. The following messages then appear in dmesg:

[  143.476015] [drm] GPU HANG: ecode 2:-1:0x00000000, reason: Command parser error, iir 0x00008040, action: continue
[  143.476015] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  143.476015] i915: render error detected, EIR: 0x00000010
[  143.482472] [drm] GPU HANG: ecode 2:-1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue
[  143.482472] i915: render error detected, EIR: 0x00000010

After that I tried drm-intel-nightly, but it didn't help: the GPU HANG also appeared after the display was turned off. I saved the error dump, dmesg, and Xorg.0.log; please see the attachments.

After that I tried booting with the i915.enable_rc6=0 kernel option. With this option the display doesn't turn off at all, and accordingly the GPU HANG doesn't appear either.

So: yes, it is still happening on the latest drm-intel-nightly, and no, with the i915.enable_rc6=0 kernel option it doesn't happen, because the display doesn't go into standby mode.
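
When testing with a boot option such as i915.enable_rc6=0, it can help to confirm that the running kernel was actually booted with it. A small Python sketch, assuming the option was passed on the kernel command line and is therefore visible in /proc/cmdline; the helper names and the sample command line are invented for illustration:

```python
def i915_options(cmdline):
    """Return the i915.* options found on a kernel command line as a dict."""
    opts = {}
    for token in cmdline.split():
        if token.startswith("i915."):
            # "i915.enable_rc6=0" -> name "enable_rc6", value "0"
            name, _, value = token[len("i915."):].partition("=")
            opts[name] = value
    return opts

def running_i915_options(path="/proc/cmdline"):
    """Parse the command line of the currently running kernel."""
    with open(path) as f:
        return i915_options(f.read())

# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 i915.enable_rc6=0 quiet"
print(i915_options(sample))  # prints: {'enable_rc6': '0'}
```

An empty result from `running_i915_options()` would suggest the option never reached the kernel, for example because the bootloader configuration was not regenerated.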

Comment 15 Eugene 2014-11-21 15:00:44 UTC

Created attachment 109800 [details]
dmesg

dmesg using drm-intel-nightly

Comment 16 Eugene 2014-11-21 15:01:30 UTC

Created attachment 109801 [details]
Xorg.0.log

Xorg.0.log using drm-intel-nightly

Comment 17 Eugene 2014-11-21 15:04:51 UTC

Created attachment 109802 [details]
/sys/class/drm/card0/error

GPU crash dump when using drm-intel-nightly

Comment 18 Eugene 2014-11-21 15:22:52 UTC

A small update. I have just discovered that the display did turn off and again can't be woken from that state. So it seems there is no difference with or without the i915.enable_rc6=0 kernel option.

Comment 19 Chris Wilson 2014-11-21 16:24:06 UTC

(In reply to Eugene from comment #14)
> It seems I have the same issue on several machines with similar GPUs.

It's a completely different bug. Seems like it is complaining about the cursor...

Comment 20 Eugene 2014-11-21 18:18:33 UTC

(In reply to Chris Wilson from comment #19)
> (In reply to Eugene from comment #14)
> > It seems I have the same issue on several machines with similar GPUs.
> 
> It's a completely different bug. Seems like it is complaining about the
> cursor...

What cursor? Do you mean I need to write a new bug report?

Comment 21 Eugene 2014-11-22 17:03:51 UTC

I created a new report here: https://bugs.freedesktop.org/show_bug.cgi?id=86583

Comment 22 Jani Nikula 2016-01-18 13:00:22 UTC

Please try kernel v4.4.

Comment 23 Götz 2016-01-19 23:14:29 UTC

Created attachment 121145 [details]
/sys/class/drm/card0/error kernel 4.4.0

With kernel 4.4.0 and xf86-video-intel 2.99.917+519+g8229390 (both from Arch Linux):

[  469.990009] [drm] stuck on render ring
[  469.995039] [drm] GPU HANG: ecode 2:0:0x476f7fc1, in Xorg [339], reason: Ring hung, action: reset
[  469.995044] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  469.995046] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  469.995048] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  469.995050] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  469.995052] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  469.996128] drm/i915: Resetting chip after gpu hang
[  469.996180] [drm:i915_reset [i915]] *ERROR* Failed to reset chip: -19

Comment 24 Götz 2016-02-08 02:10:15 UTC

Now with kernel 4.4.1 and the latest xf86-video-intel 2.99.917+544+g8b8c9a3, I no longer get GPU hangs after one day of usage. Previously, just opening Konsole triggered a GPU hang.

Comment 25 Götz 2016-07-07 19:38:27 UTC

I haven't seen this GPU hang any more with the latest software versions.
