Bug 59593 - [ilk] GPU hung / Failed to reset chip
Status: RESOLVED FIXED
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86 (IA32) Linux (All)
Importance: medium major
Assigned To: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Depends on:
Blocks:
Reported: 2013-01-19 18:50 UTC by Xorlogosh
Modified: 2013-04-02 08:35 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg part and i915_error_state output (1.48 MB, text/plain)
2013-01-19 18:50 UTC, Xorlogosh
dmesg with drm.debug=0xe (136.91 KB, text/plain)
2013-03-24 21:59 UTC, Martin Weinelt
/debug/dri/0/i915_error_state (1.48 MB, text/plain)
2013-03-24 22:00 UTC, Martin Weinelt
dmesg with drm.debug=0xe and the error happening (198.59 KB, text/plain)
2013-03-24 22:04 UTC, Martin Weinelt
no flags Details

Description Xorlogosh 2013-01-19 18:50:11 UTC
Created attachment 73294
dmesg part and i915_error_state output

Graphics:  Card: Intel Core Processor Integrated Graphics Controller bus-ID: 00:02.0 
           X.Org: 1.13.0 driver: intel Resolution: 1366x768@60.0hz 
           GLX Renderer: Mesa DRI Intel Ironlake Mobile x86/MMX/SSE2 GLX Version: 2.1 Mesa 9.0 Direct Rendering: Yes

Kernel~3.8.0-994-generic i686

CPU~Dual core Intel Core i5 CPU U 520 (-HT-MCP-)

X freezes / crashes / sometimes System reboot.
Not possible to restart X.
Comment 1 Daniel Vetter 2013-01-19 18:58:46 UTC
Just to check: Was that error_state on latest drm-intel-nightly?
Comment 2 Xorlogosh 2013-01-19 19:06:00 UTC
Yes, indeed, from the nightly build at
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/2013-01-19-raring/

Comment 3 Daniel Vetter 2013-01-19 19:07:52 UTC
For reference:

commit 10a2b7e3a2d987283189a0d0f67ae133d922f5d4
Merge: 39e0371 7b4cf99
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Jan 18 22:18:16 2013 +0100

    Merge remote-tracking branch 'drm-upstream/drm-fixes' into drm-intel-nightly
Comment 4 Daniel Vetter 2013-01-19 19:17:42 UTC
Seems to have died at the end of a batch in a tri fan sequential prim op. Can you test what happens with the latest Mesa from xorg-edgers? Also, what kind of programs do you have running when the machine dies (desktop environment, compositor, any OpenGL apps, ...)?
Comment 5 Daniel Vetter 2013-01-19 19:20:06 UTC
Also, can you please boot with drm.debug=0xe added to your kernel cmdline and then attach the complete dmesg? Just for reference so we know about your hw ...
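
For readers unfamiliar with the `drm.debug` module parameter: it is a bitmask of debug categories. The sketch below decodes `0xe` under the assumption that the category bits match the `DRM_UT_*` definitions in the kernel's `drmP.h` from around the 3.8 series; the helper `describe_drm_debug` is hypothetical and exists only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed to match the DRM_UT_* values in drmP.h of that kernel era. */
#define DRM_UT_CORE   0x01u
#define DRM_UT_DRIVER 0x02u
#define DRM_UT_KMS    0x04u
#define DRM_UT_PRIME  0x08u

/*
 * Hypothetical helper: writes a comma-separated list of the categories
 * enabled in `mask` into `buf` and returns how many were written.
 */
static int describe_drm_debug(unsigned int mask, char *buf, size_t len)
{
    static const struct { unsigned int bit; const char *name; } cats[] = {
        { DRM_UT_CORE,   "core"   },
        { DRM_UT_DRIVER, "driver" },
        { DRM_UT_KMS,    "kms"    },
        { DRM_UT_PRIME,  "prime"  },
    };
    size_t used = 0;
    int count = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(cats) / sizeof(cats[0]); i++) {
        if (!(mask & cats[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? "," : "", cats[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0'; /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Under those assumptions, `0xe` enables the driver, KMS and PRIME categories and leaves the very verbose core category off.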
Comment 6 Martin Weinelt 2013-03-24 11:42:29 UTC
Experiencing a similar issue on Arch Linux with

* xorg-server 1.14.0-2,
* intel-dri 9.1.1-1
* libva-intel-driver 1.0.19-1
* xf86-video-intel 2.21.5-1
* kernel 3.8.4-1
* gnome-shell 3.6.3.1-3

When I use Firefox and mplayer2 (with the xv backend) together, gnome-shell crashes and tells me to log out because it is unable to restart. I can log back into GNOME, but the graphics are distorted.

dmesg shows:
> [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
> [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
> [drm:i915_reset] *ERROR* Failed to reset chip.

I will attach i915_error_state and a dmesg captured with drm.debug=0xe when it happens again.
Comment 7 Martin Weinelt 2013-03-24 21:59:09 UTC
Created attachment 76979
dmesg with drm.debug=0xe
Comment 8 Martin Weinelt 2013-03-24 22:00:05 UTC
Created attachment 76980
/debug/dri/0/i915_error_state
Comment 9 Martin Weinelt 2013-03-24 22:04:09 UTC
Created attachment 76981
dmesg with drm.debug=0xe and the error happening
Comment 10 Chris Wilson 2013-03-25 09:25:15 UTC
Death inside a mesa batchbuffer. Have you tried mesa-8.0.y?
Comment 11 Martin Weinelt 2013-03-25 09:45:03 UTC
No, and I'm not really eager to downgrade.

% LANG=C sudo pacman -U mesa-8.0.4-3-86_64.pkg.tar.xz
loading packages...
warning: downgrading package mesa (9.1.1-1 => 8.0.4-3)
resolving dependencies...
looking for inter-conflicts...
error: failed to prepare transaction (could not satisfy dependencies)
:: cairo: requires mesa>=9.1
:: libva: requires libegl
:: mesa-libgl: requires mesa=9.1.1
Comment 12 Martin Weinelt 2013-03-25 09:49:01 UTC
I previously used this laptop with Ubuntu 12.04, though, and everything was fine. It ships Mesa 8.0.4-0ubuntu0.2.
Comment 13 Jesse Barnes 2013-03-28 19:46:14 UTC
I wonder if this patch might help with the reset:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0cfc778..1c53438 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -744,6 +744,7 @@ static int ironlake_do_reset(struct drm_device *dev)
 	int ret;
 
 	gdrst = I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR);
+	gdrst &= ~GRDOM_MASK;
 	I915_WRITE(MCHBAR_MIRROR_BASE + ILK_GDSR,
 		   gdrst | GRDOM_RENDER | GRDOM_RESET_ENABLE);
 	ret = wait_for(I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR) & 0x1, 500);
@@ -752,6 +753,7 @@ static int ironlake_do_reset(struct drm_device *dev)
 
 	/* We can't reset render&media without also resetting display ... */
 	gdrst = I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR);
+	gdrst &= ~GRDOM_MASK;
 	I915_WRITE(MCHBAR_MIRROR_BASE + ILK_GDSR,
 		   gdrst | GRDOM_MEDIA | GRDOM_RESET_ENABLE);
 	return wait_for(I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR) & 0x1, 500);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5e91fbb..95ad87c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -91,6 +91,7 @@
 #define  GRDOM_FULL	(0<<2)
 #define  GRDOM_RENDER	(1<<2)
 #define  GRDOM_MEDIA	(3<<2)
+#define  GRDOM_MASK	(3<<2)
 #define  GRDOM_RESET_ENABLE (1<<0)
 
 #define GEN6_MBCUNIT_SNPCR	0x900c /* for LLC config */
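
The effect of the added `gdrst &= ~GRDOM_MASK;` lines can be illustrated outside the kernel. This is a minimal sketch, assuming the register is modeled as a plain `uint32_t` and that a stale domain value may still sit in bits 3:2 when the reset is requested (the thread does not confirm that this is what the hardware does). Only the `GRDOM_*` values come from the patch; the helper names `request_reset_unmasked` and `request_reset_masked` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field values taken from the i915_reg.h hunk in the patch. */
#define GRDOM_FULL         (0u << 2)
#define GRDOM_RENDER       (1u << 2)
#define GRDOM_MEDIA        (3u << 2)
#define GRDOM_MASK         (3u << 2)
#define GRDOM_RESET_ENABLE (1u << 0)

/*
 * Hypothetical model of the pre-patch write: the requested domain is
 * ORed on top of whatever the register already contained.
 */
static uint32_t request_reset_unmasked(uint32_t gdrst, uint32_t domain)
{
    assert((domain & ~GRDOM_MASK) == 0); /* domain must fit in bits 3:2 */
    return gdrst | domain | GRDOM_RESET_ENABLE;
}

/*
 * Hypothetical model of the post-patch write: the domain field is
 * cleared first, so only the requested domain ends up in bits 3:2.
 * Bits outside the field are preserved.
 */
static uint32_t request_reset_masked(uint32_t gdrst, uint32_t domain)
{
    assert((domain & ~GRDOM_MASK) == 0);
    gdrst &= ~GRDOM_MASK;
    return gdrst | domain | GRDOM_RESET_ENABLE;
}
```

With a stale `GRDOM_MEDIA` value in the field, `request_reset_unmasked(GRDOM_MEDIA, GRDOM_RENDER)` still selects the media domain, because `(3<<2) | (1<<2)` is `(3<<2)`. The masked variant yields `GRDOM_RENDER | GRDOM_RESET_ENABLE`, which is the value the render reset is meant to write.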
Comment 14 Daniel Vetter 2013-04-02 08:35:27 UTC
I'm an optimist and hope that Jesse's patch indeed fixes this:

commit 8a5c2ae753c588bcb2a4e38d1c6a39865dbf1ff3
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Mar 28 13:57:19 2013 -0700

    drm/i915: fix ILK GPU reset for render