Bug 59593 - [ilk] GPU hung / Failed to reset chip
Summary: [ilk] GPU hung / Failed to reset chip
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86 (IA32) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-19 18:50 UTC by Xorlogosh
Modified: 2017-07-24 22:59 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg part and i915_error_state output (1.48 MB, text/plain)
2013-01-19 18:50 UTC, Xorlogosh
dmesg with drm.debug=0xe (136.91 KB, text/plain)
2013-03-24 21:59 UTC, Martin Weinelt
/debug/dri/0/i915_error_state (1.48 MB, text/plain)
2013-03-24 22:00 UTC, Martin Weinelt
dmesg with drm.debug=0xe and the error happening (198.59 KB, text/plain)
2013-03-24 22:04 UTC, Martin Weinelt

Description Xorlogosh 2013-01-19 18:50:11 UTC
Created attachment 73294 [details]
dmesg part and i915_error_state output

Graphics:  Card: Intel Core Processor Integrated Graphics Controller bus-ID: 00:02.0 
           X.Org: 1.13.0 driver: intel Resolution: 1366x768@60.0hz 
           GLX Renderer: Mesa DRI Intel Ironlake Mobile x86/MMX/SSE2 GLX Version: 2.1 Mesa 9.0 Direct Rendering: Yes

Kernel~3.8.0-994-generic i686

CPU~Dual core Intel Core i5 CPU U 520 (-HT-MCP-)

X freezes / crashes / sometimes System reboot.
Not possible to restart X.
Comment 1 Daniel Vetter 2013-01-19 18:58:46 UTC
Just to check: Was that error_state on latest drm-intel-nightly?
Comment 2 Xorlogosh 2013-01-19 19:06:00 UTC
yes, indeed from the nightly build of
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/2013-01-19-raring/

Comment 3 Daniel Vetter 2013-01-19 19:07:52 UTC
For reference:

commit 10a2b7e3a2d987283189a0d0f67ae133d922f5d4
Merge: 39e0371 7b4cf99
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Jan 18 22:18:16 2013 +0100

    Merge remote-tracking branch 'drm-upstream/drm-fixes' into drm-intel-nightly
Comment 4 Daniel Vetter 2013-01-19 19:17:42 UTC
Seems to have died at the end of a batch in a tri fan sequential prim op. Can you test what happens with the latest mesa from xorg-edgers? Also, what kind of programs do you have running when the machine dies (desktop environment, compositor, any OpenGL apps, ...)?
Comment 5 Daniel Vetter 2013-01-19 19:20:06 UTC
Also, can you please boot with drm.debug=0xe added to your kernel cmdline and then attach the complete dmesg? Just for reference so we know about your hw ...
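
For readers unsure what drm.debug=0xe turns on, here is a minimal sketch that decodes the mask. It assumes the DRM_UT_* bit values from the kernel's drmP.h of that era (core 0x01, driver 0x02, kms 0x04, prime 0x08). The struct and the helper drm_debug_describe() are hypothetical names used only for this illustration and are not part of the kernel.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed to match include/drm/drmP.h in kernels around 3.8. */
#define DRM_UT_CORE   0x01
#define DRM_UT_DRIVER 0x02
#define DRM_UT_KMS    0x04
#define DRM_UT_PRIME  0x08

/* Hypothetical lookup table, for illustration only. */
struct drm_debug_bit {
	unsigned int bit;
	const char *name;
};

static const struct drm_debug_bit drm_debug_bits[] = {
	{ DRM_UT_CORE,   "core"   },
	{ DRM_UT_DRIVER, "driver" },
	{ DRM_UT_KMS,    "kms"    },
	{ DRM_UT_PRIME,  "prime"  },
};

/*
 * Write the space-separated names of the categories enabled in 'mask'
 * into 'buf' and return how many were written. Stops early if the
 * buffer is too small.
 */
static int drm_debug_describe(unsigned int mask, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(drm_debug_bits) / sizeof(drm_debug_bits[0]); i++) {
		if (!(mask & drm_debug_bits[i].bit))
			continue;
		int n = snprintf(buf + used, len - used, "%s%s",
				 count ? " " : "", drm_debug_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* buffer full, stop here */
		used += (size_t)n;
		count++;
	}
	return count;
}
```

Under those assumptions, calling drm_debug_describe(0xe, buf, sizeof(buf)) reports the driver, kms and prime categories and leaves the more verbose core messages off.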
Comment 6 Martin Weinelt 2013-03-24 11:42:29 UTC
Experiencing a similar issue on Arch Linux with

* xorg-server 1.14.0-2,
* intel-dri 9.1.1-1
* libva-intel-driver 1.0.19-1
* xf86-video-intel 2.21.5-1
* kernel 3.8.4-1
* gnome-shell 3.6.3.1-3

When using a combination of firefox / mplayer2 (with xv backend), gnome-shell crashes and tells me to log out because it is unable to restart. I can log back into gnome, but graphics are distorted.

dmesg shows:
> [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
> [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
> [drm:i915_reset] *ERROR* Failed to reset chip.

Will attach i915_error_state and drm.debug=0xe-dmesg when it happens again.
Comment 7 Martin Weinelt 2013-03-24 21:59:09 UTC
Created attachment 76979 [details]
dmesg with drm.debug=0xe
Comment 8 Martin Weinelt 2013-03-24 22:00:05 UTC
Created attachment 76980 [details]
/debug/dri/0/i915_error_state
Comment 9 Martin Weinelt 2013-03-24 22:04:09 UTC
Created attachment 76981 [details]
dmesg with drm.debug=0xe and the error happening
Comment 10 Chris Wilson 2013-03-25 09:25:15 UTC
Death inside a mesa batchbuffer. Have you tried mesa-8.0.y?
Comment 11 Martin Weinelt 2013-03-25 09:45:03 UTC
No, and I'm not really eager to downgrade.

% LANG=C sudo pacman -U mesa-8.0.4-3-86_64.pkg.tar.xz                                                      :(
loading packages...
warning: downgrading package mesa (9.1.1-1 => 8.0.4-3)
resolving dependencies...
looking for inter-conflicts...
error: failed to prepare transaction (could not satisfy dependencies)
:: cairo: requires mesa>=9.1
:: libva: requires libegl
:: mesa-libgl: requires mesa=9.1.1
Comment 12 Martin Weinelt 2013-03-25 09:49:01 UTC
I used this laptop with Ubuntu 12.04 though and everything was fine. They ship 8.0.4-0ubuntu0.2 there.
Comment 13 Jesse Barnes 2013-03-28 19:46:14 UTC
I wonder if this patch might help with the reset:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0cfc778..1c53438 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -744,6 +744,7 @@ static int ironlake_do_reset(struct drm_device *dev)
 	int ret;
 
 	gdrst = I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR);
+	gdrst &= ~GRDOM_MASK;
 	I915_WRITE(MCHBAR_MIRROR_BASE + ILK_GDSR,
 		   gdrst | GRDOM_RENDER | GRDOM_RESET_ENABLE);
 	ret = wait_for(I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR) & 0x1, 500);
@@ -752,6 +753,7 @@ static int ironlake_do_reset(struct drm_device *dev)
 
 	/* We can't reset render&media without also resetting display ... */
 	gdrst = I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR);
+	gdrst &= ~GRDOM_MASK;
 	I915_WRITE(MCHBAR_MIRROR_BASE + ILK_GDSR,
 		   gdrst | GRDOM_MEDIA | GRDOM_RESET_ENABLE);
 	return wait_for(I915_READ(MCHBAR_MIRROR_BASE + ILK_GDSR) & 0x1, 500);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5e91fbb..95ad87c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -91,6 +91,7 @@
 #define  GRDOM_FULL	(0<<2)
 #define  GRDOM_RENDER	(1<<2)
 #define  GRDOM_MEDIA	(3<<2)
+#define  GRDOM_MASK	(3<<2)
 #define  GRDOM_RESET_ENABLE (1<<0)
 
 #define GEN6_MBCUNIT_SNPCR	0x900c /* for LLC config */
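
To illustrate what the added `gdrst &= ~GRDOM_MASK` lines change, here is a minimal user-space sketch using plain integers instead of register I/O. The bit definitions are copied from the i915_reg.h hunk above. The helper names are hypothetical, and the premise that the register can read back with domain bits left over from an earlier reset request is an assumption, not something confirmed in this report.

```c
#include <stdint.h>

/* Bit layout copied from the i915_reg.h hunk above. */
#define GRDOM_FULL         (0u << 2)
#define GRDOM_RENDER       (1u << 2)
#define GRDOM_MEDIA        (3u << 2)
#define GRDOM_MASK         (3u << 2)
#define GRDOM_RESET_ENABLE (1u << 0)

/* Pre-patch behaviour: OR the request into whatever was read back. */
static uint32_t gdrst_request_unmasked(uint32_t readback, uint32_t domain)
{
	return readback | domain | GRDOM_RESET_ENABLE;
}

/* Patched behaviour: clear the domain field before OR-ing the request. */
static uint32_t gdrst_request_masked(uint32_t readback, uint32_t domain)
{
	readback &= ~GRDOM_MASK;
	return readback | domain | GRDOM_RESET_ENABLE;
}

/* Extract the reset-domain field from a register value. */
static uint32_t gdrst_domain(uint32_t value)
{
	return value & GRDOM_MASK;
}
```

If the readback still holds GRDOM_MEDIA, then gdrst_domain(gdrst_request_unmasked(GRDOM_MEDIA, GRDOM_RENDER)) evaluates to GRDOM_MEDIA, so the render domain is never actually selected. The masked variant yields GRDOM_RENDER as intended.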
Comment 14 Daniel Vetter 2013-04-02 08:35:27 UTC
I'm an optimist and hope that Jesse's patch indeed fixes this:

commit 8a5c2ae753c588bcb2a4e38d1c6a39865dbf1ff3
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Mar 28 13:57:19 2013 -0700

    drm/i915: fix ILK GPU reset for render

