Bug 80639 - GPU HANG: ecode -1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86 (IA32) other
Importance: medium major
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-06-28 16:56 UTC by Ralph Plawetzki
Modified: 2017-07-24 22:53 UTC
CC List: 2 users

Attachments
Log file with errors (52.86 KB, text/plain)
2014-06-28 16:56 UTC, Ralph Plawetzki
/sys/class/drm/card0/error (758.81 KB, text/plain)
2014-06-29 05:34 UTC, Ralph Plawetzki
screen before loading X (332.75 KB, image/jpeg)
2014-07-03 14:16 UTC, djmatic8

Description Ralph Plawetzki 2014-06-28 16:56:33 UTC
Created attachment 101930
Log file with errors

I am running Arch Linux i686 on a Dell Inspiron 1525 laptop with an Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 0c)

As you might know, Arch is a rolling distro with the latest software. After a recent upgrade of the installed packages, my display stays dark when I boot the system.

The log file shows the following at Jun 28 17:07:43:
GPU HANG: ecode -1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue

and

i915: render error detected, EIR: 0x00000010
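
The two log lines above carry hexadecimal register values. As a rough illustration (not part of the original report), the following Python sketch splits the hang line into its fields and lists which bit positions are set in the iir and EIR values. The names HANG_RE, set_bits and parse_hang_line are hypothetical, and the pattern is inferred only from the single line quoted here, so other kernel versions may format it differently. Mapping a bit position to a named condition would need the i915 register definitions for this hardware generation.

```python
import re

# Pattern for the hang line quoted in this report. The field layout is an
# assumption based on that one line and may not hold for other kernels.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>-?\d+:0x[0-9a-fA-F]+), "
    r"reason: (?P<reason>[^,]+), "
    r"iir (?P<iir>0x[0-9a-fA-F]+), "
    r"action: (?P<action>\w+)"
)


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def parse_hang_line(line):
    """Extract the fields of a GPU HANG line, or None if it does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["iir_bits"] = set_bits(int(fields["iir"], 16))
    return fields


line = ("GPU HANG: ecode -1:0x00000000, reason: Command parser error, "
        "iir 0x00008000, action: continue")
info = parse_hang_line(line)
print(info["reason"], info["iir_bits"])  # Command parser error [15]
print(set_bits(0x00000010))              # EIR value from the log: [4]
```

Only one bit is set in each value here (bit 15 in iir, bit 4 in EIR), which is why a bit list is easier to read than the raw hexadecimal.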

One of these packages causes this: intel-dri-10.2.2-1  linux-3.15.2-1  mesa-10.2.2-1  mesa-libgl-10.2.2-1  xf86-video-intel-2.99.912-1

Unfortunately I cannot provide /sys/class/drm/card0/error, because my display is not working at all. To make the laptop boot again, I have to boot a rescue system and downgrade the installed packages with:

pacman -U intel-dri-10.1.4-1-i686.pkg.tar.xz xf86-video-intel-2.99.911-2-i686.pkg.tar.xz mesa-10.1.4-1-i686.pkg.tar.xz mesa-libgl-10.1.4-1-i686.pkg.tar.xz linux-3.14.5-1-i686.pkg.tar.xz

Please let me know if I can provide more information or do any testing.

Thank you very much.
Comment 1 Chris Wilson 2014-06-28 17:15:33 UTC
Please attach /sys/class/drm/card0/error
Comment 2 Ralph Plawetzki 2014-06-28 18:18:59 UTC
I did some more investigation and upgraded to the latest packages again (intel-dri-10.2.2-1  linux-3.15.2-1  mesa-10.2.2-1  mesa-libgl-10.2.2-1 xf86-video-intel-2.99.912-1) which reproduces the issue and makes the display stay dark.

Then I only downgraded the kernel (3.15.2-1 => 3.14.5-1) and voilà: the system boots fine.

So the actual problem seems to be in the kernel, not in one of these: intel-dri-10.2.2-1 mesa-10.2.2-1  mesa-libgl-10.2.2-1 xf86-video-intel-2.99.912-1
Comment 3 Ralph Plawetzki 2014-06-28 18:19:15 UTC
(In reply to comment #1)
Thanks for your quick reply.
I could upgrade to the newer kernel that reproduces the issue and connect to the laptop via ssh in order to get /sys/class/drm/card0/error.

I will do that tomorrow, as I do not have the time to do that now.
Comment 4 Ralph Plawetzki 2014-06-29 05:34:18 UTC
Created attachment 101958
/sys/class/drm/card0/error
Comment 5 Ralph Plawetzki 2014-06-29 05:35:53 UTC
(In reply to comment #1)
Here it is.
Connecting to the laptop via ssh worked like a charm.
Comment 6 Chris Wilson 2014-07-01 13:39:36 UTC
This has the hallmarks of invalid stolen memory. Can you try:

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 78fa532..2bce530 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -330,6 +330,8 @@ int i915_gem_init_stolen(struct drm_device *dev)
        struct drm_i915_private *dev_priv = dev->dev_private;
        struct resource *r;
 
+       return 0;
+
 #ifdef CONFIG_INTEL_IOMMU
        if (intel_iommu_gfx_mapped && INTEL_INFO(dev)->gen < 8) {
                DRM_INFO("DMAR active, disabling use of stolen memory\n");
Comment 7 Ralph Plawetzki 2014-07-02 09:01:39 UTC
Sure.
I compiled an Arch kernel yesterday but made a mistake, so the patch was not applied to the sources. I will compile again today.

The kernel source was 3.15.2. To apply your patch, the package build description file (PKGBUILD) needs to be modified.

This was the first time I compiled a kernel for Arch Linux. I got PKGBUILD to read the patch file and accept it as valid, but I did not notice that the PKGBUILD script also had to be modified in a second place for the patch to actually be applied to the sources.
Comment 8 Ralph Plawetzki 2014-07-02 20:16:51 UTC
Ok, your patch worked.

I could boot the laptop and the display is working.
The log did not contain errors any more.

$ uname -r
3.15.2-1-custom
Comment 9 djmatic8 2014-07-03 14:16:48 UTC
Created attachment 102200
screen before loading X

After Xorg loads, the screen goes blank and this message appears in syslog:
"kernel: [   12.428567] [drm] GPU HANG: ecode -1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue" ...
Comment 10 djmatic8 2014-07-03 14:24:17 UTC
I found the commit that introduced this behaviour: 1ad292b51e358c8b6e9b8966889c21f1fe705489 (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git). In my case the display was fixed after applying this:

--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11038,7 +11038,6 @@ static void intel_init_display(struct drm_device *dev)
                        i9xx_update_primary_plane;
        } else {
                dev_priv->display.get_pipe_config = i9xx_get_pipe_config;
-               dev_priv->display.get_plane_config = i9xx_get_plane_config;
                dev_priv->display.crtc_mode_set = i9xx_crtc_mode_set;
                dev_priv->display.crtc_enable = i9xx_crtc_enable;
                dev_priv->display.crtc_disable = i9xx_crtc_disable;
Comment 11 Ralph Plawetzki 2014-07-27 13:29:12 UTC
A patch was committed to the kernel sources 18 days ago and should be included in v3.16.

https://github.com/torvalds/linux/commit/f1e1c2129b79cfdaf07bca37c5a10569fe021abe

I'll test the new kernel when it is available as an Arch package.
Comment 12 Ralph Plawetzki 2014-07-27 14:10:14 UTC
(In reply to comment #11)
Actually, the patch is already contained in v3.15.6.
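
Since the fix is reported to be in v3.15.6, one way to check whether a given machine should already have it is to compare the output of uname -r against that version. The following Python sketch is only an illustration: the function names are hypothetical, the default threshold is taken from the comment above, and the comparison only answers "is this release at or after v3.15.6". It does not say whether an older series was affected, and in this report 3.14.5 was not.

```python
def parse_kernel_version(release):
    """Turn a release string such as '3.15.7-1-ARCH' into a tuple of ints."""
    numeric = release.split("-", 1)[0]  # drop the packaging suffix
    return tuple(int(part) for part in numeric.split("."))


def at_or_after(release, first_fixed="3.15.6"):
    """Return True if release is numerically at or after first_fixed."""
    return parse_kernel_version(release) >= parse_kernel_version(first_fixed)


# Release strings of the form quoted in this report.
for release in ("3.15.2-1-custom", "3.15.7-1-ARCH", "3.16"):
    print(release, at_or_after(release))
```

Tuples of integers are compared instead of raw strings so that a release such as 3.15.10 sorts after 3.15.6, which a plain string comparison would get wrong.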
Comment 13 Ralph Plawetzki 2014-07-30 07:38:51 UTC
Today I installed kernel 3.15.7-1-ARCH as an Arch package and rebooted.

The laptop started normally; no GPU crash occurred.

The log file did not contain errors any more.