Bug 82024 - [gm45] GPU hang in dmesg (with dump)
Status: CLOSED WONTFIX
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-08-01 21:03 UTC by Pedro Ribeiro
Modified: 2017-07-24 22:52 UTC

Attachments
/sys/class/drm/card0/error (1.32 MB, text/plain)
2014-08-01 21:03 UTC, Pedro Ribeiro

Description Pedro Ribeiro 2014-08-01 21:03:24 UTC
Created attachment 103832
/sys/class/drm/card0/error

Hi,

I got this message in dmesg:
[ 9665.185046] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 9665.185050] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 9665.185052] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 9665.185054] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 9665.185055] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 9665.186008] i915: render error detected, EIR: 0x00000010
[ 9665.186008] i915:   IPEIR: 0x00000000
[ 9665.186008] i915:   IPEHR: 0x01000000
[ 9665.186008] i915:   INSTDONE_0: 0xfffffffe
[ 9665.186008] i915:   INSTDONE_1: 0xffffffff
[ 9665.186008] i915:   INSTDONE_2: 0x00000000
[ 9665.186008] i915:   INSTDONE_3: 0x00000000
[ 9665.186008] i915:   INSTPS: 0x0001e000
[ 9665.186008] i915:   ACTHD: 0x14c121c8
[ 9665.186008] i915: page table error
[ 9665.186008] i915:   PGTBL_ER: 0x00000001
[ 9665.186008] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
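The register values in a log like the one above can be pulled out mechanically when comparing several hangs. The following is only a minimal sketch: the function name `parse_i915_registers` and the regular expression are illustrative assumptions, not part of any i915 tooling, and the sketch does not try to interpret what the individual bits mean.

```python
import re

# Matches "NAME: 0x...." pairs such as "EIR: 0x00000010" (assumed log layout).
REGISTER_RE = re.compile(r"\b([A-Z][A-Z0-9_]*): (0x[0-9a-fA-F]+)")


def parse_i915_registers(dmesg_text):
    """Return {register_name: int_value} for i915 lines in a dmesg excerpt."""
    registers = {}
    for line in dmesg_text.splitlines():
        if "i915" not in line:
            continue
        for name, value in REGISTER_RE.findall(line):
            # Keep the first value seen for each register name.
            registers.setdefault(name, int(value, 16))
    return registers


# Lines copied from the report above.
sample = """\
[ 9665.186008] i915: render error detected, EIR: 0x00000010
[ 9665.186008] i915:   IPEIR: 0x00000000
[ 9665.186008] i915:   IPEHR: 0x01000000
[ 9665.186008] i915:   ACTHD: 0x14c121c8
[ 9665.186008] i915: page table error
[ 9665.186008] i915:   PGTBL_ER: 0x00000001
[ 9665.186008] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
"""

regs = parse_i915_registers(sample)
for name, value in regs.items():
    print(f"{name:10s} 0x{value:08x}")
```

The "EIR stuck" line is deliberately not matched, since the value there does not directly follow the register name.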

I'm using the latest 3.14.14 with the grsecurity patch.

Please let me know if you need more information.

Regards,
Pedro
Comment 1 Rodrigo Vivi 2014-10-08 21:18:58 UTC
Could you please try to reproduce the error on latest drm-intel-nightly and attach new error state?
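For anyone collecting a fresh error state to attach, a small helper along these lines may be convenient. It is a sketch under assumptions: the function name `save_error_state` and the destination file name are hypothetical, the source path is the one printed in the dmesg message above, and reading it may require root.

```python
from pathlib import Path


def save_error_state(src="/sys/class/drm/card0/error", dest="i915_error_state.txt"):
    """Copy the GPU error state to a regular file and return its size in bytes.

    The whole file is read in one go, because sysfs entries do not always
    report a meaningful size up front.
    """
    data = Path(src).read_bytes()
    Path(dest).write_bytes(data)
    return len(data)


# Example call (commented out because it needs the real sysfs file):
# size = save_error_state()
# print(f"saved {size} bytes")
```

The saved copy can then be attached to the report.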
Comment 2 Pedro Ribeiro 2014-10-18 21:12:42 UTC
Hi Rodrigo,

I've been using 3.17 from drm-intel-nightly for 10 days now and I haven't seen the problem yet.

The kernel I'm using has this as the last commit:

commit 
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Oct 9 09:59:04 2014 +0200

    drm-intel-nightly: 2014y-10m-09d-07h-58m-45s UTC integration manifest


So it seems it was fixed somehow. 

Is there a way I can find out which patch fixed it so that it can be backported to 3.14?

And what do these errors mean, are they anything of concern? Apart from the dmesg messages, I didn't notice any problems when using my laptop - maybe it was overheating a bit, but that's probably unrelated.
Comment 3 Mika Kuoppala 2014-10-20 11:02:45 UTC
(In reply to Pedro Ribeiro from comment #2)
> 
> Is there a way I can find out which patch fixed it so that it can be
> backported to 3.14?
> 

If you can reliably reproduce the issue, the rest is just bisecting:
https://wiki.debian.org/DebianKernel/GitBisect
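Since the goal here is to find the commit that fixed the hang rather than the one that introduced it, the search runs in the opposite direction from the usual case (git bisect supports this through its `--term-old` and `--term-new` options). The underlying idea is a binary search over the commit history, sketched below. The function `first_fixed`, the commit names, and the `fake_test` callback are hypothetical; in practice each test means building and booting a kernel and running the workload long enough to be confident about the result.

```python
def first_fixed(commits, is_fixed):
    """Binary-search an oldest-to-newest commit list for the first fixed commit.

    Assumes commits[0] is broken, commits[-1] is fixed, and every commit
    after the fix stays fixed.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] broken, commits[hi] fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return commits[hi]


# Hypothetical history of 16 commits where the fix landed in "c11".
history = [f"c{i}" for i in range(16)]
tested = []


def fake_test(commit):
    tested.append(commit)
    return int(commit[1:]) >= 11


result = first_fixed(history, fake_test)
print(result, "found after testing", len(tested), "commits")
```

The number of kernels to test grows only logarithmically with the number of commits between the broken and the fixed version, which is why bisecting is practical even across several releases, provided the hang can be reproduced reliably.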
Comment 4 Rodrigo Vivi 2014-10-20 17:06:35 UTC
Hi Pedro,

The error you got was a GPU hang. It can be caused by many things in the stack, so it depends a lot on what workload you had at that time.
Since you didn't notice anything, it was probably a very brief hang with a fast recovery: the GPU came back to life after a reset quickly enough that you couldn't notice. But it is still an error.

So I believe you could just double-check by running the same workload you had before on the latest nightly. And if you don't see it, the only way to find out what fixed it is bisecting, as Mika suggested.

But if you don't reproduce this on -nightly anymore I believe we can close this bug.
Comment 5 Pedro Ribeiro 2014-10-20 20:52:10 UTC
It's hard to say how to trigger it to be honest. I did notice my window manager (Openbox) acting funny a bit at times, but I didn't think it was related (it might be).

Please keep it open for a couple more weeks while I investigate.
Comment 6 Rodrigo Vivi 2015-01-15 00:12:54 UTC
Timeout. Feel free to reopen if you are still able to reproduce this on recent kernels, attaching new logs.