Bug 34018

Summary: [i965gm] *Recovered* GPU lockup if vesafb is left loaded (EIR: 0x00000010 PGTBL_ER: 0x00000100) - *ERROR* EIR stuck: 0x00000010, masking
Product: xorg
Component: Driver/intel
Version: 7.6 (2010.12)
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: RESOLVED FIXED
Severity: major
Priority: medium
Keywords: regression
Reporter: Bryce Harrington <bryce>
Assignee: Chris Wilson <chris>
QA Contact: Xorg Project Team <xorg-team>
CC: amaranth
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description             Flags
XorgLog.txt             none
BootDmesg.txt           none
CurrentDmesg.txt        none
i915_error_state.txt    none
BootDmesg.txt           none
CurrentDmesg.txt        none
XorgLog.txt             none
i915_error_state.txt    none

Description Bryce Harrington 2011-02-07 19:11:44 UTC
Forwarding this bug from Ubuntu reporter Travis Watkins:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/702090

[Problem]
In Ubuntu we're seeing a smattering of GPU lockups during boot that recover.  The user either notices nothing wrong or sees only a black screen or another minor symptom, but the GPU hang is detected by apport and collected as a bug report.

Analysis indicates that the phantom(?) GPU lockups don't occur when vesafb is disabled, which suggests they are related to vesafb being loaded.
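
One way to check for the situation described above is to look for both drivers in the list of loaded modules. The following is a minimal sketch, assuming vesafb is built as a module (if it is built into the kernel it will not show up in /proc/modules). The function names and the sample text are hypothetical and exist only for this illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.

    Each non-empty line starts with the module name, followed by
    whitespace-separated fields that are ignored here.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def both_fb_drivers_loaded(proc_modules_text):
    """True if vesafb and i915 are both present in the module list."""
    return {"vesafb", "i915"} <= loaded_modules(proc_modules_text)


# Hypothetical sample, not taken from the reporter's machine.
SAMPLE = """\
i915 500000 3 - Live 0x0000000000000000
drm_kms_helper 40000 1 i915, Live 0x0000000000000000
vesafb 13000 1 - Live 0x0000000000000000
"""

print(both_fb_drivers_loaded(SAMPLE))  # True
```

On a live system the same function could be fed the contents of /proc/modules, for example `both_fb_drivers_loaded(open("/proc/modules").read())`.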

This bug report was filed against the 2.6.37 kernel but verified against current X bits; I'm forwarding it because it was the first instance of this issue reported within the current development period, and because its reporter verified the workaround of disabling vesafb.  Other reporters have reported very similar bugs with the 2.6.38 kernel (same GPU dump error codes, same symptoms, same GPU).

[Original Description]
I'm assuming this is because my suspend failed, but I also had a black screen on boot until X started, so perhaps it was something Plymouth related.

From GPU dump:
ACTHD: 0x00000000
EIR: 0x00000010
EMR: 0xffffffdd
ESR: 0x00000010
PGTBL_ER: 0x00000100
IPEHR: 0x00000000
IPEIR: 0x00000000
INSTDONE: 0xffe5fafe
INSTDONE1: 0x000fffff
    busy: Projection and LOD
    busy: Bypass FIFO
    busy: Color calculator
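
As a rough illustration of how to read the dump above, the sketch below parses the register lines and lists which bits are set in each value. The helper names `parse_registers` and `set_bits` are made up for this example and are not part of any i915 tool. The only register meaning relied on is the one the kernel log in this report gives, namely that EIR 0x00000010 is reported as a page table error.

```python
import re

# Register lines copied from the GPU dump in this report.
DUMP = """\
ACTHD: 0x00000000
EIR: 0x00000010
EMR: 0xffffffdd
ESR: 0x00000010
PGTBL_ER: 0x00000100
IPEHR: 0x00000000
IPEIR: 0x00000000
    busy: Projection and LOD
"""

# Matches "NAME: 0x<hex>" on a single line; "busy:" lines have no hex value
# and are therefore skipped.
REGISTER_RE = re.compile(r"^[ \t]*(\w+):[ \t]*(0x[0-9a-fA-F]+)[ \t]*$", re.MULTILINE)


def parse_registers(text):
    """Return {register name: integer value} for each register line."""
    return {m.group(1): int(m.group(2), 16) for m in REGISTER_RE.finditer(text)}


def set_bits(value):
    """List the positions of the bits set in a 32-bit value, lowest first."""
    return [bit for bit in range(32) if value & (1 << bit)]


regs = parse_registers(DUMP)
print(set_bits(regs["EIR"]))       # [4]
print(set_bits(regs["PGTBL_ER"]))  # [8]
```

Here EIR and ESR both have only bit 4 set, and PGTBL_ER has only bit 8 set.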

From dmesg:
[   12.615203] [drm] initialized overlay support
[   12.838827] render error detected, EIR: 0x00000010
[   12.838830] page table error
[   12.838831]   PGTBL_ER: 0x00000100
[   12.838834] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
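
For triaging many similar reports, the error registers can be pulled out of dmesg text mechanically. This is only a sketch under the assumption that the messages look like the excerpt above; the function name `scan_dmesg` and the regular expressions are invented for this example and do not come from apport or the kernel.

```python
import re

# Lines copied from the dmesg excerpt in this report.
DMESG = """\
[   12.615203] [drm] initialized overlay support
[   12.838827] render error detected, EIR: 0x00000010
[   12.838830] page table error
[   12.838831]   PGTBL_ER: 0x00000100
[   12.838834] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
"""

# "[ seconds.micros] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# "EIR: 0x..", "EIR stuck: 0x.." or "PGTBL_ER: 0x.."
REG_RE = re.compile(r"\b(EIR|PGTBL_ER)(?: stuck)?:\s*(0x[0-9a-fA-F]+)")


def scan_dmesg(text):
    """Return a list of (timestamp, register name, value) tuples."""
    events = []
    for line in text.splitlines():
        line_match = LINE_RE.match(line)
        if not line_match:
            continue
        timestamp = float(line_match.group(1))
        for reg in REG_RE.finditer(line_match.group(2)):
            events.append((timestamp, reg.group(1), int(reg.group(2), 16)))
    return events


events = scan_dmesg(DMESG)
for timestamp, name, value in events:
    print(f"{timestamp:.6f} {name} = {value:#010x}")

# True when the kernel gave up clearing EIR and masked the error instead.
eir_stuck = "EIR stuck" in DMESG
```

Reports whose extracted EIR and PGTBL_ER values match the ones here (0x00000010 and 0x00000100) are candidates for being the same issue, though matching values alone do not prove a duplicate.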


ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.37-12-generic root=UUID=3fa2b187-0edf-45a7-add7-17bf24470864 ro vt.handoff=7 quiet splash
ProcKernelCmdLine_: BOOT_IMAGE=/boot/vmlinuz-2.6.37-12-generic root=UUID=3fa2b187-0edf-45a7-add7-17bf24470864 ro vt.handoff=7 quiet splash
RelatedPackageVersions:
 xserver-xorg 1:7.5+6ubuntu7
 libdrm2 2.4.22-2ubuntu1
 xserver-xorg-video-intel 2:2.13.901-2ubuntu2
dmi.bios.date: 02/09/08
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MB41.88Z.00C1.B00.0802091535
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Mac-F22788A9
dmi.board.vendor: Apple Inc.
dmi.board.version: PVT
dmi.chassis.asset.tag: Asset Tag#
dmi.chassis.type: 2
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-F22788A9
dmi.product.name: MacBook4,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.
version.libdrm2: libdrm2 2.4.22-2ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx 7.9+repack-1ubuntu3
version.xserver-xorg: xserver-xorg 1:7.5+6ubuntu7
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.13.901-2ubuntu2
Comment 1 Bryce Harrington 2011-02-07 19:12:57 UTC
Created attachment 43078 [details]
XorgLog.txt
Comment 2 Bryce Harrington 2011-02-07 19:13:49 UTC
Created attachment 43079 [details]
BootDmesg.txt
Comment 3 Bryce Harrington 2011-02-07 19:14:16 UTC
Created attachment 43080 [details]
CurrentDmesg.txt
Comment 4 Bryce Harrington 2011-02-07 19:17:19 UTC
Created attachment 43081 [details]
i915_error_state.txt

The user ruled out the aforementioned suspicion of a suspend issue by reproducing the lockup after a fresh boot.

Here is the GPU dump:  https://bugs.launchpad.net/xserver-xorg-video-intel/+bug/702090/+attachment/1792167/+files/IntelGpuDump.txt

Note that the gpu dump and error state files were collected from an older X stack than the dmesg and Xorg.0.log.
Comment 5 Bryce Harrington 2011-02-07 19:22:17 UTC
The user still sees the bug with the .38 kernel, 2.14.0 -intel, and current X stack.

A set of files follows from the same user, same hardware, and same issue, but collected with this more current set of software versions.
Comment 6 Bryce Harrington 2011-02-07 19:23:39 UTC
Created attachment 43082 [details]
BootDmesg.txt
Comment 7 Bryce Harrington 2011-02-07 19:24:06 UTC
Created attachment 43083 [details]
CurrentDmesg.txt
Comment 8 Bryce Harrington 2011-02-07 19:24:44 UTC
Created attachment 43084 [details]
XorgLog.txt
Comment 10 Bryce Harrington 2011-02-07 19:31:28 UTC
Here are some other bugs reported to Ubuntu which I suspect to be dupes of this:

https://bugs.launchpad.net/bugs/713794
https://bugs.launchpad.net/bugs/627011
https://bugs.launchpad.net/bugs/714805
https://bugs.launchpad.net/bugs/711275
(and half a dozen others linked to #711275)
Comment 11 Chris Wilson 2011-02-08 01:46:30 UTC
Don't use two conflicting drivers for the same hardware.
Comment 12 Bryce Harrington 2011-03-18 19:18:02 UTC
The underlying problem of conflicts with vesafb may or may not still be worth further investigation, but the surface problem of GPU lockups, which seems specific to Ubuntu, has been resolved in Ubuntu.
