Bug 44886 - [gm45] GPU locks up for a few seconds then recovers when using xscreensaver on random: "Invalid GTT entry during fetch for host"
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Daniel Vetter
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-18 00:59 UTC by Bryce Harrington
Modified: 2017-07-24 23:03 UTC
CC List: 6 users

See Also:
i915 platform:
i915 features:


Attachments
BootDmesg.txt (52.83 KB, text/plain)
2012-01-18 01:01 UTC, Bryce Harrington
CurrentDmesg.txt (123.28 KB, text/plain)
2012-01-18 01:01 UTC, Bryce Harrington
i915_error_state.txt (1.48 MB, text/plain)
2012-01-18 01:01 UTC, Bryce Harrington
XorgLog.txt (36.62 KB, text/plain)
2012-01-18 01:01 UTC, Bryce Harrington

Description Bryce Harrington 2012-01-18 00:59:29 UTC
Forwarding this bug from Ubuntu reporter Beto1917:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/899303

[Problem]
With xscreensaver set to random, the display occasionally freezes for a couple of seconds, recovers, and then Apport opens a window offering to report the lockup. The i915_error_state file and dmesg both indicate that a GPU lockup occurred and that the system recovered from it.

[Original Description]
crashed after xscreensaver started

ProblemType: Crash
DistroRelease: Ubuntu 12.04
Package: xserver-xorg-video-intel 2:2.15.901-1ubuntu3
ProcVersionSignature: Ubuntu 3.2.0-2.6-generic 3.2.0-rc3
Uname: Linux 3.2.0-2-generic i686
.tmp.unity.support.test.0:
 
ApportVersion: 1.90-0ubuntu1
Architecture: i386
Chipset: gm45
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
Date: Fri Dec  2 14:37:51 2011
DistUpgraded: Log time: 2011-07-07 16:41:03.123900
DistroCodename: precise
DistroVariant: ubuntu
DuplicateSignature: [gm45] GPU lockup  EIR: 0x00000010 PGTBL_ER: 0x00000001 render.IPEHR: 0x60020100 Ubuntu 12.04
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
ExtraDebuggingInterest: Yes, whatever it takes to get this fixed in Ubuntu
GraphicsCard:
 Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Acer Incorporated [ALI] Device [1025:029b]
InstallationMedia: Ubuntu-Netbook 10.04 "Lucid Lynx" - Release i386 (20100429.4)
InterpreterPath: /usr/bin/python2.7
MachineType: Acer Aspire 1410
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:
 
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-2-generic root=UUID=34529bb0-121c-4f52-b1b7-6aca0f1c8e88 ro splash vga=789 quiet splash vt.handoff=7
RelatedPackageVersions:
 xserver-xorg             1:7.6+7ubuntu7
 libdrm2                  2.4.27-1ubuntu1
 xserver-xorg-video-intel 2:2.15.901-1ubuntu3
SourcePackage: xserver-xorg-video-intel
Title: [gm45] False GPU lockup  EIR: 0x00000010 PGTBL_ER: 0x00000001 render.IPEHR: 0x60020100
UpgradeStatus: Upgraded to precise on 2011-12-01 (1 days ago)
UserGroups:
 
dmi.bios.date: 09/08/2009
dmi.bios.vendor: INSYDE
dmi.bios.version: v0.3117
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Base Board Product Name
dmi.board.vendor: Acer
dmi.board.version: Base Board Version
dmi.chassis.type: 1
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnINSYDE:bvrv0.3117:bd09/08/2009:svnAcer:pnAspire1410:pvrv0.3117:rvnAcer:rnBaseBoardProductName:rvrBaseBoardVersion:cvnChassisManufacturer:ct1:cvrChassisVersion:
dmi.product.name: Aspire 1410
dmi.product.version: v0.3117
dmi.sys.vendor: Acer
version.compiz: compiz 1:0.9.6+bzr20110929-0ubuntu7
version.libdrm2: libdrm2 2.4.27-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 7.11-0ubuntu4
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 7.11-0ubuntu4
version.xserver-xorg-core: xserver-xorg-core 2:1.10.4-1ubuntu5
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.6.0-1ubuntu13
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.99~git20110811.g93fc084-0ubuntu1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.15.901-1ubuntu3
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110411+8378443-1
Comment 1 Bryce Harrington 2012-01-18 01:01:02 UTC
Created attachment 55718
BootDmesg.txt
Comment 2 Bryce Harrington 2012-01-18 01:01:27 UTC
Created attachment 55719
CurrentDmesg.txt
Comment 3 Bryce Harrington 2012-01-18 01:01:40 UTC
Created attachment 55720
i915_error_state.txt
Comment 4 Bryce Harrington 2012-01-18 01:01:52 UTC
Created attachment 55721
XorgLog.txt
Comment 5 Chris Wilson 2012-04-16 04:53:21 UTC
This is likely to have been fixed by:

commit c501ae7f332cdaf42e31af30b72b4b66cbbb1604
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Dec 14 13:57:23 2011 +0100

    drm/i915: Only clear the GPU domains upon a successful finish
    
    By clearing the GPU read domains before waiting upon the buffer, we run
    the risk of the wait being interrupted and the domains prematurely
    cleared. The next time we attempt to wait upon the buffer (after
    userspace handles the signal), we believe that the buffer is idle and so
    skip the wait.
    
    There are a number of bugs across all generations which show signs of an
    overly hasty reuse of active buffers.
    
    Such as:
    
      https://bugs.freedesktop.org/show_bug.cgi?id=29046
      https://bugs.freedesktop.org/show_bug.cgi?id=35863
      https://bugs.freedesktop.org/show_bug.cgi?id=38952
      https://bugs.freedesktop.org/show_bug.cgi?id=40282
      https://bugs.freedesktop.org/show_bug.cgi?id=41098
      https://bugs.freedesktop.org/show_bug.cgi?id=41102
      https://bugs.freedesktop.org/show_bug.cgi?id=41284
      https://bugs.freedesktop.org/show_bug.cgi?id=42141
    
    A couple of those pre-date i915_gem_object_finish_gpu(), so may be
    unrelated (such as a wild write from a userspace command buffer), but
    this does look like a convincing cause for most of those bugs.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

