Bug 35198

Summary: [gm45] GPU hang (after resume)
Product: xorg Reporter: Bryce Harrington <bryce>
Component: Driver/intel Assignee: Chris Wilson <chris>
Status: RESOLVED INVALID QA Contact: Xorg Project Team <xorg-team>
Severity: major    
Priority: medium CC: ilan
Version: 7.6 (2010.12)   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
XorgLog.txt none
CurrentDmesg.txt none
BootDmesg.txt none

Description Bryce Harrington 2011-03-10 23:29:36 UTC
Forwarding this bug from Ubuntu reporter Ilan:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/733073

[Problem]
The laptop went to sleep after reaching the idle time set for shutting off the display.  When I attempted to wake it, the screen remained black instead of showing the expected gnome-screensaver login prompt, and it stayed black until the system was rebooted.  I was able to switch to a console, but X did not respond to attempts to kill, stop, or restart it.

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-4ubuntu1
ProcVersionSignature: Ubuntu 2.6.38-6.34-generic 2.6.38-rc7
Uname: Linux 2.6.38-6-generic x86_64
Architecture: amd64
Chipset: gm45
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
Date: Thu Mar 10 20:08:41 2011
DistUpgraded: Log time: 2011-02-28 21:59:48.714207
DistroCodename: natty
DistroVariant: ubuntu
DumpSignature: 818ac509 ()
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
GraphicsCard:
 Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Dell Device [1028:0233]
   Subsystem: Dell Device [1028:0233]
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
InterpreterPath: /usr/bin/python2.7
MachineType: Dell Inc. Latitude E6400
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:
 
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-6-generic root=UUID=5de52be4-136e-4782-a03b-6257e7eccac0 ro quiet splash vt.handoff=7
ProcKernelCmdLine_: BOOT_IMAGE=/boot/vmlinuz-2.6.38-6-generic root=UUID=5de52be4-136e-4782-a03b-6257e7eccac0 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 xserver-xorg             1:7.6~3ubuntu11
 libdrm2                  2.4.23-1ubuntu3
 xserver-xorg-video-intel 2:2.14.0-4ubuntu1
Renderer: Unknown
SourcePackage: xserver-xorg-video-intel
Title: [gm45] GPU lockup 818ac509 ()
UpgradeStatus: Upgraded to natty on 2011-03-02 (8 days ago)
UserGroups:
 
dmi.bios.date: 08/19/2010
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A27
dmi.board.name: 0W620R
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA27:bd08/19/2010:svnDellInc.:pnLatitudeE6400:pvr:rvnDellInc.:rn0W620R:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude E6400
dmi.sys.vendor: Dell Inc.
i915_error_state: Error: [Errno 12] Cannot allocate memory
version.compiz: compiz 1:0.9.4-0ubuntu4
version.libdrm2: libdrm2 2.4.23-1ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10.1-0ubuntu1
version.xserver-xorg: xserver-xorg 1:7.6~3ubuntu11
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.0-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-4ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu5
Comment 2 Bryce Harrington 2011-03-10 23:31:57 UTC
Created attachment 44340
XorgLog.txt
Comment 3 Bryce Harrington 2011-03-10 23:32:16 UTC
Created attachment 44341
CurrentDmesg.txt
Comment 4 Bryce Harrington 2011-03-10 23:32:35 UTC
Created attachment 44342
BootDmesg.txt
Comment 5 Chris Wilson 2011-03-11 07:58:16 UTC
We failed to read the i915_error_state and intel_gpu_dump is just useless, so we have no information as to what went wrong.
Comment 6 Bryce Harrington 2011-03-11 19:18:28 UTC
(In reply to comment #5)
> We failed to read the i915_error_state and intel_gpu_dump is just useless, so
> we have no information as to what went wrong.

Yeah I know, but we're getting a number of bug reports like this now.  How do you want us to collect the information you need?
Comment 7 Chris Wilson 2011-03-15 06:28:39 UTC
I think the ultimate solution is to make i915_error_state stop using seqfile and give it a more robust show(). If you can free up some memory and try again (for example, killall -9 X; cat i915_error_state), that usually works for me.
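
The retry described in this comment could be scripted roughly as follows. This is only a sketch under stated assumptions: debugfs is taken to be mounted at /sys/kernel/debug and the GPU to be DRM minor 0, and the helper name read_error_state and its free_memory callback are hypothetical, not part of any existing i915 or apport tool. It retries the read only when the failure is ENOMEM (the "[Errno 12] Cannot allocate memory" seen in the report) and re-raises anything else.

```python
import errno
import time

# Assumed location: debugfs mounted at /sys/kernel/debug, GPU on DRM minor 0.
ERROR_STATE_PATH = "/sys/kernel/debug/dri/0/i915_error_state"


def read_error_state(path=ERROR_STATE_PATH, attempts=3, delay=1.0,
                     free_memory=None):
    """Return the text of `path`, retrying when the read fails with ENOMEM.

    `free_memory` is an optional, hypothetical callback run between
    attempts to release memory (for example, by stopping X).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "r", errors="replace") as f:
                return f.read()
        except OSError as exc:
            # Give up on any error other than ENOMEM, or on the last attempt.
            if exc.errno != errno.ENOMEM or attempt == attempts:
                raise
            if free_memory is not None:
                free_memory()
            time.sleep(delay)
```

Reading the file normally requires root. A caller would typically write the returned text to a file right away so it can be attached to the bug report.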
Comment 8 Bryce Harrington 2011-03-18 12:40:44 UTC
I think you're right; a bit more poking shows that in these cases i915_error_state fails due to oom.

https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/542731

You can close this bug report, unless you'd like to use it for tracking the tool failure during oom.  The freeze bug itself is probably a dupe of one of the gm45 bugs I'm tracking separately, for which we're about to put in a fix.
Comment 9 Chris Wilson 2011-03-20 04:00:56 UTC
As regards the actual bug, we had an i915_error_state on the list that seemed to implicate HWS after resume.

So maybe this would be fixed by reverting a7a75c8f70d6f6a2f16c9f627f938bbee2d32718 as well.
Comment 10 Chris Wilson 2011-03-26 02:09:34 UTC
(In reply to comment #9)
> So maybe this would be fixed by reverting
> a7a75c8f70d6f6a2f16c9f627f938bbee2d32718 as well.

No, it wouldn't. That only affected physical HWS; gm45 is the first gen4 device to use a virtual HWS address.
Comment 11 Chris Wilson 2011-03-26 02:10:46 UTC
Dropping priority: no progress can be made until we capture some debugging info from the actual bug, so this shouldn't block the release.
Comment 12 Chris Wilson 2012-02-08 12:23:18 UTC
Mass status change to NEEDINFO based on presence of NEEDINFO keyword. Please reopen if you can still reproduce the bug and are able to provide the information requested, thanks.
Comment 13 Chris Wilson 2012-10-21 14:30:05 UTC
Timeout. Please do reopen if you can still reproduce the issue and help us diagnose the problem, thanks.
