Bug 48825 - [sandybridge-m-gt2+] GPU lockup render.IPEHR: 0xd208b6d0
Summary: [sandybridge-m-gt2+] GPU lockup render.IPEHR: 0xd208b6d0
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Daniel Vetter
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-17 09:59 UTC by Bryce Harrington
Modified: 2017-07-24 23:02 UTC

See Also:
i915 platform:
i915 features:


Attachments
BootDmesg.txt (89.06 KB, text/plain)
2012-04-17 10:00 UTC, Bryce Harrington
CurrentDmesg.txt (7.20 KB, text/plain)
2012-04-17 10:00 UTC, Bryce Harrington
i915_error_state.txt (2.05 MB, text/plain)
2012-04-17 10:00 UTC, Bryce Harrington
XorgLog.txt (67.53 KB, text/plain)
2012-04-17 10:00 UTC, Bryce Harrington

Description Bryce Harrington 2012-04-17 09:59:50 UTC
Forwarding this bug from Ubuntu reporter Jose Blanca:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/983146

[Problem]
Freezes and occasional screen corruption.  Occurs several times a day with no identified pattern.  Began immediately after a fresh install of Ubuntu Precise.

[Original Description]
Since installing precise, the system has been having problems with the video display. Sometimes the screen is corrupted and sometimes the X server hangs at random. The computer can usually be recovered by restarting lightdm and rebooting afterwards.
The computer is a Mac mini (mid-2011 model), and precise (beta 2) was installed without using Boot Camp (no BIOS emulation).

ProblemType: Crash
DistroRelease: Ubuntu 12.04
Package: xserver-xorg-video-intel 2:2.17.0-1ubuntu4
ProcVersionSignature: Ubuntu 3.2.0-23.36-generic 3.2.14
Uname: Linux 3.2.0-23-generic x86_64
.tmp.unity.support.test.0:
 
ApportVersion: 2.0.1-0ubuntu3
Architecture: amd64
Chipset: sandybridge-m-gt2+
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: None
Date: Mon Apr 16 16:59:48 2012
DistUpgraded: Fresh install
DistroCodename: precise
DistroVariant: ubuntu
DuplicateSignature: [sandybridge-m-gt2+] GPU lockup  render.IPEHR: 0xd208b6d0 Ubuntu 12.04
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
ExtraDebuggingInterest: Yes, whatever it takes to get this fixed in Ubuntu
GpuHangFrequency: Several times a day
GpuHangReproducibility: Seems to happen randomly
GpuHangStarted: Immediately after installing this version of Ubuntu
GraphicsCard:
 Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09) (prog-if 00 [VGA controller])
   Subsystem: Apple Inc. Device [106b:00e6]
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Beta amd64 (20120328)
InterpreterPath: /usr/bin/python2.7
MachineType: Apple Inc. Macmini5,1
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:
 
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic root=UUID=5228f36e-20df-43f4-8707-a6f7325e7535 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 xserver-xorg             1:7.6+12ubuntu1
 libdrm2                  2.4.32-1ubuntu1
 xserver-xorg-video-intel 2:2.17.0-1ubuntu4
SourcePackage: xserver-xorg-video-intel
Title: [sandybridge-m-gt2+] GPU lockup  render.IPEHR: 0xd208b6d0
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
 
dmi.bios.date: 01/24/2012
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MM51.88Z.0077.B10.1201241549
dmi.board.asset.tag: Base Board Asset Tag#
dmi.board.name: Mac-8ED6AF5B48C039E1
dmi.board.vendor: Apple Inc.
dmi.board.version: Macmini5,1
dmi.chassis.type: 16
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-8ED6AF5B48C039E1
dmi.modalias: dmi:bvnAppleInc.:bvrMM51.88Z.0077.B10.1201241549:bd01/24/2012:svnAppleInc.:pnMacmini5,1:pvr1.0:rvnAppleInc.:rnMac-8ED6AF5B48C039E1:rvrMacmini5,1:cvnAppleInc.:ct16:cvrMac-8ED6AF5B48C039E1:
dmi.product.name: Macmini5,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.
version.compiz: compiz 1:0.9.7.6-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.32-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 8.0.2-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 8.0.2-0ubuntu3
version.xserver-xorg-core: xserver-xorg-core 2:1.11.4-0ubuntu10
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.0-0ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.99~git20111219.aacbd629-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.17.0-1ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20111201+b5534a1-1build2
Comment 1 Bryce Harrington 2012-04-17 10:00:07 UTC
Created attachment 60180
BootDmesg.txt
Comment 2 Bryce Harrington 2012-04-17 10:00:18 UTC
Created attachment 60181
CurrentDmesg.txt
Comment 3 Bryce Harrington 2012-04-17 10:00:30 UTC
Created attachment 60182
i915_error_state.txt
Comment 4 Bryce Harrington 2012-04-17 10:00:42 UTC
Created attachment 60183
XorgLog.txt
Comment 5 Chris Wilson 2012-04-17 10:16:28 UTC
I'm going to say

commit c501ae7f332cdaf42e31af30b72b4b66cbbb1604
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Dec 14 13:57:23 2011 +0100

    drm/i915: Only clear the GPU domains upon a successful finish
    
    By clearing the GPU read domains before waiting upon the buffer, we run
    the risk of the wait being interrupted and the domains prematurely
    cleared. The next time we attempt to wait upon the buffer (after
    userspace handles the signal), we believe that the buffer is idle and so
    skip the wait.
    
    There are a number of bugs across all generations which show signs of an
    overly haste reuse of active buffers.

and then you say "But that kernel already contains a backport of that patch"... :)
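
The race described in that commit message can be illustrated with a small Python model. This is only a sketch, not the kernel code: the Buffer class, the Interrupted exception, and the function names are all hypothetical, and the interrupted wait is simulated with a flag. It shows why clearing the read domains before the wait lets a retry skip the wait while the GPU is still using the buffer.

```python
class Interrupted(Exception):
    """Stands in for a wait that was interrupted by a signal (hypothetical)."""


class Buffer:
    def __init__(self):
        self.gpu_read_domains = 0x2  # any non-zero value: "GPU may be reading"
        self.gpu_busy = True         # ground truth: GPU has not finished yet


def wait_rendering(buf, interrupted):
    # Simulated wait: either a signal arrives, or the GPU finishes.
    if interrupted:
        raise Interrupted()
    buf.gpu_busy = False


def finish_buggy(buf, interrupted):
    if buf.gpu_read_domains == 0:
        return                       # believed idle, so the wait is skipped
    buf.gpu_read_domains = 0         # cleared BEFORE the wait completes
    wait_rendering(buf, interrupted)


def finish_fixed(buf, interrupted):
    if buf.gpu_read_domains == 0:
        return
    wait_rendering(buf, interrupted)
    buf.gpu_read_domains = 0         # cleared only after a successful wait


def run(finish):
    buf = Buffer()
    try:
        finish(buf, interrupted=True)    # first attempt is interrupted
    except Interrupted:
        pass                             # userspace handles the signal
    finish(buf, interrupted=False)       # and then retries
    return buf.gpu_busy                  # True: buffer reused while active


print("buggy reuses active buffer:", run(finish_buggy))
print("fixed reuses active buffer:", run(finish_fixed))
```

In this model the buggy ordering leaves the buffer busy after the retry, while the fixed ordering waits for it properly.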
Comment 6 Bryce Harrington 2012-04-17 12:58:54 UTC
Okay.  "But that kernel already contains a backport of that patch, as commit 89df7051:"

commit 89df7051aab3884a734fc1eb666322643519c9c8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Dec 14 13:57:23 2011 +0100
Comment 7 Chris Wilson 2012-04-17 16:40:44 UTC
I don't recognise the repeating here, and another worrying aspect is the IPEHR != *ACTHD. Having identified the mesa/i965 XY_COLOR_BLT issue, this is likely to be another occurrence... Hopefully someone else may have some insight.
Comment 8 Chris Wilson 2012-04-18 06:06:49 UTC
Daniel spotted there is evidence of outstanding_lazy_request errors in that error-state, fixed by:

commit 53d227f282eb9fa4c7cdbfd691fa372b7ca8c4c3
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jan 25 16:32:49 2012 +0100

    drm/i915: fixup seqno allocation logic for lazy_request
    
    Currently we reserve seqnos only when we emit the request to the ring
    (by bumping dev_priv->next_seqno), but start using it much earlier for
    ring->oustanding_lazy_request. When 2 threads compete for the gpu and
    run on two different rings (e.g. ddx on blitter vs. compositor)
    hilarity ensued, especially when we get constantly interrupted while
    reserving buffers.
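
A rough Python model of the problem that commit message describes, under heavy simplification: next_seqno and outstanding_lazy_request are the names used in the commit message, but the Device and Ring classes and both helper functions are hypothetical and do not mirror the actual kernel implementation. The sketch only shows how two rings can end up with the same seqno when the value is used before it is reserved.

```python
class Device:
    def __init__(self):
        self.next_seqno = 1  # device-wide counter shared by all rings


class Ring:
    def __init__(self, name):
        self.name = name
        self.outstanding_lazy_request = 0  # 0 means "no seqno picked yet"


def lazy_seqno_unreserved(dev, ring):
    # Model of the problem: the ring starts using next_seqno,
    # but the counter is only bumped later, at emit time.
    if ring.outstanding_lazy_request == 0:
        ring.outstanding_lazy_request = dev.next_seqno
    return ring.outstanding_lazy_request


def lazy_seqno_reserved(dev, ring):
    # Model of the fix: reserve the seqno as soon as it is handed out.
    if ring.outstanding_lazy_request == 0:
        ring.outstanding_lazy_request = dev.next_seqno
        dev.next_seqno += 1
    return ring.outstanding_lazy_request


def compete(lazy_seqno):
    # Two rings ask for a lazy seqno before either emits a request.
    dev = Device()
    render, blt = Ring("render"), Ring("blt")
    return lazy_seqno(dev, render), lazy_seqno(dev, blt)


print("unreserved:", compete(lazy_seqno_unreserved))
print("reserved:  ", compete(lazy_seqno_reserved))
```

With the unreserved variant both rings receive the same number, which is the kind of cross-ring seqno confusion suspected in this error state.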
Comment 9 Daniel Vetter 2012-04-18 06:09:11 UTC
Ok, some more error-state analysis shows that
- last signalled seqno on the blt ring is 0x00244f73
- we have an object on the render ring with that seqno:
  0cc57000     4096 0006 0000 00244f73 dirty render snooped (LLC)
- the blt ring is filled with MI_FLUSH_DW

I blame the seqno confusion for that one. Please retest with

commit 53d227f282eb9fa4c7cdbfd691fa372b7ca8c4c3
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jan 25 16:32:49 2012 +0100

    drm/i915: fixup seqno allocation logic for lazy_request

which is merged into 3.4-rc1.
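
For anyone repeating this kind of error-state check, here is a minimal Python sketch of the comparison done above. It assumes the buffer lines have the column order seen in the quoted line (offset, size, read domains, write domain, seqno, then free-form flags that include the ring name); that layout and both function names are assumptions for illustration, not a documented format or an existing tool.

```python
def parse_buffer_line(line):
    """Split one buffer line from the error state into named fields.

    Assumed column order: offset, size, read domains, write domain,
    seqno, then flags such as 'dirty' or the ring name.
    """
    fields = line.split()
    return {
        "gtt_offset": int(fields[0], 16),
        "size": int(fields[1]),
        "read_domains": int(fields[2], 16),
        "write_domain": int(fields[3], 16),
        "seqno": int(fields[4], 16),
        "flags": fields[5:],
    }


def cross_ring_matches(buffer_lines, signalling_ring, last_seqno):
    """Return buffers that carry last_seqno but belong to another ring."""
    matches = []
    for line in buffer_lines:
        buf = parse_buffer_line(line)
        if buf["seqno"] == last_seqno and signalling_ring not in buf["flags"]:
            matches.append(buf)
    return matches


# The line quoted in this comment, checked against the blt ring's seqno.
lines = ["  0cc57000     4096 0006 0000 00244f73 dirty render snooped (LLC)"]
for buf in cross_ring_matches(lines, "blt", 0x00244F73):
    print(hex(buf["gtt_offset"]), buf["flags"])
```

A non-empty result means an object on one ring is tagged with a seqno last signalled on a different ring, as observed here.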

