Summary: | [945gm] Display B: Invalid GTT PTE (enable plane too early?)
---|---
Product: | DRI
Component: | DRM/Intel
Status: | CLOSED FIXED
Severity: | normal
Priority: | medium
Version: | unspecified
Hardware: | x86 (IA32)
OS: | Linux (All)
Reporter: | Bryce Harrington <bryce>
Assignee: | Daniel Vetter <daniel>
QA Contact: |
CC: | ben, chris, daniel, eugeni, florian, freedesktop-bugzilla, jbarnes, przanoni
Whiteboard: |
i915 platform: |
i915 features: |

Description
Bryce Harrington
2011-08-08 19:11:45 UTC
Created attachment 50048: BootDmesg.txt
Created attachment 50049: CurrentDmesg.txt
Created attachment 50050: XorgLog.txt
Created attachment 50051: i915_error_state.txt

They are still bugs, in some ways much more frightening than performing an undefined operation - the chip has detected that we are accessing invalid memory. Who knows what illegal accesses we did before the invalid access! The trick to determine if the GPU is truly wedged would be to cat /sys/kernel/debug/dri/0/i915_wedged (or you can try issuing a throttle command and look for an EIO error code).

From what I've seen, most of the false gpu hang reports have a hang which occurs late during boot, basically right at the point that the drm driver is loaded. Could the issue be that some memory is not being initialized, or a race condition in initialization?

Do you have an idea if this problem is unique to Ubuntu? I'm wondering if it boils down to some boot optimization we did ourselves, or if it is a legitimate bug in the driver?

(In reply to comment #6)
> Do you have an idea if this problem is unique to Ubuntu? I'm wondering if it
> boils down to some boot optimization we did ourselves, or if it is a legitimate
> bug in the driver?

Don't know if it will help, but I haven't seen such issues in Mandriva/Mageia while maintaining their mesa/X/init stacks. At the same time, we have seen similar issues when booting Ubuntu on same hardware for reference. Don't know if it is a coincidence (as compile flags, versions and so on do not match always), but Ubuntu was the only one to show this. But I admit that I could be wrong, and I certainly haven't tested it in-depth.

I lowered the priority a bit to have it in the same priority scale as other false GPU lockups.

So Display B is unbound but enabled... Big time modesetting screwup.

Shouldn't the sanitize function have disabled the planes?? If so this should be fixed right?

Well, we're still seeing false lockups, although not exactly the same set of error codes as this bug.

[i915gm] False GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00000010 render.IPEHR: 0x01000000
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/981171

[IGDgm] False GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00010000 render.IPEHR: 0x01000000
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/978968 (+4 dupes)

[i965gm] GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00000100
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/982021

[gm45] GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00100000
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/981297

The latter two sound like actual misbehaviors happened. Would you prefer I file new upstream reports on each of these, or do they seem like the same issue?

Btw, for comparison, there were 142 bugs collected last cycle as dupes of this:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/828684

My old favourite:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_d
index e0e8cb5..7978e41 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -5846,7 +5846,6 @@ static int i9xx_crtc_mode_set(struct drm_crtc *crtc,
        I915_WRITE(DSPCNTR(plane), dspcntr);
        POSTING_READ(DSPCNTR(plane));
-       intel_enable_plane(dev_priv, plane, pipe);
        ret = intel_pipe_set_base(crtc, x, y, old_fb);

These bugs all have similar symptoms that could be explained and fixed by the following patch. So please do test drm-intel-next-queued and report back. On trying the equivalent patch in the past, it has caused modesetting regression for the initial switch from the BIOS configuration, so do look out for any glitches during boot. Thanks.

commit 969d380a39d33f7533b6dcee35e834109d23f9e9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 24 16:36:50 2012 +0100

    drm/i915: Remove too early plane enable on pre-PCH hardware

    Enabling the plane before we have assigned valid address means that it
    will access random PTE (often with conflicting memory types) and cause
    GPU lockups.

    However, enabling the plane too early appears to workaround a number of
    bugs in our modesetting code.

    Cc: Franz Melchior <melchior.franz@gmail.com>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=39947
    References: https://bugs.freedesktop.org/show_bug.cgi?id=41091
    References: https://bugs.freedesktop.org/show_bug.cgi?id=49041
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

A patch referencing this bug report has been merged in Linux v3.5-rc1:

commit c7bd4c25650704d4d065eb4ce2a122d2a80ce804
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 24 16:36:50 2012 +0100

    drm/i915: Remove too early plane enable on pre-PCH hardware
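
For anyone retesting with the patch, the i915_wedged check suggested earlier in this thread can be scripted. The following is only a minimal sketch, not an official tool: the debugfs path is the one quoted in the thread, but the helper names `parse_wedged` and `gpu_is_wedged` are made up here, and the file is assumed to report the wedged flag as a decimal number (possibly after a label). The exact format may differ between kernel versions, and debugfs is normally readable by root only.

```python
import re
from pathlib import Path

# Path quoted in this thread; debugfs must be mounted and readable.
WEDGED_PATH = Path("/sys/kernel/debug/dri/0/i915_wedged")


def parse_wedged(text):
    """Return the first integer found in the file contents, or None.

    Assumption: the wedged flag is printed in decimal, possibly after a
    label. This is a guess about the format, not a documented guarantee.
    """
    match = re.search(r"-?\d+", text)
    return int(match.group()) if match else None


def gpu_is_wedged(path=WEDGED_PATH):
    """True if the flag is non-zero, False if zero, None if unknown."""
    try:
        value = parse_wedged(path.read_text())
    except OSError:
        # Missing file, debugfs not mounted, or insufficient permissions.
        return None
    return None if value is None else value != 0
```

A result of `None` means the state could not be determined, which should not be read as "not wedged".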
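
On the question of whether the Launchpad reports listed in this thread are the same issue, the register values in their titles can at least be compared mechanically. This is a rough triage sketch, assuming the titles keep the `NAME: 0x...` layout seen in this thread; `parse_title` and `group_by_register` are hypothetical helper names, and the code makes no attempt to interpret what the individual register bits mean.

```python
import re
from collections import defaultdict

# Titles copied from the Launchpad reports listed in this thread.
TITLES = [
    "[i915gm] False GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00000010 render.IPEHR: 0x01000000",
    "[IGDgm] False GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00010000 render.IPEHR: 0x01000000",
    "[i965gm] GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00000100",
    "[gm45] GPU lockup EIR: 0x00000010 PGTBL_ER: 0x00100000",
]

REGISTER = re.compile(r"([\w.]+):\s*(0x[0-9a-fA-F]+)")
CHIPSET = re.compile(r"^\[(\w+)\]")


def parse_title(title):
    """Split a bug title into (chipset tag, {register name: value})."""
    chipset = CHIPSET.match(title)
    registers = {name: int(value, 16) for name, value in REGISTER.findall(title)}
    return (chipset.group(1) if chipset else None), registers


def group_by_register(titles, register):
    """Map each distinct value of `register` to the chipsets reporting it."""
    groups = defaultdict(list)
    for title in titles:
        chipset, registers = parse_title(title)
        groups[registers.get(register)].append(chipset)
    return dict(groups)
```

Grouping the four titles by `EIR` puts them all in one bucket (0x10), while grouping by `PGTBL_ER` yields four different values. Whether that difference matters for deduplication is a call for the driver developers.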