Bug 36617 - Clarkdale GPU hung - potential 2.6.38 regression
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 7.5 (2009.10)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Chris Wilson
QA Contact: Xorg Project Team
Reported: 2011-04-26 14:31 UTC by Martin
Modified: 2011-04-27 11:28 UTC
Attachments
archive containing /sys/kernel/debug/dri/0/ (59.32 KB, application/x-xz)
2011-04-26 14:38 UTC, Martin

Description Martin 2011-04-26 14:31:28 UTC
There have been a few reports on LKML about a suspected 2.6.38 regression. For me it is very sporadic, but I've just collected the following in the syslog:

Apr 26 22:58:56 arnold kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Apr 26 22:58:56 arnold kernel: [drm:init_ring_common] *ERROR* failed to set render ring head to zero ctl 00000000 head 2da1273c tail 00000000 start 00001000
Apr 26 22:58:56 arnold kernel: [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f003 head 2da1273c tail 00000000 start 00001000
Apr 26 23:00:28 arnold kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung

X had crashed but could be restarted. Desktop effects were dodgy and a full reboot was required.

I have saved the full /sys/kernel/debug/dri/0/ directory, of which I am going to attach i915_error_state for now.

Chipset is Clarkdale with an i3 530 CPU. Kernel version is 2.6.38.4.
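
For anyone checking their own logs for the same signature, here is a minimal sketch that counts the hangcheck messages and pulls the ring register values out of the init_ring_common lines. The sample text is copied from the syslog excerpt above; the regular expressions and the summarize() helper are illustrative assumptions written for this report, not part of any i915 tooling, and other kernel versions may word these messages differently.

```python
import re

# Sample lines copied verbatim from the syslog excerpt in this report.
LOG = """\
Apr 26 22:58:56 arnold kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Apr 26 22:58:56 arnold kernel: [drm:init_ring_common] *ERROR* failed to set render ring head to zero ctl 00000000 head 2da1273c tail 00000000 start 00001000
Apr 26 22:58:56 arnold kernel: [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f003 head 2da1273c tail 00000000 start 00001000
Apr 26 23:00:28 arnold kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
"""

# Hypothetical patterns matching the message format seen above.
HANG_RE = re.compile(r"\[drm:i915_hangcheck_elapsed\] \*ERROR\* .*GPU hung")
RING_RE = re.compile(
    r"ctl (?P<ctl>[0-9a-f]{8}) head (?P<head>[0-9a-f]{8}) "
    r"tail (?P<tail>[0-9a-f]{8}) start (?P<start>[0-9a-f]{8})"
)


def summarize(log_text):
    """Return (number of GPU hang reports, list of parsed ring register dicts)."""
    hangs = 0
    rings = []
    for line in log_text.splitlines():
        if HANG_RE.search(line):
            hangs += 1
        match = RING_RE.search(line)
        if match:
            # Register values are printed as hex without a 0x prefix.
            rings.append({name: int(value, 16) for name, value in match.groupdict().items()})
    return hangs, rings


hangs, rings = summarize(LOG)
print("GPU hang reports:", hangs)
for ring in rings:
    print({name: hex(value) for name, value in ring.items()})
```

To scan a real log, the LOG string could be replaced with the contents of the syslog file.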
Comment 1 Martin 2011-04-26 14:38:34 UTC
Created attachment 46099
archive containing /sys/kernel/debug/dri/0/

Since the uncompressed i915_error_state is way too large, I have attached the compressed debugfs archive instead.
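
A rough sketch of how such an archive could be produced, in case it helps other reporters. The archive_debugfs() helper and the output file name are assumptions for illustration, not the method actually used for the attachment. The file contents are read explicitly because debugfs pseudo-files typically report a size of zero, so an archiver that trusts the reported size may store them empty. Reading /sys/kernel/debug normally requires root and a mounted debugfs.

```python
import io
import tarfile
from pathlib import Path


def archive_debugfs(src_dir, out_path):
    """Pack src_dir into an xz-compressed tarball.

    Returns the list of files that could not be read and were skipped.
    """
    src = Path(src_dir)
    skipped = []
    with tarfile.open(out_path, "w:xz") as tar:
        for path in sorted(src.rglob("*")):
            if not path.is_file():
                continue
            try:
                # Read the data up front so the real length is known.
                data = path.read_bytes()
            except OSError:
                skipped.append(str(path))
                continue
            # Store paths relative to the parent, e.g. "0/i915_error_state".
            info = tarfile.TarInfo(name=str(path.relative_to(src.parent)))
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return skipped


# Example call (hypothetical output name; run as root):
# archive_debugfs("/sys/kernel/debug/dri/0", "dri0-debugfs.tar.xz")
```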
Comment 2 Chris Wilson 2011-04-27 00:28:18 UTC
Dormant bug in the ddx (xf86-video-intel) since the dawn of time. Finally fixed in 2.15.0:

commit 23f9b14df7c102c1036134835dd5d1a508059858
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Feb 12 10:42:34 2011 +0000

    i965: Remove broken maximum base addresses from video
    
    WRONG.
    
    The hardware was never limited to 0x1000000 and the kernel can quite
    rightly place objects above that limit. Specifying such had no relation
    to reality, so why did we do it? TWICE!
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34017
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
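
To put the number in the commit message in perspective, a small illustrative sketch follows. Only the 0x1000000 value comes from the commit; the sample offsets and the exceeds_bogus_limit() helper are hypothetical, and this is plain arithmetic rather than a model of how the driver programmed the hardware.

```python
# The limit named in the commit message.
BOGUS_MAX_BASE = 0x1000000

MIB = 1024 * 1024


def exceeds_bogus_limit(offset):
    """True if an object placed at this offset lies above the old limit."""
    return offset > BOGUS_MAX_BASE


print("limit:", BOGUS_MAX_BASE // MIB, "MiB")

# Hypothetical placements chosen only to show both sides of the limit.
for offset in (8 * MIB, 64 * MIB):
    print(hex(offset), "above limit:", exceeds_bogus_limit(offset))
```

The limit works out to 16 MiB, so any object the kernel places higher than that falls outside the range the old code declared.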
Comment 3 Martin 2011-04-27 11:28:06 UTC
Thanks, upgrading now.

