Well, again a hung GPU.

dmesg:
[41621.520021] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[41621.520268] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[41621.523678] [drm:i915_reset] *ERROR* Failed to reset chip.

Xorg log:
[ 41621.655] (EE) intel(0): Detected a hung GPU, disabling acceleration.
[ 41621.656] (EE) intel(0): When reporting this, please include i915_error_state from debugfs and the full dmesg.
[

Switching to the framebuffer console and back worked, but the display was completely distorted. Restarting Xorg without a system reboot worked.

cu, knut
Created attachment 52665 [details] dmesg
Created attachment 52666 [details] i915_error_state
Created attachment 52667 [details] Xorg log

Xorg: git tree, fetched and compiled on October 23, last intel driver commit a18f559961135fa288dda3b94207abb0b6d4d302
Kernel: Linux 3.0.7
Hardware: AOpen i915GMm-hfs with Pentium M Dothan 1.86 GHz CPU
OS: openSUSE 11.4
batchbuffer at 0x09d43000:
0x09d43000: 0x00000000: MI_NOOP
0x09d43004: 0x00000000: MI_NOOP
0x09d43008: 0x00000000: MI_NOOP
0x09d4300c: 0x00000000: MI_NOOP
0x09d43010: 0x00000000: MI_NOOP
0x09d43014: 0x00000000: MI_NOOP
0x09d43018: 0x00000000: MI_NOOP
0x09d4301c: 0x00000000: MI_NOOP
0x09d43020: 0x00000000: MI_NOOP
...
0x09d43d10: 0x3f800000: UNKNOWN
0x09d43d14: 0x3f800000: UNKNOWN
0x09d43d18: HEAD 0x00000000: MI_NOOP
0x09d43d1c: 0x00000000: MI_NOOP
0x09d43d20: 0x00000000: MI_NOOP
0x09d43d24: 0x00000000: MI_NOOP
0x09d43d28: 0x00000000: MI_NOOP
0x09d43d2c: 0x3f800000: UNKNOWN
0x09d43d30: 0x3f800000: UNKNOWN
0x09d43d34: 0x3f800000: UNKNOWN
0x09d43d38: 0x3f800000: UNKNOWN
0x09d43d3c: 0x3f800000: UNKNOWN
0x09d43d40: 0x00000000: MI_NOOP
0x09d43d44: 0x00000000: MI_NOOP
0x09d43d48: 0x000000aa: MI_NOOP
0x09d43d4c: 0x7d040400: 3DSTATE_LOAD_STATE_IMMEDIATE_1
0x09d43d50: 0x00008266: S6: alpha_test=always, alpha_ref=0x0, depth_test=always, cbuf blend enable, src_blnd_fct=one, dst_blnd_fct=inv_src_alpha, cbuf write enable, tristrip_provoking_vertex=2
0x09d43d54: 0x7c800003: 3DSTATE_SCISSOR_ENABLE enabled
0x09d43d58: 0x7d810001: 3DSTATE_SCISSOR_RECTANGLE
0x09d43d5c: 0x01040009: (9,260)
0x09d43d60: 0x00000000: (0,0)

which is a weird mixture of overwritten and stale contents. It doesn't seem to be aligned to fence pitches.
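A side note that may help when reading the dump: the dwords the decoder labels UNKNOWN all have the value 0x3f800000, which is the IEEE-754 single-precision encoding of 1.0. That would be consistent with leftover float data sitting among the commands, but this is only an assumption; the error state itself does not say where the values came from. The helper below (dword_as_float is a made-up name for illustration) is a minimal sketch of the reinterpretation:

```c
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret a raw 32-bit dword from the dump as an IEEE-754
 * single-precision float. memcpy avoids strict-aliasing problems.
 * Hypothetical helper, not part of any decoder tool.
 */
static float dword_as_float(uint32_t dword)
{
    float f;

    memcpy(&f, &dword, sizeof f);
    return f;
}
```

Passing 0x3f800000 to this helper yields 1.0f on any platform with IEEE-754 floats.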
Created attachment 52676 [details] another i915_error_state
Created attachment 52677 [details] another xorg log

dmesg:
[ 8113.460031] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 8113.460269] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 8113.462038] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 882756 at 882701, next 882757)
[ 8113.462775] [drm:i915_reset] *ERROR* Failed to reset chip.

Same server, same kernel. Shall I try to reproduce the problem with some special debug parameters?

cu, Knut
@Knut - so this is a regression? If yes, could you please bisect it within the xf86-video-intel driver?

But besides that, from the latest X.log, the following seems suspicious:

[ 8111.354] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 8111.355] Backtrace:
[ 8111.355] 0: /usr/bin/Xorg (xorg_backtrace+0x2e) [0x81d5eee]
[ 8111.355] 1: /usr/bin/Xorg (mieqEnqueue+0x13e) [0x81b459e]
[ 8111.355] 2: /usr/bin/Xorg (QueuePointerEvents+0x5d) [0x809433d]
[ 8111.355] 3: /usr/bin/Xorg (xf86PostMotionEventM+0xdd) [0x80cd2dd]
[ 8111.356] 4: /usr/lib/xorg/modules/input/evdev_drv.so (0xb721c000+0x6954) [0xb7222954]
[ 8111.356] 5: /usr/bin/Xorg (0x8048000+0x740b1) [0x80bc0b1]
[ 8111.356] 6: /usr/bin/Xorg (0x8048000+0x9a484) [0x80e2484]
[ 8111.356] 7: (vdso) (__kernel_sigreturn+0x0) [0xb7883400]
[ 8111.356] 8: /usr/lib/libdrm_intel.so.1 (drm_intel_gem_bo_map_gtt+0x67) [0xb71fd6a7]
[ 8111.356] 9: /usr/lib/xorg/modules/drivers/intel_drv.so (0xb722b000+0x12340) [0xb723d340]
[ 8111.356] 10: /usr/lib/xorg/modules/drivers/intel_drv.so (0xb722b000+0x2d5d4) [0xb72585d4]
[ 8111.357] 11: /usr/bin/Xorg (0x8048000+0x17cb64) [0x81c4b64]
[ 8111.357] 12: /usr/bin/Xorg (0x8048000+0xc8b7f) [0x8110b7f]
[ 8111.357] 13: /usr/bin/Xorg (0x8048000+0x2dfe7) [0x8075fe7]
[ 8111.357] 14: /usr/bin/Xorg (0x8048000+0x3098f) [0x807898f]
[ 8111.357] 15: /usr/bin/Xorg (0x8048000+0x1e26d) [0x806626d]
[ 8111.357] 16: /lib/libc.so.6 (__libc_start_main+0xfe) [0xb743fc2e]
[ 8113.593] (EE) intel(0): Detected a hung GPU, disabling acceleration.

There were lots of changes in X.org input in the past weeks; I wonder if it could come from one of those somehow?
(In reply to comment #7)
> @Knut - so this is a regression? If yes, could you please bisect it within the
> xf86-video-intel driver?

Well, I do not know if it is a regression. Half a year ago I switched off KDE desktop effects because of similar "composite" related problems, see bug #36151. Now I had a little time and tried to reenable them, with little success. Maybe it's still broken, maybe it's broken again.

>
> But besides that, from the latest X.log, the following seems suspicious:
>

I wonder if this is only one bug.

> There were lots of changes in X.org input in the past weeks, I wonder if it
> could come from one of those somehow?

Well, I'll try to find a "known good" starting point.

cu, Knut
I believe these are all related to the underlying bug:

commit c501ae7f332cdaf42e31af30b72b4b66cbbb1604
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Dec 14 13:57:23 2011 +0100

    drm/i915: Only clear the GPU domains upon a successful finish

    By clearing the GPU read domains before waiting upon the buffer, we run
    the risk of the wait being interrupted and the domains prematurely
    cleared. The next time we attempt to wait upon the buffer (after
    userspace handles the signal), we believe that the buffer is idle and so
    skip the wait.

    There are a number of bugs across all generations which show signs of an
    overly haste reuse of active buffers.

    Such as:

      https://bugs.freedesktop.org/show_bug.cgi?id=29046
      https://bugs.freedesktop.org/show_bug.cgi?id=35863
      https://bugs.freedesktop.org/show_bug.cgi?id=38952
      https://bugs.freedesktop.org/show_bug.cgi?id=40282
      https://bugs.freedesktop.org/show_bug.cgi?id=41098
      https://bugs.freedesktop.org/show_bug.cgi?id=41102
      https://bugs.freedesktop.org/show_bug.cgi?id=41284
      https://bugs.freedesktop.org/show_bug.cgi?id=42141

    A couple of those pre-date i915_gem_object_finish_gpu(), so may be
    unrelated (such as a wild write from a userspace command buffer), but
    this does look like a convincing cause for most of those bugs.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
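To make the ordering problem in that commit message concrete, here is a small userspace toy model in C. It is not the kernel code: struct toy_bo, toy_wait(), finish_gpu_buggy() and finish_gpu_fixed() are all hypothetical names invented for illustration, and the "signal" is just a flag passed in by the caller. It only sketches the sequence the commit message describes: clear the domains, get interrupted, then skip the wait on retry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; not the kernel's struct. */
struct toy_bo {
    unsigned read_domains_gpu; /* nonzero while the GPU may still read it */
    bool gpu_busy;             /* true until a wait has really completed */
};

/* Toy wait: fails with -EINTR if a signal is pending, else the GPU finishes. */
static int toy_wait(struct toy_bo *bo, bool signal_pending)
{
    if (signal_pending)
        return -EINTR;
    bo->gpu_busy = false;
    return 0;
}

/* Problematic order: the domains are cleared before the wait has succeeded. */
static int finish_gpu_buggy(struct toy_bo *bo, bool signal_pending)
{
    if (!bo->read_domains_gpu)
        return 0;               /* buffer looks idle, so the wait is skipped */
    bo->read_domains_gpu = 0;   /* cleared too early */
    return toy_wait(bo, signal_pending);
}

/* Order described by the commit: clear the domains only after a successful wait. */
static int finish_gpu_fixed(struct toy_bo *bo, bool signal_pending)
{
    int ret;

    if (!bo->read_domains_gpu)
        return 0;
    ret = toy_wait(bo, signal_pending);
    if (ret)
        return ret;             /* interrupted: domains stay set for the retry */
    bo->read_domains_gpu = 0;
    return 0;
}

/* Walks both variants through "interrupted wait, then retry". */
static void toy_demo(void)
{
    struct toy_bo a = { .read_domains_gpu = 1, .gpu_busy = true };
    struct toy_bo b = { .read_domains_gpu = 1, .gpu_busy = true };

    assert(finish_gpu_buggy(&a, true) == -EINTR);
    assert(finish_gpu_buggy(&a, false) == 0);
    assert(a.gpu_busy);         /* retry skipped the wait: still busy */

    assert(finish_gpu_fixed(&b, true) == -EINTR);
    assert(finish_gpu_fixed(&b, false) == 0);
    assert(!b.gpu_busy);        /* retry really waited */
}
```

Calling toy_demo() from a main() exercises both paths. In the buggy variant the retry returns success while the toy buffer is still busy, which is the kind of premature reuse of an active buffer the commit message points at.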