Summary: | [snb] GPU hang IPEHR: 0x7a000002 | |
---|---|---|---
Product: | xorg | Reporter: | Jan Alexander Steffens (heftig) <jan.steffens>
Component: | Driver/intel | Assignee: | Chris Wilson <chris>
Status: | VERIFIED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal | |
Priority: | medium | CC: | intel-gfx-bugs, ziktofel
Version: | git | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Attachments: | | |

It appears that the binding table for the source is a stale value (and points before the start of the batch). This is impossible - so perhaps a use-after-free?

commit 7df3da10e744d7f168ea3f30b21c434f99beae17
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 29 13:06:08 2014 +0000

sna/gen4+: Assert that the cached binding location is valid

We can at least check that it is in the right region (i.e. not past where the current surface has been allocated from).

References: https://bugs.freedesktop.org/show_bug.cgi?id=74176
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

That should catch this particular error, but I hope compiling with assertions enabled (--enable-debug) will detect the fault much earlier.

Xorg is crashing a lot as well, now. Possibly related? I think the hanging and crashing started with a recent change to xf86-video-intel.

Xorg log:
[ 11820.893] (EE) Backtrace:
[ 11820.893] (EE) 0: /usr/bin/Xorg (xorg_backtrace+0x48) [0x5853a8]
[ 11820.893] (EE) 1: /usr/bin/Xorg (0x400000+0x189369) [0x589369]
[ 11820.893] (EE) 2: /usr/lib/libpthread.so.0 (0x7fb1e9138000+0xf870) [0x7fb1e9147870]
[ 11820.893] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fb1e39d6000+0x289da) [0x7fb1e39fe9da]
[ 11820.893] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fb1e39d6000+0xca5db) [0x7fb1e3aa05db]
[ 11820.893] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fb1e39d6000+0xcad96) [0x7fb1e3aa0d96]
[ 11820.893] (EE) 6: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fb1e39d6000+0xce2e6) [0x7fb1e3aa42e6]
[ 11820.893] (EE) 7: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fb1e39d6000+0x5b8b2) [0x7fb1e3a318b2]
[ 11820.893] (EE) 8: /usr/bin/Xorg (0x400000+0x105d31) [0x505d31]
[ 11820.893] (EE) 9: /usr/bin/Xorg (0x400000+0x35f8e) [0x435f8e]
[ 11820.893] (EE) 10: /usr/bin/Xorg (0x400000+0x39d9a) [0x439d9a]
[ 11820.893] (EE) 11: /usr/lib/libc.so.6 (__libc_start_main+0xf5) [0x7fb1e7dabb05]
[ 11820.893] (EE) 12: /usr/bin/Xorg (0x400000+0x2533e) [0x42533e]
[ 11820.893] (EE)
[ 11820.893] (EE) Segmentation fault at address 0x18

The lines from intel_drv.so are, in order:
src/intel_list.h:161
src/sna/gen6_render.c:1072
src/sna/gen6_render.c:2989
src/sna/gen6_render.c:3114
src/sna/sna_composite.c:987

I'll also compile with debug, next.

Yes, that smells of the same bo use-after-free. :(

Created attachment 92996: gdb log

Caught an assertion.
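
For readers following along, the region check described in commit 7df3da10 can be sketched roughly as below. This is only an illustration, not the driver's code: `struct batch_sketch`, its fields, and both function names are hypothetical, and the layout (commands growing up from the start of the batch, surface state allocated downward from the end) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified batch layout (NOT the driver's real struct).
 * Assumption: commands grow upward from dword 0 and surface state is
 * allocated downward from the end of the batch. */
struct batch_sketch {
    uint32_t size;    /* total batch length in dwords */
    uint32_t nbatch;  /* next free command dword */
    uint32_t surface; /* lowest dword currently used by surface state */
};

/* A cached binding-table offset is only plausible if it lies inside the
 * region that surface state has actually been allocated from. */
static bool binding_offset_is_valid(const struct batch_sketch *b,
                                    uint32_t offset)
{
    return offset >= b->surface && offset < b->size;
}

/* With assertions compiled in, a stale offset aborts here instead of
 * reaching the GPU and hanging the ring much later. */
static uint32_t use_cached_binding(const struct batch_sketch *b,
                                   uint32_t cached_offset)
{
    assert(binding_offset_is_valid(b, cached_offset));
    return cached_offset;
}
```

A stale offset left over from a freed bo would typically fall outside `[surface, size)` and trip the assertion at the point of use, which is presumably why a debug build caught it so quickly.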
D'oh, that was silly.

commit d70620d9789da1cf983dac318d9ca9149f11ff20
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 29 13:39:20 2014 +0000

sna: We can only retire a bo if is not referenced by the current batch

Fixes regression from
commit 8b0ebebcab21647348f769c25ca0c1d81d169e75
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Jan 28 16:30:47 2014 +0000

sna: Be a little more assertive in retiring after set-domain

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74176
Reported-by: Jan Alexander Steffens <jan.steffens@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

That should set everything back to normal...

Yep, I think that did it.
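
The rule named in the subject of commit d70620d9 can be illustrated with a small sketch. Everything here is hypothetical (`struct bo_sketch`, its flags, and `retire_after_set_domain` are invented for the example and do not mirror the driver's data structures); it only shows the shape of the guard: a buffer that the unsubmitted batch still references must not be retired, even if the GPU has finished all earlier work on it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buffer-object bookkeeping (NOT the driver's real struct). */
struct bo_sketch {
    bool in_current_batch; /* referenced by commands not yet submitted */
    bool needs_retire;     /* still tracked as possibly busy on the GPU */
};

/* Called after a set-domain style wait has confirmed that the GPU is done
 * with previously submitted work on this bo. Returns true if the bo was
 * retired, false if it had to be left alone. */
static bool retire_after_set_domain(struct bo_sketch *bo)
{
    assert(bo);

    /* The pending batch still needs this bo: retiring it now would allow
     * it to be reused or freed while the batch points at it, which is
     * the use-after-free pattern suspected in this report. */
    if (bo->in_current_batch)
        return false;

    bo->needs_retire = false;
    return true;
}
```

Skipping the retire costs nothing in correctness: in this sketch the bo simply stays tracked until a later call finds it no longer referenced by the pending batch.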
Created attachment 92991: /sys/class/drm/card0/error

Arch Linux x86_64
Thinkpad X220 (SNB)
Kernel: 3.13.0
Mesa: 10.0.2
Xorg: 1.15.0
xf86-video-intel: 2.99.907-62-g872468a

Using GNOME Shell and Firefox.

dmesg:
[ 8675.044652] [drm] stuck on render ring
[ 8675.044660] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 8675.044661] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 8675.044663] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 8675.044664] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 8675.044665] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 8675.047739] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x741000 ctx 0) at 0x742268

Error state attached.