I just got this UXA hang on a Gigabyte GA-EG45M-DS2H (G45) using:
  kernel: vanilla upstream 2.6.29-020629rc5-generic
  libdrm-intel1 and libdrm2: 2.4.4-0ubuntu6
  xserver-xorg-video-intel: 2:2.6.1-1ubuntu2
  libgl1-mesa-dev: 7.3-1ubuntu1

I can move the mouse, but the mouse cursor is no longer animating. I could still ssh into the box and capture logs (attaching).
Created attachment 23023 [details] gdb backtrace
Created attachment 23024 [details] dmesg (while hung)
Created attachment 23025 [details] gem_objects (while hung)
Created attachment 23026 [details] intel_reg_dumper (while hung)
Created attachment 23027 [details] i915_gem_interrupt (while hung)
Created attachment 23028 [details] xorg conf (while hung)
Created attachment 23029 [details] xorg log (while hung)
Created attachment 23030 [details] xorg.log.old (while hung)
Because the dmesg has this weird message about stack trap blah in compiz.real just before the crash happened, I also captured the compiz.real stack:

(gdb) info threads
  1 Thread 0x7f2256f58750 (LWP 9297)  0x00007f2254df16f3 in __select_nocancel () from /lib/libc.so.6
(gdb) bt full
#0  0x00007f2254df16f3 in __select_nocancel () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f225378f19e in _xcb_conn_wait (c=0x23eef40, cond=<value optimized out>, vector=0x0, count=0x0) at /build/buildd/libxcb-1.1.93/./src/xcb_conn.c:283
        ret = 0
        rfds = {__fds_bits = {16, 0 <repeats 15 times>}}
        wfds = {__fds_bits = {0 <repeats 16 times>}}
#2  0x00007f2253790c8c in xcb_wait_for_reply (c=0x23eef40, request=984088, e=0x7fff5ef86c28) at /build/buildd/libxcb-1.1.93/./src/xcb_in.c:376
        cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\0' <repeats 47 times>, __align = 0}
        reader = {request = 984088, data = 0x7fff5ef86ba0, next = 0x0}
        prev_reader = (reader_list **) 0x23efff8
        widened_request = <value optimized out>
        ret = (void *) 0x0
#3  0x00007f2254a55fbc in _XReply (dpy=0x23ee500, rep=0x7fff5ef86c70, extra=0, discard=0) at ../../src/xcb_io.c:454
        error = <value optimized out>
        c = (xcb_connection_t *) 0x23eef40
        __PRETTY_FUNCTION__ = "_XReply"
#4  0x00007f2255353ce2 in DRI2CopyRegion () from /usr/lib/libGL.so.1
No symbol table info available.
#5  0x00007f2255353a3f in ?? () from /usr/lib/libGL.so.1
No symbol table info available.
#6  0x00007f225532f2fb in ?? () from /usr/lib/libGL.so.1
No symbol table info available.
#7  0x00000000004123bf in eventLoop ()
No symbol table info available.
#8  0x000000000040d451 in main ()
No symbol table info available.
Are there reproducible steps for this?
No repro steps yet.
This information shows pretty much the generic "the chip is hung" state. There's not much to do without steps to reproduce.
Thanks for having a look Eric. Gordon, what's your policy? Should it be marked as "NEEDINFO" asking for a repro or do you prefer to close it? Question: Is it correct that once the chip hangs xorg can end up in many different stacks depending on what exactly xorg was doing _when_ the chip hung? I'm asking because I implicitly assumed that different stacks meant it was different bugs but when I think about it, that doesn't feel like a solid assumption when the code involves a GPU (I'm used to doing mostly user space apps).
Yes, there are piles of different stack traces you could have when apps end up waiting for the GPU to finish some task that it never finished.
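To illustrate the point above, here is a minimal Python sketch (not driver code) of why one hang can surface as many different backtraces. Everything in it is an assumption made up for illustration: `gpu_done` stands in for a GPU that never signals completion, and `copy_region` and `start_gtt_access` are hypothetical callers that only loosely echo the function names seen in the backtraces in this report.

```python
import threading
import traceback

# Never set: stands in for a GPU that hung and will not signal completion.
gpu_done = threading.Event()

def wait_for_gpu(timeout=0.05):
    """Wait for the simulated GPU; on timeout, return the stalled call chain."""
    if gpu_done.wait(timeout):
        return None
    # Function names leading to the stalled wait, outermost first.
    return [frame.name for frame in traceback.extract_stack()]

# Two hypothetical entry points that both end up waiting on the same GPU.
def copy_region():
    return wait_for_gpu()

def start_gtt_access():
    return wait_for_gpu()

def collect_stalled_stacks():
    """Run each entry point and record where it got stuck."""
    return {fn.__name__: fn() for fn in (copy_region, start_gtt_access)}

stacks = collect_stalled_stacks()
for entry, stack in stacks.items():
    print(entry, "->", " > ".join(stack[-2:]))
```

Both callers stall for the same reason, yet their call chains differ, which is why a matching or differing backtrace alone says little about the root cause.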
(In reply to comment #13)
> Gordon, what's your policy? Should it be marked as "NEEDINFO" asking for a
> repro or do you prefer to close it?

No. I can't mark it "NEEDINFO" to force you to provide info that you've already said you can't provide. We also can't close a bug just because there are no steady steps to reproduce it. So I think we should just leave this bug open, but it will probably be lower priority from the developers' point of view, since there's no clear info.
I'm getting this hang, too, on an X3100 and I get a similar backtrace. It usually happens right when I click a window, and it's being redrawn with the 'active' appearance. But it's rare and random.
Hmm, this isn't good... I got a freeze with exa while running Google Earth. Frame 1 was in drm_intel_gem_bo_start_gtt_access.
(In reply to comment #0)
> I can move the mouse but the mouse cursor is no longer animating. I could
> still ssh into the box and capture logs (attaching).

The same applies to the 915GM: I can move the mouse and ssh into the box, but not reboot.
  kernel: ubuntu-jaunty 2.6.28-11-generic
  libdrm 2.4.5
  xserver-xorg-video-intel-dbg 2.6.3
  mesa 7.3
The backtrace starts in drm_intel_gem_bo_start_gtt_access as well.
Brian: If you're reliably getting a hang from googleearth, please open your own bug for that issue so we can track and fix it. Everyone else: If you're looking at hopping in on this bug with "me too", please just open your own bug if you've got something specific you can do ("run this app, click this, go to this location", not "use the desktop for an hour") to reproduce. Just because the backtrace is the same doesn't mean the cause is the same, and your own bug means individual attention to your problem.
Adjusting severity: crashes & hangs should be marked critical.
If you're still experiencing this, could you use intel_gpu_dump on 2.6.30rc4 or newer when it's hung so we can look at what we did that angered the GPU? Also, note that there are some fixes in git master of the 2D driver that may help with GPU hangs.
Martin, we have many fixes for this kind of GPU hang in the latest xf86-video-intel driver and kernel. Could you try those? If the hang still occurs, please provide intel_gpu_dump output according to http://intellinuxgraphics.org/intel-gpu-dump.html.
When I first opened this bug I didn't know how hard it is to do something useful with a bug report for a GPU hang that lacks buffer dumps. I remember not seeing this bug again for at least a few weeks after I reported it (I was running the Ubuntu devel release back then, so my bits changed quite a lot). For the last few weeks, though (and also up until the end of Aug), my Intel G45 box has been, and will remain, packed away in a moving box in a storage facility. My suggestion is to close this bug report. If I get another hang, I will capture buffers and open a new bug.
Closing.