Summary: | xorg crashes after update to 1.8.0
---|---
Product: | xorg
Component: | Driver/intel
Status: | RESOLVED FIXED
Severity: | major
Priority: | medium
Version: | 7.5 (2009.10)
Hardware: | x86 (IA32)
OS: | Linux (All)
Reporter: | Fryderyk Dziarmagowski <fdziarmagowski>
Assignee: | Kristian Høgsberg <krh>
QA Contact: | Xorg Project Team <xorg-team>
CC: | chris, mrgrim, remi, rmcauley, tsdh, vcunat

Created attachment 34722: simple setup

Removing the Mesa DRI driver (i965_dri.so) restores stability to my system (the Xorg server no longer crashes). Even the hard lock I mentioned goes away (a deep one; even the NMI watchdog does not help).

The easiest way to reproduce the crash is "full screen preview" in gnome-screensaver (with a 3D screensaver).

I have had the same problem since updating to the xf86-video-intel-2.11.0 driver. After downgrading to 2.10.0, the system is stable again.

The bug has occurred for some Gentoo users. See the bugs

http://bugs.gentoo.org/show_bug.cgi?id=314935
http://bugs.gentoo.org/show_bug.cgi?id=310829

Please also attach the output of dmesg after the first crash. Thanks

(In reply to comment #4)
> I have had the same problem since updating to the
> xf86-video-intel-2.11.0 driver. After downgrading to 2.10.0, the system is
> stable again.
>
> The bug has occurred for some Gentoo users. See the bugs
>
> http://bugs.gentoo.org/show_bug.cgi?id=314935
> http://bugs.gentoo.org/show_bug.cgi?id=310829

The first one is a GPU lockup and seems unrelated to this bug (an xserver crash). The second one perfectly matches the "hard lockup" I mentioned before, but it should be a separate bug (I'm about to open a new one soon).

Reported the freeze: https://bugs.freedesktop.org/show_bug.cgi?id=27647

Downgrading to xf86-video-intel-2.10.x solves the problem.

Unfortunately, 1.8.0.902 still crashes with xf86-video-intel-2.11.0.

Some related changes are already present in Fedora 13 (mentioned in #27767#c8), but I'm not sure what they fixed (driver? xorg?).

Created attachment 35677: Check if !clientGone before writing swap event

With the patch from #10 the issue is still present.
This time I was able to catch two different traces:

First catch:

Thread 1 (Thread 0xb77969d0 (LWP 7632)):
#0 0xffffe424 in __kernel_vsyscall ()
#1 0x4feae225 in __libc_writev (fd=<value optimized out>, vector=<value optimized out>, count=<value optimized out>) at ../sysdeps/unix/sysv/linux/writev.c:51
#2 0x0809e216 in _XSERVTransSocketWritev (ciptr=0xa639ac0, buf=0xbfb9b108, size=1) at /usr/include/X11/Xtrans/Xtranssock.c:2153
#3 0x0809da7c in _XSERVTransWritev (ciptr=0xa639ac0, buf=0xbfb9b108, size=1) at /usr/include/X11/Xtrans/Xtrans.c:912
#4 0x080a5963 in FlushClient (who=0xa8e6870, oc=0xa783188, __extraBuf=0x0, extraCount=0) at io.c:898
#5 0x0809c885 in CloseDownConnection (client=0xa8e6870) at connection.c:1037
#6 0x08068057 in CloseDownClient (client=0xa8e6870) at dispatch.c:3602
#7 0x0806d585 in Dispatch () at dispatch.c:450
#8 0x080667e5 in main (argc=9, argv=0xbfb9b334, envp=0xbfb9b35c) at main.c:286

Second catch:

(gdb) thread apply all bt
Thread 1 (Thread 0xb77979d0 (LWP 23996)):
#0 WriteToClient (who=0xb6113b8, count=32, __buf=0xbfbd824c) at io.c:702
#1 0x0807c0ce in WriteEventsToClient (pClient=0xb6113b8, count=1, events=0xbfbd824c) at events.c:5774
#2 0xb779dc04 in DRI2SwapEvent (client=0xb6113b8, data=0xb613bb8, type=2, ust=1273994865087416, msc=218032, sbc=182) at dri2ext.c:372
#3 0xb779d136 in DRI2SwapComplete (client=0xb6113b8, pDraw=0xb613bb8, frame=218032, tv_sec=1273998860, tv_usec=909609, type=2, swap_complete=0xb779db6f <DRI2SwapEvent>, swap_data=0xb613bb8) at dri2.c:573
#4 0xb77153d7 in I830DRI2FrameEventHandler (frame=218032, tv_sec=1273998860, tv_usec=909609, event_data=0xb513530) at i830_dri.c:562
#5 0xb7710bfe in drmmode_vblank_handler (fd=8, frame=218032, tv_sec=1273998860, tv_usec=909609, event_data=0xb513530) at drmmode_display.c:1400
#6 0x4f3659d6 in drmHandleEvent (fd=<value optimized out>, evctx=<value optimized out>) at xf86drmMode.c:776
#7 0xb7710b62 in drm_wakeup_handler (data=0x98c1140, err=2, p=0x81e61a0) at drmmode_display.c:1425
#8 0x080798bc in WakeupHandler (result=2, pReadmask=0x81e61a0) at dixutils.c:403
#9 0x080a4d77 in WaitForSomething (pClientsReady=0xb1f0910) at WaitFor.c:232
#10 0x0806d42e in Dispatch () at dispatch.c:375
#11 0x080667e5 in main (argc=9, argv=0xbfbd8c64, envp=0xbfbd8c8c) at main.c:286

And this is the bitter end of the Xorg server (1.8.1):

(gdb) continue
Continuing.
Program received signal SIGABRT, Aborted.
0xffffe424 in __kernel_vsyscall ()
(gdb) bt
#0 0xffffe424 in __kernel_vsyscall ()
#1 0x4fe09e19 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2 0x4fe0b48c in *__GI_abort () at abort.c:92
#3 0x080a1e66 in OsAbort () at utils.c:1321
#4 0x080aa767 in ddxGiveUp () at xf86Init.c:1238
#5 0x080aa835 in AbortDDX () at xf86Init.c:1284
#6 0x0809b45e in AbortServer () at log.c:418
#7 0x0809ba7e in FatalError (f=0x81b15f4 "Caught signal %d (%s). Server aborting\n") at log.c:546
#8 0x0809b0b8 in OsSigHandler (signo=11, sip=0xbfbd7dcc, unused=0xbfbd7e4c) at osinit.c:156
#9 <signal handler called>
#10 WriteToClient (who=0xb6113b8, count=32, __buf=0xbfbd824c) at io.c:702
#11 0x0807c0ce in WriteEventsToClient (pClient=0xb6113b8, count=1, events=0xbfbd824c) at events.c:5774
#12 0xb779dc04 in DRI2SwapEvent (client=0xb6113b8, data=0xb613bb8, type=2, ust=1273994865087416, msc=218032, sbc=182) at dri2ext.c:372
#13 0xb779d136 in DRI2SwapComplete (client=0xb6113b8, pDraw=0xb613bb8, frame=218032, tv_sec=1273998860, tv_usec=909609, type=2, swap_complete=0xb779db6f <DRI2SwapEvent>, swap_data=0xb613bb8) at dri2.c:573
#14 0xb77153d7 in I830DRI2FrameEventHandler (frame=218032, tv_sec=1273998860, tv_usec=909609, event_data=0xb513530) at i830_dri.c:562
#15 0xb7710bfe in drmmode_vblank_handler (fd=8, frame=218032, tv_sec=1273998860, tv_usec=909609, event_data=0xb513530) at drmmode_display.c:1400
#16 0x4f3659d6 in drmHandleEvent (fd=<value optimized out>, evctx=<value optimized out>) at xf86drmMode.c:776
#17 0xb7710b62 in drm_wakeup_handler (data=0x98c1140, err=2, p=0x81e61a0) at drmmode_display.c:1425
#18 0x080798bc in WakeupHandler (result=2, pReadmask=0x81e61a0) at dixutils.c:403
#19 0x080a4d77 in WaitForSomething (pClientsReady=0xb1f0910) at WaitFor.c:232
#20 0x0806d42e in Dispatch () at dispatch.c:375
#21 0x080667e5 in main (argc=9, argv=0xbfbd8c64, envp=0xbfbd8c8c) at main.c:286

*** Bug 28391 has been marked as a duplicate of this bug. ***

Created attachment 36353: Check for null client->osPrivate in DRI2

Here's an almost certainly wrong patch that works for me. It's based on the observation that WriteToClient is SIGSEGVing when accessing osPrivate->output since who->osPrivate is NULL in my backtraces. I'm not yet sure where this bad ClientPtr comes from.

Bah. Should read the bug more thoroughly before wandering through code. The !clientGone patch fixes the crash here.

Tested-By: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>

Is the patch still required after the following commit? (I think on consistency grounds that either all dri2 functions check for clientGone or none do.)

commit 660f6ab5494a728c3ca7ba00c305e9ff06c8ecb2
Author: Simon Farnsworth <simon.farnsworth@onelan.com>
Date: Tue Jun 22 10:13:30 2010 +0100

Don't crash when asked if a client that has disconnected was local

ProcDRI2Dispatch uses LocalClient to determine if it's safe to respond to a client that has made DRI2 requests which aren't sensible for remote clients (anything but version). When the client has disappeared mid-request stream (e.g. as a result of a kill -9, or a client-side bug), LocalClient causes the X server to follow suit, as ((OsCommPtr)client->osPrivate)->trans_conn is NULL at this point.

The simple and obvious fix is to just return "not local" when trans_conn is NULL, which fixes the crash I was seeing; however Keith Packard pointed out that just checking trans_conn isn't enough; quoting Keith:

"This looks almost right to me -- I reviewed the os code to see when _XSERVTransClose is called (which is what frees the trans_conn data) and found that every place which called that immediately set trans_conn to NULL, except for the call in CloseDownFileDescriptor which is only called from CloseDownConnection and which is immediately followed by freeing the OsCommRec and setting client->osPrivate to NULL. So, I'd suggest checking client->osPrivate in addition to the above check."

It looks like that commit should obsolete the patches. I'll build 1.9RC4 and check.

The commit 660f6ab5494a728c3ca7ba00c305e9ff06c8ecb2 does fix this without the need for any further patch.
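
For readers following the thread, the two-level check described in the commit message (osPrivate first, then trans_conn) can be sketched roughly as below. This is not the code from commit 660f6ab: the struct types, local_client_sketch, transport_is_local, and local_client_self_check are hypothetical, cut-down stand-ins, and the real server inspects the transport to decide locality instead of treating every live connection as local.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the server's OsCommRec and ClientRec. */
typedef struct {
    void *trans_conn;          /* NULL once the transport has been closed */
} OsCommSketch;

typedef struct {
    OsCommSketch *osPrivate;   /* NULL once CloseDownConnection has freed it */
} ClientSketch;

/* Placeholder for the real transport query (an assumption for this sketch):
 * it treats every live connection as local. */
static int transport_is_local(const void *trans_conn)
{
    return trans_conn != NULL;
}

/* Returns 1 for a live local client, 0 for a remote or disconnected one. */
int local_client_sketch(const ClientSketch *client)
{
    if (client == NULL || client->osPrivate == NULL)
        return 0;                              /* OsCommRec already freed */
    if (client->osPrivate->trans_conn == NULL)
        return 0;                              /* transport already closed */
    return transport_is_local(client->osPrivate->trans_conn);
}

/* Self-check covering the three client states discussed in the thread. */
void local_client_self_check(void)
{
    int transport = 0;
    OsCommSketch live = { &transport };
    OsCommSketch closed = { NULL };
    ClientSketch live_client = { &live };
    ClientSketch closed_client = { &closed };
    ClientSketch freed_client = { NULL };

    assert(local_client_sketch(&live_client) == 1);
    assert(local_client_sketch(&closed_client) == 0);
    assert(local_client_sketch(&freed_client) == 0);
}
```

Checking osPrivate before trans_conn matters because, per Keith's review, the CloseDownConnection path frees the OsCommRec and sets osPrivate to NULL without first setting trans_conn to NULL.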
Created attachment 34721: Xorg.0.log

How to reproduce: switch between 3D screensavers in the gnome-screensaver preferences dialog.

Program received signal SIGSEGV, Segmentation fault.
WriteToClient (who=0xa63e6b8, count=32, __buf=0xbf944a1c) at io.c:702
702 ConnectionOutputPtr oco = oc->output;
(gdb) thread apply all bt

Thread 1 (Thread 0xb76ec9d0 (LWP 2861)):
#0 WriteToClient (who=0xa63e6b8, count=32, __buf=0xbf944a1c) at io.c:702
#1 0x0807bff8 in WriteEventsToClient (pClient=0xa63e6b8, count=1, events=0xbf944a1c) at events.c:5770
#2 0xb7687897 in DRI2SwapEvent (client=0xa63e6b8, data=0xa31eb48, type=2, ust=1270574518783976, msc=27450, sbc=4658) at dri2ext.c:369
#3 0xb7686d69 in DRI2SwapComplete (client=0xa63e6b8, pDraw=0xa31eb48, frame=27450, tv_sec=1270574518, tv_usec=783976, type=2, swap_complete=0xa63e6b8, swap_data=0xa31eb48) at dri2.c:546
#4 0xb7659247 in I830DRI2FrameEventHandler (frame=27450, tv_sec=1270574518, tv_usec=783976, event_data=0xa8b6820) at i830_dri.c:562
#5 0xb7654a5a in drmmode_vblank_handler (fd=8, frame=27450, tv_sec=1270574518, tv_usec=783976, event_data=0xa8b6820) at drmmode_display.c:1400
#6 0xb76769d6 in drmHandleEvent (fd=8, evctx=0x8c2a140) at xf86drmMode.c:776
#7 0xb76549be in drm_wakeup_handler (data=0x8c2a130, err=1, p=0x81e6120) at drmmode_display.c:1425
#8 0x0807980c in WakeupHandler (result=1, pReadmask=0x81e6120) at dixutils.c:403
#9 0x080a4c17 in WaitForSomething (pClientsReady=0xa2f36a0) at WaitFor.c:232
#10 0x0806d3de in Dispatch () at dispatch.c:375
#11 0x08066795 in main (argc=9, argv=0xbf945414, envp=0xbf94543c) at main.c:286

Setup:
Integrated Graphics Chipset: Intel(R) G45/G43
xserver 1.8.0
Mesa 7.8.0
intel driver 2.11.0
linux 2.6.33.2

There is nothing special in Xorg.0.log, nothing in the kernel log, and i915_error_state stays calm. I'm not sure it is related, but since the upgrade I have been getting hard lockups :( Hard days for Intel users... (well, see my other bugs for details ;)
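
The faulting line in this report (`oco = oc->output` at io.c:702) is reached from the vblank handler after the client has already been closed down. A guard of the kind the "!clientGone" and "null osPrivate" patches in this thread aim at might look roughly like the sketch below. It is an illustration only, not either attached patch: SwapOsComm, SwapClient, write_to_client_sketch, swap_event_sketch, and swap_event_self_check are hypothetical names, and the `queued` counter merely stands in for the real output buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OsCommRec; `queued` replaces the real output buffer. */
typedef struct {
    int queued;                /* bytes queued for delivery to the client */
} SwapOsComm;

/* Hypothetical stand-in for ClientRec with only the fields the guard needs. */
typedef struct {
    int clientGone;            /* nonzero while the client is being torn down */
    SwapOsComm *osPrivate;     /* NULL after the connection has been closed */
} SwapClient;

/* Mimics the unguarded access: dereferences osPrivate without checking it. */
static int write_to_client_sketch(SwapClient *who, int count)
{
    SwapOsComm *oc = who->osPrivate;
    oc->queued += count;       /* would fault here if oc were NULL */
    return count;
}

/* Guarded swap-event delivery: returns bytes queued, or 0 if the client is gone. */
int swap_event_sketch(SwapClient *client, int count)
{
    if (client == NULL || client->clientGone || client->osPrivate == NULL)
        return 0;              /* nobody left to notify; drop the event */
    return write_to_client_sketch(client, count);
}

/* Self-check: a live client gets the event, dead clients are skipped. */
void swap_event_self_check(void)
{
    SwapOsComm comm = { 0 };
    SwapClient live = { 0, &comm };
    SwapClient closing = { 1, &comm };
    SwapClient closed = { 0, NULL };

    assert(swap_event_sketch(&live, 32) == 32);
    assert(comm.queued == 32);
    assert(swap_event_sketch(&closing, 32) == 0);
    assert(swap_event_sketch(&closed, 32) == 0);
    assert(comm.queued == 32);  /* skipped events queued nothing */
}
```

Dropping the event is safe in this sketch because a swap-complete notification has no recipient once the client has disconnected.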