If I run compiz for more than 5 minutes, Xorg will almost certainly crash. This only seems to be true for the first Xorg session after the machine has been started. I have not seen another crash after the initial one, despite running compiz. A rather unhelpful (to me) backtrace is reproduced below.

GDB session:

(gdb) bt
#0  0x00007f529a7948e3 in select () from /lib/libc.so.6
#1  0x0000000000478fb9 in WaitForSomething (pClientsReady=0x46d27b0) at WaitFor.c:230
#2  0x0000000000455048 in Dispatch () at dispatch.c:362
#3  0x00000000004268be in main (argc=8, argv=0x7fff39890158, envp=0x7fff398901a0) at main.c:283
(gdb) print MaxClients
$1 = 256
(gdb) print LastSelectMask
$2 = {fds_bits = {2251799813554858, 0 <repeats 15 times>}}
(gdb) print wt
$3 = (struct timeval *) 0x7fff3988ff40
(gdb) print *wt
$4 = {tv_sec = 0, tv_usec = 252709}
(gdb) list
225             XFD_COPYSET(&ClientsWriteBlocked, &clientsWritable);
226             i = Select (MaxClients, &LastSelectMask, &clientsWritable, NULL, wt);
227         }
228         else
229         {
230             i = Select (MaxClients, &LastSelectMask, NULL, NULL, wt);
231         }
232         selecterr = GetErrno();
233         WakeupHandler(i, (pointer)&LastSelectMask);
234         SmartScheduleStartTimer ();
(gdb) c
Continuing.

Program received signal SIGABRT, Aborted.
0x00007f529a6e9025 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007f529a6e9025 in raise () from /lib/libc.so.6
#1  0x00007f529a6eac33 in abort () from /lib/libc.so.6
#2  0x0000000000493e5c in ddxGiveUp () at xf86Init.c:1397
#3  0x0000000000493f5c in AbortDDX () at xf86Init.c:1442
#4  0x000000000046e095 in AbortServer () at log.c:404
#5  0x000000000046e5a3 in FatalError (f=0x616c80 "Caught signal %d (%s). Server aborting\n") at log.c:529
#6  0x000000000047fc06 in OsSigHandler (signo=3, sip=0x7fff3988f9f0, unused=0x7fff3988f8c0) at osinit.c:152
#7  <signal handler called>
#8  0x00007f529a7948e3 in select () from /lib/libc.so.6
#9  0x0000000000478fb9 in WaitForSomething (pClientsReady=0x46d27b0) at WaitFor.c:230
#10 0x0000000000455048 in Dispatch () at dispatch.c:362
#11 0x00000000004268be in main (argc=8, argv=0x7fff39890158, envp=0x7fff398901a0) at main.c:283
Preliminary systemtap results seem to indicate the signal is from keventd. Here is the siginfo:

Breakpoint 1, OsSigHandler (signo=3, sip=0x7fff9372d1b0, unused=0x7fff9372d080) at osinit.c:127
127         if (OsSigWrapper != NULL) {
(gdb) print sip
$1 = (siginfo_t *) 0x7fff9372d1b0
(gdb) print *sip
$2 = {si_signo = 3, si_errno = 0, si_code = 128,
  _sifields = {
    _pad = {0, 0, 4439628, 0, -1, 250, -173712783, 2, 2097155, 1, -1821191168, 32767, 76704400, 0, 95228656, 0, 76704400, 0, 3, 1, 0, 0, 76704400, 0, -1821191280, 32767, 4440141, 0},
    _kill = {si_pid = 0, si_uid = 0},
    _timer = {si_tid = 0, si_overrun = 0, si_sigval = {sival_int = 4439628, sival_ptr = 0x43be4c}},
    _rt = {si_pid = 0, si_uid = 0, si_sigval = {sival_int = 4439628, sival_ptr = 0x43be4c}},
    _sigchld = {si_pid = 0, si_uid = 0, si_status = 4439628, si_utime = 1078036791295, si_stime = 12711189105},
    _sigfault = {si_addr = 0x0},
    _sigpoll = {si_band = 0, si_fd = 4439628}}}
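For anyone reading the dump above: on Linux, si_code = 128 is SI_KERNEL (0x80), meaning the kernel itself raised the signal rather than a process calling kill(), which also fits si_pid = 0 and si_uid = 0. Below is a minimal sketch of how a handler installed with SA_SIGINFO could make that distinction. The function names sent_by_kernel and log_signal_origin are made up for illustration and are not part of the X server; the fallback #define is only there for libcs that do not expose SI_KERNEL.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* SI_KERNEL is Linux-specific; fall back to its Linux value if missing. */
#ifndef SI_KERNEL
#define SI_KERNEL 0x80
#endif

/* Hypothetical helper: returns 1 if the kernel itself generated the
 * signal (for example the tty layer), 0 otherwise. */
int sent_by_kernel(const siginfo_t *si)
{
    assert(si != NULL);
    return si->si_code == SI_KERNEL;
}

/* Hypothetical helper: print a one-line summary of where a signal came
 * from. si_pid and si_uid are only meaningful for kill()/sigqueue(). */
void log_signal_origin(const siginfo_t *si)
{
    assert(si != NULL);
    if (sent_by_kernel(si)) {
        fprintf(stderr, "signal %d: raised by the kernel (si_code=%d)\n",
                si->si_signo, si->si_code);
    } else if (si->si_code == SI_USER || si->si_code == SI_QUEUE) {
        fprintf(stderr, "signal %d: sent by pid %ld, uid %ld\n",
                si->si_signo, (long)si->si_pid, (long)si->si_uid);
    } else {
        fprintf(stderr, "signal %d: si_code=%d\n",
                si->si_signo, si->si_code);
    }
}
```

Such a helper would be called with the siginfo_t pointer that a three-argument (SA_SIGINFO) signal handler receives. Note that fprintf is not async-signal-safe, so this is only suitable for a debugging build.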
Here is the event as traced by systemtap:

[0249 ben@ben-laptop ~] $ sudo stap -vv sigquit.stap
[sudo] password for ben:
SystemTap translator/driver (version 0.9.8/0.141 non-git sources)
Copyright (C) 2005-2009 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
Session arch: x86_64 release: 2.6.31-rc2-ben
Created temporary directory "/tmp/stap7Gig8j"
Searched '/usr/share/systemtap/tapset/x86_64/*.stp', found 3
Searched '/usr/share/systemtap/tapset/*.stp', found 51
Pass 1: parsed user script and 54 library script(s) in 230usr/20sys/502real ms.
probe __send_signal@:-1 kernel reloc=.dynamic section=.text pc=0xffffffff8106b300
probe send_sigqueue@kernel/signal.c:1345 kernel reloc=.dynamic section=.text pc=0xffffffff8106b1a0
probe force_sig@kernel/signal.c:1259 kernel reloc=.dynamic section=.text pc=0xffffffff8106b6d0
probe send_sig@kernel/signal.c:1253 kernel reloc=.dynamic section=.text pc=0xffffffff8106c700
probe send_sig_info@kernel/signal.c:1231 kernel reloc=.dynamic section=.text pc=0xffffffff8106c670
probe force_sig_info@kernel/signal.c:986 kernel reloc=.dynamic section=.text pc=0xffffffff8106b5c0
WARNING: read-only local variable 'pid_name' (alternatives: sig_name sig_pid): identifier 'pid_name' at sigquit.stap:37:15
 source: sig_name, pid_name, sig_pid, execname(), uid())
                   ^
WARNING: read-only local variable 'sig_pid' (alternatives: sig_name pid_name): identifier 'sig_pid' at :37:25
 source: sig_name, pid_name, sig_pid, execname(), uid())
                             ^
Pass 2: analyzed script: 6 probe(s), 13 function(s), 18 embed(s), 0 global(s) in 1040usr/2120sys/206343real ms.
Pass 3: using cached /home/ben/.systemtap/cache/76/stapconf_766082f19d0d792182ae7d8592ffdb3e_480.h
Pass 3: using cached /home/ben/.systemtap/cache/a5/stap_a5993b4da2481ff59a7d3fabcfcdf00c_17760.c
Pass 4: using cached /home/ben/.systemtap/cache/a5/stap_a5993b4da2481ff59a7d3fabcfcdf00c_17760.ko
Pass 5: starting run.
Running /usr/bin/staprun -v /tmp/stap7Gig8j/stap_a5993b4da2481ff59a7d3fabcfcdf00c_17760.ko
send_signal: SIGQUIT was sent to X (pid:2787) by events/1 uid:0
 0xffffffff8106b301 : T.649+0x1/0x2c0 [kernel]
 0xffffffff8106b8f3 : __group_send_sig_info+0x13/0x20 [kernel]
 0xffffffff8106c254 : group_send_sig_info+0x54/0x90 [kernel]
 0xffffffff8106c428 : __kill_pgrp_info+0x48/0x80 [kernel]
 0xffffffff8106c4a0 : kill_pgrp+0x40/0x60 [kernel]
 0xffffffff812eab52 : n_tty_receive_buf+0x482/0x12e0 [kernel]
 0xffffffff812ee373 : flush_to_ldisc+0x103/0x1d0 [kernel]
 0xffffffff81070d0a : worker_thread+0x15a/0x280 [kernel]
 0xffffffff81075cbe : kthread+0x9e/0xb0 [kernel]
 0xffffffff8101312a : child_rip+0xa/0x20 [kernel]
 0xffffffff81075c20 : kthread+0x0/0xb0 [kernel] (inexact)
 0xffffffff81013120 : child_rip+0x0/0x20 [kernel] (inexact)
(In reply to Ben Gamari from comment #2)
> send_signal: SIGQUIT was sent to X (pid:2787) by events/1 uid:0
> 0xffffffff8106b301 : T.649+0x1/0x2c0 [kernel]
> 0xffffffff8106b8f3 : __group_send_sig_info+0x13/0x20 [kernel]
> 0xffffffff8106c254 : group_send_sig_info+0x54/0x90 [kernel]
> 0xffffffff8106c428 : __kill_pgrp_info+0x48/0x80 [kernel]
> 0xffffffff8106c4a0 : kill_pgrp+0x40/0x60 [kernel]
> 0xffffffff812eab52 : n_tty_receive_buf+0x482/0x12e0 [kernel]
> 0xffffffff812ee373 : flush_to_ldisc+0x103/0x1d0 [kernel]

This is the tty layer saying you hit ^\ (or whatever else you have mapped to SIGQUIT with stty). That's almost certainly not X's fault, though there have been cases where the input drivers didn't put things enough in raw mode and thus the kernel would still process those events. If you think that's what's happening to you, please open a new bug so we can track it there.