Created attachment 127578
On Debian Stretch, X crashes randomly after several minutes.
Here are the lines I get:
[957598.965] (EE) Backtrace:
[957598.966] (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4a) [0x557cd4d7601a]
[957598.966] (EE) 1: /usr/lib/xorg/Xorg (0x557cd4bbc000+0x1be389) [0x557cd4d7a389]
[957598.966] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6 (0x7f55d7ad0000+0x33040) [0x7f55d7b03040]
[957598.966] (EE) 3: /usr/lib/x86_64-linux-gnu/libinput.so.10 (0x7f55d0e36000+0xf214) [0x7f55d0e45214]
[957598.966] (EE) 4: /usr/lib/xorg/Xorg (XkbDDXKeybdCtrlProc+0x47) [0x557cd4d15127]
[957598.966] (EE) 5: /usr/lib/xorg/Xorg (XkbComputeControlsNotify+0x1a7) [0x557cd4d26387]
[957598.966] (EE) 6: /usr/lib/xorg/Xorg (0x557cd4bbc000+0x16b8c5) [0x557cd4d278c5]
[957598.966] (EE) 7: /usr/lib/xorg/Xorg (0x557cd4bbc000+0x1b6ae0) [0x557cd4d72ae0]
[957598.966] (EE) 8: /usr/lib/xorg/Xorg (WaitForSomething+0xd14) [0x557cd4d738a4]
[957598.967] (EE) 9: /usr/lib/xorg/Xorg (0x557cd4bbc000+0x53e9e) [0x557cd4c0fe9e]
[957598.967] (EE) 10: /usr/lib/xorg/Xorg (0x557cd4bbc000+0x58073) [0x557cd4c14073]
[957598.967] (EE) 11: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf1) [0x7f55d7af02b1]
[957598.967] (EE) 12: /usr/lib/xorg/Xorg (_start+0x2a) [0x557cd4bfdfea]
[957598.967] (EE) Segmentation fault at address 0xd0
Fatal server error:
[957598.967] (EE) Caught signal 11 (Segmentation fault). Server aborting
Please consult the The X.Org Foundation support.
         at http://wiki.x.org
[957598.967] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[957598.967] (EE) Server terminated with error (1). Closing log file.
Samuel Thibault wrote a patch, which I'm testing. If I have no more crashes in the next 15 days, I'll report back that it should be applied to fix the problem.
Do you get anything else in the log before this? It looks like this should only happen if the device disappears, and that should show up in the log. Can you attach the full log please?
Created attachment 127828
Full Xorg log
Sorry for the delay. I didn't have what you requested at hand, so I needed to collect it. Here is a full Xorg log.
FYI: no more crashes since I applied the attached patch.
Can you install the debuginfo packages please and figure out where exactly this is coming from? I need to know what's triggering this crash so I can reproduce it here.
Specifically, the two lines between WaitForSomething and XkbComputeControlsNotify are necessary. Thanks
The full log is here:
Tell me if it's helpful, otherwise I need to know what I should install and what files I should examine.
Created attachment 128337
Xorg.log from comment #5
Please don't link to pastebins, they have a habit of expiring before we get to read them.
Hmm, still missing the important bits. Please follow the instructions here to figure out the missing two lines in the backtrace (the two below XkbComputeControlsNotify)
Sorry, but how should I use this tool in my case? Once it's installed, do I run it after the crash? Or do I run eu-addr2line /usr/bin/Xorg until the crash? I can't work out how to apply this to my setup:
eu-addr2line -e /opt/xorg/bin/Xorg 0x48d1a4
Does it work on Xorg.log or on a newly started session?
You run it against /usr/bin/Xorg, but the binary has to be the same one that produced the crash. If you have updated since, you need to reproduce the crash to get a fresh log. And then you just run
eu-addr2line -e /usr/bin/Xorg 0x555dc55e08c5
eu-addr2line -e /usr/bin/Xorg 0x555dc562bae0
That should give us the two locations of the callstack that matter. Hopefully :)
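For anyone repeating this step, the lookup can be scripted. This is only a minimal sketch, not the procedure referred to above: `unresolved_offsets` and `resolve_frames` are hypothetical helper names, and the sketch assumes that frames the server could not symbolize are logged as `(base+offset) [absolute]`, as in the backtrace at the top of this bug. If Xorg is built as a position-independent executable, which load addresses like 0x557c... hint at but do not prove, the offset is probably the value eu-addr2line needs; for a non-PIE binary it would be the absolute address in brackets.

```shell
#!/bin/sh
# Hypothetical helper: print "offset absolute" for every Xorg frame that was
# logged without a symbol name, i.e. as (0xBASE+0xOFFSET) [0xABSOLUTE].
# Reads the log files given as arguments, or stdin if there are none.
unresolved_offsets() {
    sed -n 's|.*/Xorg (0x[0-9a-f]*+\(0x[0-9a-f]*\)) \[\(0x[0-9a-f]*\)].*|\1 \2|p' "$@"
}

# Hypothetical helper: resolve each offset against the binary that crashed.
# $1 = binary, remaining arguments = log files.
resolve_frames() {
    binary=$1
    shift
    unresolved_offsets "$@" | while read -r offset absolute; do
        printf '%s (logged as %s): ' "$offset" "$absolute"
        eu-addr2line -e "$binary" "$offset"
    done
}
```

It could be called as `resolve_frames /usr/lib/xorg/Xorg /var/log/Xorg.0.log`, with the debug symbols installed and the binary unchanged since the crash.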
Oh, and now I see that this happens during the VT switch. Can you try version 0.22 please? It has some extra protections against device removal crashes, which may be the issue here.
OK, I'll try this week. Is there a way to do this from the LightDM session manager, i.e. to run X in debug mode from the LightDM login?
note that to run eu-addr2line you only need the address and the same binary around, i.e. you can still run this now if you haven't updated since, it should only take a minute.
as for lightdm debug - sorry, no idea but there is no real "debug" mode in X anyway, the only thing you can do is to increase the verbosity in the log file. Which probably wouldn't help much in this case, the backtrace addresses will be more useful.
After a bit of googling, the bug below suggests there's a way to pass commandline flags though, try -logverbose 7 or 10 to see more log messages.
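As a rough sketch of what passing the flag could look like: LightDM reads an `xserver-command` key from its seat configuration, and `-logverbose` is a standard Xorg option. The helper name `write_logverbose_dropin`, the drop-in file name and the `[Seat:*]` section name (older LightDM releases use `[SeatDefaults]`) are assumptions, not taken from the bug referred to above.

```shell
#!/bin/sh
# Hypothetical helper: write a LightDM drop-in that starts X with a higher
# log verbosity. $1 = target directory, $2 = verbosity level (default 7).
write_logverbose_dropin() {
    dir=$1
    level=${2:-7}
    mkdir -p "$dir" || return 1
    # Assumed section name; older LightDM releases use [SeatDefaults].
    printf '[Seat:*]\nxserver-command=X -logverbose %s\n' "$level" \
        > "$dir/90-xorg-logverbose.conf"
}
```

Run as root, the target directory would normally be `/etc/lightdm/lightdm.conf.d`, followed by a restart of LightDM. The extra messages then end up in the usual Xorg log.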
Created attachment 128462
I forgot to write: also, the related users have the habit of switching
to a text VT and work there with brltty while Xorg is still running in
the other VT.
Grmbl, it seems that bugzilla ate my comment:
Hello,
Here is a backtrace obtained in a crash very similar to the situation here, which is what prompted me to write the patch that Jean-Philippe attached to this bug. The proposed patch also did fix the following bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=838703
The relation between all these crashes is that the users have brltty running, which runs a uinput device.
Jean-Philippe, could you try Debian's xserver-xorg-input-libinput 0.23.0-1, as suggested by Emilio in the Debian bug entry?
Samuel
(In reply to Samuel Thibault from comment #12)
> Created attachment 128462
Looks like your backtrace attachment is actually the comment that disappeared. Can you attach the proper backtrace?
Created attachment 128466
Anyway, here is the backtrace again
The backtrace suggests that the device is already NULL by the time the LED update is called. If so, that's another strong hint that the fix from 0.23 addresses this issue. I really need someone to test that version please.
In http://bugs.debian.org/838703 , Sebastian Humenda reports that 0.23 doesn't fix the issue, and he has to apply the proposed patch to avoid the crashes.
Created attachment 128529
FYI, I still have the core file, and I have rebuilt a chroot containing the proper versions, which I have uploaded to:
after chrooting into it, run
gdb /usr/lib/xorg/Xorg core
and the source lines are all available and correct.
I'm also attaching to this bug some debugging prints from there.
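For readers who want the backtrace from the gdb step above without an interactive session, a non-interactive variant might look like this. It is a sketch only: `dump_backtrace` is a hypothetical wrapper, while `-batch` and `-ex` are standard gdb options.

```shell
#!/bin/sh
# Hypothetical wrapper: print a full backtrace from a core file without
# opening an interactive gdb session. $1 = binary, $2 = core file.
dump_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "dump_backtrace: cannot read binary or core file" >&2
        return 1
    fi
    # 'bt full' also prints the local variables of each frame.
    gdb -batch -ex 'bt full' "$binary" "$core"
}
```

Inside the chroot this would be `dump_backtrace /usr/lib/xorg/Xorg core`. The output is plain text, so it can be attached to the bug directly.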
I can still confirm the issue with the latest version, 0.23.0. When I apply the patch, the problem is gone.
Author: Peter Hutterer <email@example.com>
Date: Tue Dec 20 15:36:55 2016 +1000
Ignore LED updates for disabled devices