Bug 16167 - hitting a key crashes X server
Summary: hitting a key crashes X server
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/Input/Core
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Peter Hutterer
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-30 08:38 UTC by Johannes Engel
Modified: 2008-10-09 23:44 UTC

Attachments
Console log of Xorg & sleep 10s ; xkbcomp :0 (2.94 KB, text/plain)
2008-05-30 08:38 UTC, Johannes Engel
Xorg.0.log (33.16 KB, text/plain)
2008-05-30 08:39 UTC, Johannes Engel
0001-xkb-delete-default-rules-when-devices-are-closed-1.patch (2.34 KB, patch)
2008-05-31 23:00 UTC, Peter Hutterer
0001-xkb-reset-xkb_cached_map-on-CloseDownDevices.patch (800 bytes, patch)
2008-06-03 18:54 UTC, Peter Hutterer
Xorg.0.log of the crash using kdm (114.70 KB, text/plain)
2008-06-04 02:21 UTC, Johannes Engel
dump just before the change (56.91 KB, text/plain)
2008-07-18 02:57 UTC, Johannes Engel
dump just after the change (48.48 KB, text/plain)
2008-07-18 02:58 UTC, Johannes Engel
log after change (191 bytes, application/octet-stream)
2008-07-18 03:00 UTC, Johannes Engel
log after change (191 bytes, text/plain)
2008-07-18 03:01 UTC, Johannes Engel
Crash log from gdb (6.49 KB, text/plain)
2008-07-31 09:31 UTC, Johannes Engel
Crash log incl. backtrace (13.28 KB, text/plain)
2008-07-31 09:46 UTC, Johannes Engel
0001-Xi-don-t-memcpy-the-KeyClassRec-from-SD-to-MD.patch (1.06 KB, patch)
2008-10-08 23:10 UTC, Peter Hutterer

Description Johannes Engel 2008-05-30 08:38:42 UTC
Created attachment 16831
Console log of Xorg & sleep 10s ; xkbcomp :0

Starting the X server works as long as I do not press any key during the startup process. Once a window manager (KDE, FVWM ...) is started, the keyboard works as expected. But pressing a key during the startup process (while the screen is gray with that little X for the mouse cursor) crashes the whole server. The X stays in the middle of the screen, the rest of the screen turns black, and the server no longer reacts to keystrokes, except Magic SysRq keys.

I will attach the console output (xkbX_nocompose.log) and the X server's log (Xorg.0.log_nocompose) of the following command sequence:
Xorg & sleep 10s; xkbcomp :0

I tried that because the problem does not happen if I remove the xkb rules file configured in xorg.conf.

Running xkbcomp :0 after, for example, KDE has started, I get

Warning:          Could not load keyboard geometry for :0
                  BadAlloc (insufficient resources for operation)
                  Resulting keymap file will not describe geometry


I also did not experience this behaviour before the MPX merge, though I do not know whether it is related to that at all.

My hardware is a Dell Latitude D820
intel 945GM

Software:
Modular X.org from GIT
xkeyboard-config 1.3
Comment 1 Johannes Engel 2008-05-30 08:39:10 UTC
Created attachment 16832
Xorg.0.log
Comment 2 Peter Hutterer 2008-05-30 22:05:05 UTC
> Starting the X-server works as long as I do not press any key during the
> startup process. Once a window manager (KDE, FVWM ...) is started the keyboard
> works as expected. But pressing the key during startup process (while screen is
> gray with that little X for the mouse cursor) crashes the whole server. The X
> stays in the middle of the screen, the rest of it turns black and it does not
> react anymore for keystrokes, except Magic SysRq keys.

> I tried that because the problem does not happen, if I remove the xkb rules
> file configured in xorg.conf.

Verified, although in my case it is the XkbVariant nodeadkeys that causes the
SIGABRT (looks like a double free at first glance).
Here it crashes in the second server generation, i.e. if you start xterm, kill
it, wait for the server to come back and then hit a key it segfaults.
In some startup configurations, similar things happen, hence why you don't see
the problem if you're running a WM.

> I also did not experience this behaviour before MPX merge, though I do not know
> if it is related to that at all.

if we're talking about the same bug, then it is definitely caused by MPX.
Comment 3 Peter Hutterer 2008-05-31 00:23:24 UTC
> Verified, although in my case it is the XkbVariant nodeadkeys that causes the
> SIGABRT (looks like a double free on first glance).

I can't for the life of me figure out why the SIGABRT happens.

On the first press in the second server generation, XkbResizeKeySyms
eventually frees xkb->map->syms (of the VCK). The pointer is valid (same as
returned during XkbAllocClientMap). It is not a double-free, checked that by
printing all addresses passed into xfree. Yet when xfree is called after
swapping the pointer, libc SIGABRTs.

If the first free is avoided (gdb set xkb->map->syms = NULL before the call),
everything works nice and dandy.

daniel, any suggestions?
Comment 4 Peter Hutterer 2008-05-31 23:00:34 UTC
Created attachment 16849
0001-xkb-delete-default-rules-when-devices-are-closed-1.patch

Please try this patch.

We only have one set of default rules options in xkb. When the second keyboard
is brought up with Xkb options specified, these new options overwrite the old.
In future server generations, the rules used for the VCK are a mixture of the
default ones and ones previously specified for other keyboards. Simply
resetting the xkb default rules to NULL avoids this issue.

Reproducible by setting XkbLayout "de" and XkbVariant "nodeadkeys". In the
second server generation, the VCK has "us(nodeadkeys)". This again produces a
SIGABRT when the first key is hit.

I could not figure out why the SIGABRT happens. This patch is avoiding the
issue rather than fixing it.
Comment 5 Johannes Engel 2008-06-01 06:16:34 UTC
I'm sorry to inform you that this patch does not avoid the crashes for me. :(
Comment 6 Peter Hutterer 2008-06-01 17:51:05 UTC
> I'm sorry to inform you that this patch does not avoid the crashes for me. :(

Ok, I need your help then. Can you git-bisect to the commit that broke it?
Comment 7 Johannes Engel 2008-06-02 08:49:20 UTC
No, sorry, I cannot, since I cannot even compile the xserver in the states corresponding to the mpx branch before the master merge. I tried several states (30?) and always got this error:

getevents.c: In function 'GetPointerEvents':
getevents.c:637: error: 'rawDeviceEvent' undeclared (first use in this function)
getevents.c:637: error: (Each undeclared identifier is reported only once
getevents.c:637: error: for each function it appears in.)
getevents.c:637: error: 'ev' undeclared (first use in this function)
getevents.c:705: error: expected expression before ')' token
getevents.c:707: error: 'XI_RawDeviceEvent' undeclared (first use in this function)
make[2]: *** [getevents.lo] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [all] Error 2
make: *** [all-recursive] Error 1

Which component do I need to rebuild for this error to vanish?
I would prefer not to rebuild the whole modular X with all libs for each bisecting step, since my laptop is no 128-core super server. ;)
Comment 8 Peter Hutterer 2008-06-02 16:28:47 UTC
On Mon, Jun 02, 2008 at 08:49:22AM -0700, bugzilla-daemon@freedesktop.org wrote:
> --- Comment #7 from Johannes Engel <jcnengel@googlemail.com>  2008-06-02 08:49:20 PST ---
> No, sorry, I cannot, since I cannot even compile the xserver in the states
> corresponding to the mpx branch before master merge. I tried several stats
> (30?) and always got this error:
> 
> getevents.c: In function 'GetPointerEvents':
> getevents.c:637: error: 'rawDeviceEvent' undeclared (first use in this
> function)

rawDeviceEvent was in inputproto once, but has since been removed again. you
need to revert inputproto back to a previous state as well.
 
> Which component do I need to rebuild for this error to vanish?
> I would prefer not to rebuild the whole modular X with all libs for each
> bisecting step, since my laptop is no 128-core super server. ;)

all you should ever need to rebuild is inputproto, xserver and the drivers.
don't worry about the libs.

inputproto is the hardest part, since sometimes the xserver relies on a
certain version of inputproto. in this case it's easiest to look at the
date of the current log of the xserver, and then revert inputproto back to
something that was valid on that date.
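
The date matching described above could be scripted roughly as follows. This is only a sketch: the function name proto_commit_for and the repository paths are hypothetical, and it assumes that both trees are git checkouts and that committer dates are a good enough proxy for "valid on that date".

```shell
#!/bin/sh
# Sketch: for a given xserver revision, find the newest inputproto commit
# that is not younger than it. All paths are assumptions; pass your own.

# proto_commit_for XSERVER_DIR XSERVER_REV PROTO_DIR [PROTO_BRANCH]
# Prints the hash of the newest commit reachable from PROTO_BRANCH whose
# committer date is at or before the committer date of XSERVER_REV.
proto_commit_for() {
    xserver_dir=$1
    xserver_rev=$2
    proto_dir=$3
    proto_branch=${4:-HEAD}

    # Committer date of the xserver revision, in seconds since the epoch.
    stamp=$(git -C "$xserver_dir" log -1 --format=%ct "$xserver_rev") || return 1

    # Newest inputproto commit made at or before that moment.
    git -C "$proto_dir" rev-list -1 --before="$stamp" "$proto_branch"
}
```

During a bisect this might be used from the xserver tree as `git -C ../inputproto checkout "$(proto_commit_for . HEAD ../inputproto master)"`, followed by rebuilding inputproto, the xserver, and the drivers. If the xserver at that point needs a proto change that was committed slightly later, the result still has to be adjusted by hand.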
Comment 9 Johannes Engel 2008-06-03 15:29:44 UTC
OK, I will try again. Today I got new errors during the build from other include files, which are totally OK with current master.
Maybe I will find the time for further investigation tomorrow.
Comment 10 Peter Hutterer 2008-06-03 18:54:12 UTC
Created attachment 16903
0001-xkb-reset-xkb_cached_map-on-CloseDownDevices.patch

I managed to get a different crash on a FreeBSD machine, fixed with this patch. The patch is in addition to the previous one. Can you give this one a try too please? Thanks.
Comment 11 Johannes Engel 2008-06-04 01:12:38 UTC
Thank you, Peter, that solves one part of the problem, which in fact seems to consist of two parts, as I now see.
But this one is solved for me. :)
Comment 12 Peter Hutterer 2008-06-04 01:32:47 UTC
On Wed, Jun 04, 2008 at 01:12:39AM -0700, bugzilla-daemon@freedesktop.org wrote:
> Thank you, Peter, that solves a part of the problem which in fact seems to
> consist of two as I see now.
> But this one is solved for me. :)

what's the remaining problem?
Comment 13 Johannes Engel 2008-06-04 01:34:29 UTC
The remaining problem is that I still cannot use a graphical login manager since after logging in it still crashes. But that might have a different reason. I will have to investigate further.
Comment 14 Peter Hutterer 2008-06-04 01:39:35 UTC
On Wed, Jun 04, 2008 at 01:34:30AM -0700, bugzilla-daemon@freedesktop.org wrote:
> The remaining problem is that I still cannot use a graphical login manager
> since after logging in it still crashes. But that might have a different
> reason. I will have to investigate further.

can you provide a log file or backtrace of such a crash please? thanks.
Comment 15 Johannes Engel 2008-06-04 02:21:55 UTC
Created attachment 16909
Xorg.0.log of the crash using kdm

I can only provide a log, since I do not have a second computer to attach gdb.
Comment 16 Peter Hutterer 2008-06-04 05:41:43 UTC
On Wed, Jun 04, 2008 at 02:21:56AM -0700, bugzilla-daemon@freedesktop.org wrote:
> I can only provide a log, since I do not have a second computer to attach gdb.

usually a crash has some backtrace included, I'm missing that here. Can you
check the Xorg.0.log.old as well please and see if there's something in there?
Comment 17 Johannes Engel 2008-06-04 05:53:02 UTC
Actually that is Xorg.0.log.old, since kdm immediately restarts.
What about these xkbcomp errors?
Comment 18 Johannes Engel 2008-06-04 07:00:50 UTC
Actually, this bug does not seem to be gone. But I now need a few more keystrokes to crash it. Nothing new in the logs. :(
Comment 19 Peter Hutterer 2008-06-05 05:13:00 UTC
On Wed, Jun 04, 2008 at 07:00:50AM -0700, bugzilla-daemon@freedesktop.org wrote:
> Actually this bug seems not to be gone. But I need a few more keystrokes now to
> crash it. Nothing new in the logs. :(

Any special type of keystroke? Autorepeat, modifiers, etc? or just anything,
even at a rate of say 1 per second.

Directly after the login? What happens if you wait for a while and everything
is loaded?
Comment 20 Johannes Engel 2008-06-10 06:16:29 UTC
whenever i wait until everything has settled, it is fine.

i narrowed down my problem with the session manager a little:
when i use kdm to log in, every session fails except the so-called failsafe session, which only starts xterm. from that xterm i can then start kde. doing so, everything is ok apart from the missing capital letters (as you can see). ;) but the shift key works for special characters. also, altgr does not work on the "q" key to produce an "at" sign (german layout), but it works together with non-letter keys.

i tried "xkbcomp :0" which says
Warning:          Could not load keyboard geometry for :0
                  BadAlloc (insufficient resources for operation)
                  Resulting keymap file will not describe geometry
X Error of failed request:  BadAtom (invalid Atom parameter)
  Major opcode of failed request:  17 (X_GetAtomName)
  Atom id in failed request:  0xff
  Serial number of failed request:  17
  Current serial number in output stream:  17

everything is fine if i do not use a session manager but startx directly from the console.
Comment 21 Johannes Engel 2008-06-10 06:26:59 UTC
Let me add that in the situation mentioned before (failsafe session) a simple "setxkbmap" without arguments seems to break the X server. Strange...
What the heck does the login manager do there?
Comment 22 Peter Hutterer 2008-06-10 17:32:02 UTC
> --- Comment #21 from Johannes Engel <jcnengel@googlemail.com>  2008-06-10 06:26:59 PST ---
> Let me add that in the situation mentioned before (failsave session) a simple
> "setxkbmap" without arguments seems to break the Xserver. Strange...
> What the heck does the login manager do there?

sounds like the server may be dereferencing uninitialised memory to me.
Comment 23 Peter Hutterer 2008-06-16 06:38:00 UTC
just for future reference:
- "xkb: delete default rules when devices are closed." was pushed as
  5a3d06b8f42473cea3741dc722a775deaa2b73f6
- "xkb: reset xkb_cached_map on CloseDownDevices." was pushed as
  ff3adf3e564d94fea18e48f966de40a7ded1279e
Comment 24 Peter Hutterer 2008-06-22 05:33:36 UTC
Sorry, I just can't reproduce this bug here at all.

> --- Comment #20 from Johannes Engel <jcnengel@googlemail.com>  2008-06-10 06:16:29 PST ---
> whenever i wait until everything has settled, it is fine.

> everything is fine if i do not use a session manager but startx directly from
> the console.

can you set up a simple loop that calls xkbcomp :0 every 5 seconds or so and
then diff the output to see what changes during the start of kde.

kde starts a number of services, is there a way to start them one-by-one to
see which one causes the crash?

You mentioned that it's fine if you wait, this indicates that hitting a key
changes the keyboard layout (it should, the core keyboard is initialised to a
standard layout, hitting the first key then transfers the layout from the
other keyboard to the core keyboard). now, somewhere we seem to screw up but I
just can't figure out where and having no backtrace doesn't make it easier.
Comment 25 Johannes Engel 2008-06-23 02:20:57 UTC
In the meantime I upgraded to openSUSE 11 with my Xorg from git which does not make any difference so far.

I created the following script running in the background as root while starting kdm (as root, too):

#!/bin/bash
counter=0
mkdir -p /tmp/xkbtracker
while [ $counter -ge 0 ] ; do
        xkbcomp :0 2>/tmp/xkbtracker/log$counter
        counter=$((counter+1))
        sleep 2s
done

The result looks like this:

Invalid MIT-MAGIC-COOKIE-1 keyError:            Cannot open display ":0"
                  Exiting
Comment 26 Peter Hutterer 2008-07-16 05:10:01 UTC
> --- Comment #25 from Johannes Engel <jcnengel@googlemail.com>  2008-06-23 02:20:57 PST ---
> I created the following script running in the background as root while starting
> kdm (as root, too):

[...]
 
> Invalid MIT-MAGIC-COOKIE-1 keyError:            Cannot open display ":0"
>                   Exiting

IIRC that's an issue if X is running for a different user (in your case the
logged in user), then the authentication is missing. Running the script as
your user should help.
Comment 27 Johannes Engel 2008-07-17 02:46:17 UTC
The first 24 log files are empty, and from the 25th they contain only the following:

Warning:          Could not load keyboard geometry for :0
                  BadAlloc (insufficient resources for operation)
                  Resulting keymap file will not describe geometry
Comment 28 Peter Hutterer 2008-07-17 17:49:08 UTC
wow, my fault, should have picked up on that earlier.
xkbcomp dumps to server-0.xkb. 

xkbcomp :0 - > file
the - tells it to dump to stdout.
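
With that correction, the tracking loop from comment 25 could look roughly like the sketch below. The function names track_keymap and first_change and the out.N / log.N file naming are made up for illustration; the only xkbcomp behaviour relied on is the "-" output argument described above. As noted in comment 26, it should be run as the user who owns the X session.

```shell
#!/bin/sh
# Sketch of a keymap tracker; function names and paths are illustrative.

# track_keymap DISPLAY OUTDIR COUNT [INTERVAL]
# Writes COUNT keymap dumps to OUTDIR/out.N and xkbcomp's stderr to
# OUTDIR/log.N, pausing INTERVAL seconds (default 2) between dumps.
track_keymap() {
    display=$1
    outdir=$2
    count=$3
    interval=${4:-2}
    mkdir -p "$outdir" || return 1
    n=0
    while [ "$n" -lt "$count" ]; do
        # "-" as the output file sends the dump to stdout
        # instead of writing server-0.xkb.
        xkbcomp "$display" - >"$outdir/out.$n" 2>"$outdir/log.$n"
        n=$((n + 1))
        sleep "$interval"
    done
}

# first_change OUTDIR
# Prints the index of the first dump that differs from its predecessor.
# Returns non-zero if no two consecutive dumps differ.
first_change() {
    outdir=$1
    n=1
    while [ -f "$outdir/out.$n" ]; do
        if ! cmp -s "$outdir/out.$((n - 1))" "$outdir/out.$n"; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}
```

For example, `track_keymap :0 /tmp/xkbtracker 60` collects two minutes of dumps while KDE starts, and `n=$(first_change /tmp/xkbtracker)` then names the dump to compare against its predecessor with `diff -u /tmp/xkbtracker/out.$((n - 1)) /tmp/xkbtracker/out.$n`.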
Comment 29 Johannes Engel 2008-07-18 02:53:27 UTC
OK, that helps. :) I will attach two types of files: the dumps (called out) of xkbcomp and the logs (called log) of errors.
About a minute after starting X there is a change in both which seems to be connected with the type of my keyboard (my PC is a Dell Latitude D820, so I configured the keyboard to be of type "latitude").
I hope the files document that change well.
Comment 30 Johannes Engel 2008-07-18 02:57:27 UTC
Created attachment 17740
dump just before the change
Comment 31 Johannes Engel 2008-07-18 02:58:11 UTC
Created attachment 17741
dump just after the change
Comment 32 Johannes Engel 2008-07-18 03:00:20 UTC
Created attachment 17743
log after change

Please note that there was no error message before the change, so the prior logs are empty.
Comment 33 Johannes Engel 2008-07-18 03:01:14 UTC
Created attachment 17744
log after change

Please note that there was no error message before the change, so the prior logs are empty.
Comment 34 Johannes Engel 2008-07-31 09:31:44 UTC
Created attachment 18042
Crash log from gdb

I managed to obtain a crash log. Does it help?
Comment 35 Johannes Engel 2008-07-31 09:46:50 UTC
Created attachment 18044
Crash log incl. backtrace

Now with full backtrace
Comment 36 Jasmin Buchert 2008-09-13 16:30:53 UTC
I have the same problem when running Xorg HEAD with XkbLayout "de" and XkbVariant "nodeadkeys".
X starts, but as soon as I press a key, it crashes. I had to revert to 1.5.

It "works" when setting XkbVariant to "deadkeys". But then I cannot write characters like @ or ~. Setting XkbLayout to "us" also seems to work.

I hope this gets fixed before release.
Comment 37 Peter Hutterer 2008-10-08 23:10:19 UTC
Created attachment 19512
0001-Xi-don-t-memcpy-the-KeyClassRec-from-SD-to-MD.patch

Let's try again. Turns out a memcpy would change mapWidth, which then didn't trigger a realloc, which then led to a stray memcpy overwriting bits it shouldn't.

Not sure about the implications of this yet, need more testing. But it should stop the crash nonetheless.
Comment 38 Johannes Engel 2008-10-09 03:26:02 UTC
(In reply to comment #37)
> Created an attachment (id=19512) [details]
> 0001-Xi-don-t-memcpy-the-KeyClassRec-from-SD-to-MD.patch
> 
> Let's try again. Turns out a memcopy would change mapWidth, which then didn't
> trigger a realloc, which then led to a stray memcpy overwriting bits it
> shouldn't.
> 
> Not sure about the implications of this yet, need more testing. But it should
> stop the crash nonetheless.

Thank you, Peter. Indeed the last patch stops the crash.
Is there anything I can do about testing the further implications of this?
Comment 39 Peter Hutterer 2008-10-09 03:54:44 UTC
On Thu, Oct 09, 2008 at 03:26:08AM -0700, bugzilla-daemon@freedesktop.org wrote:
> Thank you, Peter. Indeed the last patch stops the crash.
> Is there anything I can do about testing the further implications of this?

just plug in two keyboards and use both, see if everything works fine, or at
least appears to work fine. I think the only thing to worry about is the
down[] array, but I can look into that tomorrow. Good that we finally found
the bug!
Comment 40 Julien Cristau 2008-10-09 04:57:53 UTC
> Johannes Engel <jcnengel@googlemail.com> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|REOPENED                    |RESOLVED
>          Resolution|                            |WORKSFORME
> 
Please don't close bugs before the patch is actually applied...
Comment 41 Peter Hutterer 2008-10-09 23:44:46 UTC
Pushed as 4808bdec45775342eb9a6352b41e4919e1a69279.

