Bug 18111 - Fatal server error EnableDevice on Xorg git startup
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Input/Keyboard
Version: git
Hardware: PowerPC Linux (All)
Importance: high major
Assignee: Peter Hutterer
QA Contact: Xorg Project Team
Reported: 2008-10-17 12:05 UTC by Steve Winiecki
Modified: 2008-10-23 08:13 UTC

Attachments
Xorg log for Fatal server error EnableDevice on Xorg git startup (34.58 KB, text/plain)
2008-10-17 12:05 UTC, Steve Winiecki
Possible fix: look at all bytes of dev->enabled (2.37 KB, patch)
2008-10-22 06:55 UTC, Michel Dänzer

Description Steve Winiecki 2008-10-17 12:05:58 UTC
Created attachment 19726
Xorg log for Fatal server error EnableDevice on Xorg git startup

Using full git xorg version built this week.

PPC 4xx 32-bit platform.

When starting X, I get the following crash:

---------
Backtrace:
0: /usr/X11R7.4/bin/X(xorg_backtrace+0x4c) [0x101002d0]
1: /usr/X11R7.4/bin/X(xf86SigHandler+0x68) [0x10085bdc]
2: [0x100374]
3: /lib/ld.so.1 [0x4800b6f4]
4: [0x4d]
5: /usr/X11R7.4/bin/X(xf86PostKeyboardEvent+0x58) [0x1009870c]
6: /usr/X11R7.4/lib/xorg/modules/input//kbd_drv.so [0xf6811e0]
7: /usr/X11R7.4/lib/xorg/modules/input//kbd_drv.so [0xf68167c]
8: /usr/X11R7.4/lib/xorg/modules/input//kbd_drv.so [0xf68189c]
9: /usr/X11R7.4/bin/X(EnableDevice+0x16c) [0x1003a380]
10: /usr/X11R7.4/bin/X(InitAndStartDevices+0x158) [0x1003a63c]
11: /usr/X11R7.4/bin/X(main+0x380) [0x100229e4]
12: /lib/tls/libc.so.6 [0xfa36994]
13: /lib/tls/libc.so.6(__libc_start_main+0xb0) [0xfa36ad0]

Fatal server error:
Caught signal 11.  Server aborting
----------

Also noteworthy is this message from EnableDevice() in xserver/dix/devices.c:
[dix] cannot find pointer to pair with. This is a bug

Using USB keyboard/mouse

9:~/xorg-git/xserver# cat /proc/bus/input/devices
I: Bus=0003 Vendor=05ac Product=0201 Version=0100
N: Name="Mitsumi Electric Apple USB Keyboard"
P: Phys=usb-PPC-OF USB-1.3.1/input0
S: Sysfs=/class/input/input0
U: Uniq=
H: Handlers=kbd event0
B: EV=120013
B: KEY=10000 7 ff9f207a c14057ff febeffdf ffefffff ffffffff fffffffe
B: MSC=10
B: LED=1f

I: Bus=0003 Vendor=05ac Product=0307 Version=0110
N: Name="Logitech Apple Optical USB Mouse"
P: Phys=usb-PPC-OF USB-1.3.2/input0
S: Sysfs=/class/input/input1
U: Uniq=
H: Handlers=mouse0 event1
B: EV=17
B: KEY=10000 0 0 0 0 0 0 0 0
B: REL=3
B: MSC=10

Relevant xorg.conf sections:

Section "ServerLayout"
        Identifier     "X.org Configured"
        Screen      0  "Screen0" 0 0
        InputDevice    "Mouse0" "CorePointer"
        InputDevice    "Keyboard0" "CoreKeyboard"
EndSection
...
Section "InputDevice"
        Identifier  "Keyboard0"
        Driver      "kbd"
EndSection

Section "InputDevice"
        Identifier "Mouse0"
        Driver "mouse"
        Option "Protocol" "ExplorerPS/2"
        Option "Device" "/dev/input/mice"
        Option "ZAxisMapping" "4 5 6 7"
EndSection
...
       
With the same hardware, kernel, and X configuration, this problem does not occur on an Xorg 7.3 system with xorg-server 1.4 (or on previous Xorg versions, i.e. Debian Etch with Xorg 7.1). I have also tried another keyboard/mouse device (a ThinkPad USB keyboard with integrated TrackPoint/touchpad, which also works with previous Xorg versions), and it fails in a similar way.

Full Xorg.0.log attached.
Comment 1 Michel Dänzer 2008-10-21 00:14:52 UTC
I'm using the evdev driver, so I get a different backtrace:

0: X [0x100ae894]
1: X(xf86SigHandler+0xc8) [0x100ae824]
2: [0x100344]
3: [0x28]
4: X(NewInputDeviceRequest+0x504) [0x100c7d24]
5: X [0x10089080]
6: X [0x10089560]
7: X [0x10088130]
8: X [0x10146f94]
9: X(WaitForSomething+0x7e8) [0x101478ec]
10: X(Dispatch+0xac) [0x1004c5e8]
11: X(main+0x5b4) [0x10028b8c]
12: /lib/libc.so.6 [0xf906704]
13: /lib/libc.so.6 [0xf9068c0]

(Unfortunately, I can't seem to get more information about the crash with gdb, it just hangs instead of giving me a prompt... Steve, are you seeing this as well? Anyway, I've narrowed it down using debugging output to CheckMotion() crashing because pSprite is NULL.)

But I think the key is really

[dix] cannot find pointer to pair with. This is a bug.

I've bisected this to xserver commit 1e24e7b9df3d02350c7ea18e9379e87fe4d00026 ('Xi: remove configure/query device property calls.'), but I can't see what in there could cause the symptoms we're seeing; also obviously it doesn't happen on x86... So it could be related to endianness or char being unsigned by default, or maybe just some kind of latent memory corruption issue that happens not to affect x86. (I think I've ruled out a compiler optimization bug by rebuilding everything affected by this change with -O0).

Peter, any suggestions for narrowing down why it's unable to find a pointer for pairing?
Comment 2 Daniel Stone 2008-10-21 09:05:47 UTC
On Tue, Oct 21, 2008 at 12:14:57AM -0700, bugzilla-daemon@freedesktop.org wrote:
> (Unfortunately, I can't seem to get more information about the crash with gdb,
> it just hangs instead of giving me a prompt... Steve, are you seeing this as
> well? Anyway, I've narrowed it down using debugging output to CheckMotion()
> crashing because pSprite is NULL.)
> 
> But I think the key is really
> 
> [dix] cannot find pointer to pair with. This is a bug.
> 
> I've bisected this to xserver commit 1e24e7b9df3d02350c7ea18e9379e87fe4d00026
> ('Xi: remove configure/query device property calls.'), but I can't see what in
> there could cause the symptoms we're seeing; also obviously it doesn't happen
> on x86... So it could be related to endianness or char being unsigned by
> default, or maybe just some kind of latent memory corruption issue that happens
> not to affect x86. (I think I've ruled out a compiler optimization bug by
> rebuilding everything affected by this change with -O0).
> 
> Peter, any suggestions for narrowing down why it's unable to find a pointer for
> pairing?

Stupid question, but you have rebuilt -evdev against the exact same
headers, right?
Comment 3 Michel Dänzer 2008-10-21 09:10:55 UTC
(In reply to comment #2)
> 
> Stupid question, but you have rebuilt -evdev against the exact same
> headers, right?

Yes, the behaviour is the same for me with current evdev Git built against current xserver Git.
Comment 4 Peter Hutterer 2008-10-21 23:46:50 UTC
The error message is definitely a hint. What should happen is that the VCP
(virtual core pointer) initialises, then the VCK (virtual core keyboard), and
the VCK should get paired with the VCP. If that doesn't happen, any operation
on the VCK may just segfault due to a nonexistent sprite.

Not sure how you got there, though: some memory corruption somewhere. Anything
to narrow it down would be appreciated. Is valgrind complaining about anything?
Comment 5 Michel Dänzer 2008-10-22 06:55:06 UTC
Created attachment 19814
Possible fix: look at all bytes of dev->enabled

Okay, I've been able to trace this with gdb (I guess I was inadvertently using the kernel DRM and thus hitting the xkbcomp fork hilarity...), and it indeed looks like a classic endianness bug:

dev->enabled is a Bool, which is typedef'ed to int. However, the XI_PROP_ENABLED-related code in dix/devices.c only looks at its first byte, which happens to work on little-endian machines because the least significant byte is stored first. This patch fixes it for me, but I'm not sure how it fits into the bigger picture; another possibility would be to use a CARD8 local variable instead of passing dev->enabled directly in the XIChangeDeviceProperty() callers.
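
For readers unfamiliar with this class of bug, here is a minimal standalone sketch of the pattern described above. It is not the attached patch and does not use any xserver code: the function names (enabled_first_byte, enabled_as_card8, host_is_little_endian, check_enabled_conversion) are hypothetical, and CARD8 is assumed here to be a plain 8-bit unsigned type standing in for the X protocol type. It only illustrates why reading the first byte of an int-sized flag gives 1 on little-endian hosts and 0 on big-endian hosts such as PowerPC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int Bool;       /* as in the report: Bool is typedef'ed to int */
typedef uint8_t CARD8;  /* assumption: stand-in for the 8-bit X type */

/* Buggy pattern: interpret the start of the int's storage as an 8-bit
 * value. Only the byte at the lowest address is looked at. */
CARD8 enabled_first_byte(const Bool *enabled)
{
    CARD8 byte;
    memcpy(&byte, enabled, sizeof byte);
    return byte;
}

/* Endian-safe pattern: convert the value itself into an 8-bit local,
 * which is the "CARD8 local variable" alternative mentioned above. */
CARD8 enabled_as_card8(Bool enabled)
{
    return enabled ? 1 : 0;
}

/* Returns 1 when the least significant byte is stored first. */
int host_is_little_endian(void)
{
    const unsigned int probe = 1;
    unsigned char first;
    memcpy(&first, &probe, sizeof first);
    return first == 1;
}

/* Self-check: the value conversion is always right, while the first-byte
 * read only sees the flag when the host is little-endian. */
void check_enabled_conversion(void)
{
    Bool enabled = 1;
    assert(enabled_as_card8(enabled) == 1);
    assert(enabled_first_byte(&enabled) == (host_is_little_endian() ? 1 : 0));
}
```

Calling check_enabled_conversion() from a small main() passes on both byte orders, since it encodes the expected difference. On a big-endian host the first-byte read of an enabled device yields 0, which would be consistent with the device looking disabled to the property code.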
Comment 6 Steve Winiecki 2008-10-22 10:56:52 UTC
Applying the patch fixed the error for me, with devices configured using either kbd or evdev.

Thanks so much for the prompt attention.
Comment 7 Peter Hutterer 2008-10-23 08:13:51 UTC
Fix pushed as 98f01c2abe4771d76febf8fe70111b2bddfab776.

