Bug 19176 - CheckMotion for uninitialized devices segfaults the server
Summary: CheckMotion for uninitialized devices segfaults the server
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/General (show other bugs)
Version: unspecified
Hardware: Other All
: high critical
Assignee: Keith Packard
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: xserver-1.6
  Show dependency treegraph
 
Reported: 2008-12-19 02:02 UTC by Alexia Death
Modified: 2009-01-10 18:52 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
The script I use to configure tablets on demand (7.44 KB, text/x-python)
2008-12-19 02:02 UTC, Alexia Death
no flags Details
0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch (1.11 KB, patch)
2008-12-22 16:09 UTC, Peter Hutterer
no flags Details | Splinter Review

Description Alexia Death 2008-12-19 02:02:15 UTC
Created attachment 21310 [details]
The script I use to configure tablets on demand

My server version is 1.5.99.3, on Ubuntu Jaunty.

Ive been building a hotplug tool that uses DBUS intrface to manage WACOM devices. This tool worked without disturbing crashes in 1.5.2 but not in the updated version. You can test for this bug using DBUS send commands.

Heres the backtrace and relevant bits...


(II) LoadModule: "wacom"                                                                                                                                        
(II) Loading /usr/lib/xorg/modules/input//wacom_drv.so                                                                                                          
(II) Module wacom: vendor="X.Org Foundation"                                                                                                                    
        compiled for 1.5.99.3, module version = 1.0.0                                                                                                           
        Module class: X.Org XInput Driver                                                                                                                       
        ABI class: X.Org XInput driver, version 4.0                                                                                                             
(II) Wacom driver level: 47-0.8.1-4 $                                                                                                                           
(**) stylus-wacom-tablet-bamboofun-6x8: always reports core events                                                                                              
(**) stylus-wacom-tablet-bamboofun-6x8 device is /dev/input/tablet-bamboofun-6x8                                                                                
(**) stylus-wacom-tablet-bamboofun-6x8 is in absolute mode                                                                                                      
(**) WACOM: suppress value is 2                                                                                                                                 
(**) Option "BaudRate" "9600"                                                                                                                                   
(**) stylus-wacom-tablet-bamboofun-6x8: serial speed 9600                                                                                                       
(II) XINPUT: Adding extended input device "stylus-wacom-tablet-bamboofun-6x8" (type: Wacom Stylus)                                                              
(**) Option "Device" "/dev/input/tablet-bamboofun-6x8"                                                                                                          
stylus-wacom-tablet-bamboofun-6x8 Wacom X driver grabbed event device                                                                                           
(==) Wacom using pressure threshold of 30 for button 1                                                                                                          
(==) Wacom USB BambooFun tablet speed=9600 (38400) maxX=21648 maxY=13530 maxZ=511 resX=2540 resY=2540  tilt=disabled                                            
(==) Wacom device "stylus-wacom-tablet-bamboofun-6x8" top X=0 top Y=0 bottom X=21648 bottom Y=13530 resol X=2540 resol Y=2540                                   

Backtrace:
0: /usr/bin/X(xorg_backtrace+0x3b) [0x813380b]
1: /usr/bin/X(xf86SigHandler+0x55) [0x80c6db5]
2: [0xb805f400]
3: /usr/bin/X [0x80d7ea0]
4: /usr/bin/X(NewInputDeviceRequest+0x263) [0x80d81b3]
5: /usr/bin/X [0x80abd34]
6: /lib/libdbus-1.so.3 [0xb7f5b6d5]
7: /lib/libdbus-1.so.3(dbus_connection_dispatch+0x404) [0xb7f4d154]
8: /lib/libdbus-1.so.3 [0xb7f4d537]
9: /usr/bin/X [0x80ab3b3]
10: /usr/bin/X(WakeupHandler+0x52) [0x8091192]
11: /usr/bin/X(WaitForSomething+0x1bb) [0x813108b]
12: /usr/bin/X(Dispatch+0x7e) [0x808d13e]
13: /usr/bin/X(main+0x3bd) [0x80720ed]
14: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb7c1d775]
15: /usr/bin/X [0x80715a1]
Saw signal 11.  Server aborting.
(II) ImPS/2 Logitech Wheel Mouse: Close
(II) UnloadModule: "evdev"
(II) Macintosh mouse button emulation: Close
(II) UnloadModule: "evdev"
(II) Video Bus: Close
(II) UnloadModule: "evdev"
(II) Video Bus: Close
(II) UnloadModule: "evdev"
(II) AT Translated Set 2 keyboard: Close
(II) UnloadModule: "evdev"
(II) UnloadModule: "synaptics"
(II) UnloadModule: "wacom"
Comment 1 Peter Hutterer 2008-12-20 21:15:47 UTC
can you get a backtrace from a server with debug symbols? that should make it easier to pin the bug down.
Comment 2 Alexia Death 2008-12-21 01:43:52 UTC
(In reply to comment #1)
> can you get a backtrace from a server with debug symbols? that should make it
> easier to pin the bug down.
> 

The trace above is from a server with debug symbols enabled/installed(provided by xserver-xorg-core-dbg package). Do you have problems with replicating the bug? I am highly motivated to help you find it and fix it, but its a bit above my head to figure it out on my own in sensible timeframe, so if you have a patch with some debug code or alike you want me to try just let me know.
Comment 3 Alexia Death 2008-12-21 01:46:15 UTC
(In reply to comment #1)
> can you get a backtrace from a server with debug symbols? that should make it
> easier to pin the bug down.
> 

PS, If you have an irc channel or something where you discuss things, let me know, perhaps I can help out better more interactively.
Comment 4 Peter Hutterer 2008-12-21 03:25:00 UTC
> PS, If you have an irc channel or something where you discuss things, let me
> know, perhaps I can help out better more interactively.

#xorg-devel on freenode. I haven't really looked into it yet. Hoped that I'd
get a better backtrace for the whole thing before starting on it.
Can you get a printout of the exact parameters your script provides to the
server that causes the crash?
Comment 5 Alexia Death 2008-12-21 03:40:06 UTC
(In reply to comment #4)
> > PS, If you have an irc channel or something where you discuss things, let me
> > know, perhaps I can help out better more interactively.
> 
> #xorg-devel on freenode. I haven't really looked into it yet. Hoped that I'd
> get a better backtrace for the whole thing before starting on it.
> Can you get a printout of the exact parameters your script provides to the
> server that causes the crash?
> 

Using hal causes the same as my script. But I can do one better on your request. I can give you a dbus-send command.

dbus-send --system --type=method_call --print-reply --dest=org.x.config.display0 /org/x/config/0 org.x.config.input.add array:string:driver,wacom array:string:identifier,stylus array:string:Device,/dev/input/wacom array:string:Type,stylus array:string:DebugLevel,11;

I figured it out with dbus-send before going to python. But as said, HAL add results in same crash.
Comment 6 Peter Hutterer 2008-12-21 23:20:58 UTC
Does this only happen with wacom tablets?

I can't reproduce that with wacom 0.8.1-6 using HAL to add the device. Removing the device still kills the server but that's a long standing bug in the wacom driver.
Comment 7 Alexia Death 2008-12-22 06:44:30 UTC
(In reply to comment #6)
> Does this only happen with wacom tablets?
> 
> I can't reproduce that with wacom 0.8.1-6 using HAL to add the device. Removing
> the device still kills the server but that's a long standing bug in the wacom
> driver.
> 

But you can reproduce it for dbus? I do not have any other input device than a wacom tablet to test.

You are right about hal, it segfaults on remove. However,the long standing issue with wacom driver was fixed in recent releases and ie intrepid has crash free unplug with hal. The removal segfault backtrace looked different as well(trace for that clearly came from wacom module, not so this time, instead it seemed to go through dbus module). However, since it was a build switch bug, it has potential for regression. I took part in finding the removal bug it so I can easily test for it. Will do so and let you know.
Comment 8 Peter Hutterer 2008-12-22 16:09:50 UTC
Created attachment 21416 [details] [review]
0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch

This should fix the issue.
Comment 9 Alexia Death 2008-12-23 06:11:47 UTC
(In reply to comment #8)
> Created an attachment (id=21416) [details]
> 0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch
> 
> This should fix the issue.
> 

This removes segfault from adding the device. Removal still kills the server and does so without leaving a trace...

The only interesting thing I found from logs was that x tried to unload the module after I removed the first of three devices I had added using wacom module and that was the last entry in the log... Is there a check missing about not unloading a module that is still used by something?
Comment 10 Alexia Death 2008-12-23 06:17:23 UTC
(In reply to comment #9)
> The only interesting thing I found from logs was that x tried to unload the
> module after I removed the first of three devices I had added using wacom
> module and that was the last entry in the log... Is there a check missing about
> not unloading a module that is still used by something?
> 

In fact, I tested with adding just one device, the result is same. And what I deduct from this log:

(**) stylus-wacom-tablet-bamboofun-6x8: always reports core events
(**) stylus-wacom-tablet-bamboofun-6x8 device is /dev/input/tablet-bamboofun-6x8
(**) stylus-wacom-tablet-bamboofun-6x8 is in absolute mode
(**) WACOM: suppress value is 2
(**) Option "BaudRate" "9600"
(**) stylus-wacom-tablet-bamboofun-6x8: serial speed 9600
(II) XINPUT: Adding extended input device "stylus-wacom-tablet-bamboofun-6x8" (type: Wacom Stylus)
(**) Option "Device" "/dev/input/tablet-bamboofun-6x8"
stylus-wacom-tablet-bamboofun-6x8 Wacom X driver grabbed event device
(==) Wacom using pressure threshold of 30 for button 1
(==) Wacom USB BambooFun tablet speed=9600 (38400) maxX=21648 maxY=13530 maxZ=511 resX=2540 resY=2540  tilt=disabled
(==) Wacom device "stylus-wacom-tablet-bamboofun-6x8" top X=0 top Y=0 bottom X=21648 bottom Y=13530 resol X=2540 resol Y=2540
(II) UnloadModule: "wacom"

On remove, the module is unloaded before it even tries to uninitialize the device. There should be a lot more output from the driver about removal of the device before X can unload it. There is none.
Comment 11 Alexia Death 2008-12-23 06:26:05 UTC
(In reply to comment #10)
> On remove, the module is unloaded before it even tries to uninitialize the
> device. There should be a lot more output from the driver about removal of the
> device before X can unload it. There is none.
> 

Sorry, my deduction for single device was wrong. My Debug level was too low to see that uninit is completed successfully. However, unload is still done after I remove just one out of 3 and that sounds like a terribly bad idea.
Comment 12 Alexia Death 2008-12-23 10:06:36 UTC
Ive found the bit of code that causes crash on remove.

is in /home/alexiade/build/Xsource2/xorg-server-1.5.99.3/hw/xfree86/common/xf86Helper.c function xf86DeleteInput

the code itself is:

    /* This should *really* be handled in drv->UnInit(dev) call instead, but
     * if the driver forgets about it make sure we free it or at least crash
     * with flying colors */
    if (pInp->private)
	xfree(pInp->private);

Commenting it out makes the crash go away... I have no clue why it does this tho... The if condition is not sufficient?
Comment 13 Peter Hutterer 2009-01-04 17:50:21 UTC
(In reply to comment #8)
> Created an attachment (id=21416) [details]
> 0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch
> 
> This should fix the issue.


patch pushed as e1a3a1a0d85c9971aea65c2228b5fd4dbf3bf57a.

Comment 14 Peter Hutterer 2009-01-04 17:59:30 UTC
Making this a blocker for 1.6, the patch is nominated already, but we need to also sort out the remaining crash.
Comment 15 Peter Hutterer 2009-01-04 22:52:59 UTC
(In reply to comment #12)
> Ive found the bit of code that causes crash on remove.
> 
> is in
> /home/alexiade/build/Xsource2/xorg-server-1.5.99.3/hw/xfree86/common/xf86Helper.c
> function xf86DeleteInput
> 
> the code itself is:
> 
>     /* This should *really* be handled in drv->UnInit(dev) call instead, but
>      * if the driver forgets about it make sure we free it or at least crash
>      * with flying colors */
>     if (pInp->private)
>         xfree(pInp->private);
> 
> Commenting it out makes the crash go away... I have no clue why it does this
> tho... The if condition is not sufficient?

this is the driver's fault. from xf86WcmUninit (wcmConfig.c):

   xfree(priv);
   xf86DeleteInput(local, 0);

priv is freed by the driver, then the server tries to free it again. simply setting local->private to NULL after the xfree call stops the crash.
Comment 16 Alexia Death 2009-01-05 00:36:47 UTC
> this is the driver's fault. from xf86WcmUninit (wcmConfig.c):
> 
>    xfree(priv);
>    xf86DeleteInput(local, 0);
> 
> priv is freed by the driver, then the server tries to free it again. simply
> setting local->private to NULL after the xfree call stops the crash.
> 

Great :) I will make sure this gets fixed upstream.
Comment 18 Peter Hutterer 2009-01-06 14:04:21 UTC
Reassigning to Keith, patch is in master, nominated for cherry-pick. Please close when cherry-picked.

http://cgit.freedesktop.org/xorg/xserver/commit/?id=e1a3a1a0d85c9971aea65c2228b5fd4dbf3bf57a

xfree86: don't call CheckMotion if a device hasn't been enabled. #19176
Comment 19 Julien Cristau 2009-01-10 18:52:08 UTC
fix cherry-picked as 2ce48363b862db134624797bc071f8c45323a075 on server-1.6-branch, closing.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.