Bug 23048 - evdev: SIGSEGV in EvdevMBEmuBlockHandler()
Status: RESOLVED FIXED
Product: xorg
Classification: Unclassified
Component: Input/evdev
Version: 7.4 (2008.09)
Hardware: All Linux (All)
Importance: high critical
Assigned To: Peter Hutterer
QA Contact: Xorg Project Team
Duplicates: 24170
Depends on:
Blocks: xorg-7.5
Reported: 2009-07-30 14:38 UTC by Bryce Harrington
Modified: 2009-10-13 01:39 UTC (History)
Attachments
ThreadStacktrace.txt (1.81 KB, text/plain)
2009-07-30 14:38 UTC, Bryce Harrington
0001-Finalize-the-middle-button-emulation-when-a-read-err.patch (1.16 KB, patch)
2009-10-06 03:22 UTC, Peter Hutterer

Description Bryce Harrington 2009-07-30 14:38:57 UTC
Created attachment 28207 [details]
ThreadStacktrace.txt

Forwarding this widely reported Ubuntu bug:
https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-input-evdev/+bug/343528

[Problem]
X crash in EvdevMBEmuBlockHandler() seen in Jaunty and Karmic (with -evdev 2.2.2).  Occurs rarely, with no obvious/known steps to reproduce.

[Original Report]

I wasn't doing anything new at the time this program crashed. The screen went blank for a few seconds, with some vertical colored strands crossing here and there. Then a command-line display came up with a few error messages, and the system went to the login window. From there everything worked fine. I have a feeling that this happened because I'm putting too much pressure on my machine: at the time I had wuala, transmission, firefox, and xchat running. By the way, this is not the first time this crash has happened, but I didn't get the crash report for the previous crashes because I rebooted the system right after they happened.

$ lsb_release -rd
Description: Ubuntu jaunty (development branch)
Release: 9.04

ProblemType: Crash
Architecture: i386
DistroRelease: Ubuntu 9.04
ExecutablePath: /usr/bin/Xorg
Package: xserver-xorg-core 2:1.6.0-0ubuntu1
ProcAttrCurrent: unconfined
ProcCmdline: /usr/X11R6/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7
ProcEnviron:
 LANGUAGE=ar_SA:ar
 PATH=(custom, no user)
 LANG=ar_SA.UTF-8
ProcVersion: Linux version 2.6.28-9-generic (buildd@palmer) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu2) ) #31-Ubuntu SMP Wed Mar 11 15:43:58 UTC 2009
Signal: 11
SourcePackage: xorg-server
StacktraceTop:
 EvdevMBEmuBlockHandler ()
 BlockHandler ()
 WaitForSomething ()
 Dispatch ()
 main ()
Title: Xorg crashed with SIGSEGV in EvdevMBEmuBlockHandler()
Uname: Linux 2.6.28-9-generic i686
Comment 1 Bryce Harrington 2009-07-30 14:40:00 UTC
Seems pretty clear to be a null pointer dereference:

 pEvdev = (EvdevPtr) 0x0

probably at this point:

    if (pEvdev->emulateMB.pending)

Dunno why pEvdev is 0x0 though.
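
A minimal sketch of the suspected failure, assuming simplified stand-in types: the struct layouts and the `mb_emu_is_pending` helper below are hypothetical and are not the driver's code. It only illustrates that reading `emulateMB.pending` through a NULL `pEvdev` is the faulting access, and that a guard would hide the symptom without explaining why the pointer is NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the driver's private data. */
struct mb_emu_state {
    bool pending;   /* an emulated button event is waiting on its timeout */
};

struct evdev_private {
    struct mb_emu_state emulateMB;
};

/*
 * Returns true if an emulated event is pending.
 * Without the NULL check, a call with pEvdev == NULL dereferences address 0
 * and the process receives SIGSEGV, matching the reported backtrace.
 */
bool mb_emu_is_pending(const struct evdev_private *pEvdev)
{
    if (pEvdev == NULL)
        return false;   /* guard: masks the symptom, not the root cause */
    return pEvdev->emulateMB.pending;
}
```

A guard like this would likely stop the crash in the block handler, but it would not answer why the handler is still being called for a device whose private data is gone.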
Comment 2 Peter Hutterer 2009-08-02 17:45:02 UTC
Looks like it's the same as Fedora Bug 483297.
https://bugzilla.redhat.com/show_bug.cgi?id=483297

We haven't been able to identify the source of this bug and our attempts to reproduce it reliably were unsuccessful. We really need a reliable test case for this.
Comment 3 Kiri 2009-09-27 05:09:04 UTC
In one of the backtraces I have, it crashes from EvdevMBEmuTimer instead of EvdevMBEmuBlockHandler.

3: /usr/lib/xorg/modules/input//evdev_drv.so(EvdevMBEmuTimer+0x3f) [0xf00fcf]
4: /usr/lib/xorg/modules/input//evdev_drv.so(EvdevMBEmuWakeupHandler+0x59) [0xf010c9]

Like the first poster in the Ubuntu bug, I have an Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller. Similar to the Fedora bug, I have seen messages about a disabled USB port or hub. I have experienced the bug in Debian as well. I filed what is likely a duplicate bug report as FreeDesktop Bug 24170.
Comment 4 Gordon Jin 2009-09-27 18:31:52 UTC
*** Bug 24170 has been marked as a duplicate of this bug. ***
Comment 5 Shahar Or 2009-09-28 06:35:12 UTC
I've experienced this once.
Comment 6 Kiri 2009-10-01 03:37:58 UTC
I suppose a user workaround would be to use the old mousedrv and keyboard drivers instead of evdev, but I have not figured out how to do this. Even if evdev is not specified in xorg.conf and the other drivers are, the X server still uses evdev.
Comment 7 Bryce Harrington 2009-10-02 16:58:29 UTC
Two people say they got this crash using the following steps to produce the error:


A.  Just ran the following script to fix a stuttering mouse and got this crash.

#!/bin/sh
sudo btnx -k ; sudo modprobe -r usbhid ; sudo modprobe usbhid ; sudo btnx -b


B.  I was doing the following when this crash happened:

* I was running WinXP in Virtualbox with very heavy hard drive activity while on battery
* As I got the "Battery critically low" message, I plugged in AC, but apparently gpm did not recognize the plugging immediately
* So my laptop suspended (because I had configured it that way) while running Virtualbox and heavy HDD (and possibly CPU) load
* After resume, I got thrown into VT1.
* I switched manually to VT7, and moments later a new X server started on VT8; after logging in, I got the apport crash handler.
Comment 8 Kiri 2009-10-04 07:59:49 UTC
My system is vulnerable to this bug, however
modprobe -r usbhid ; modprobe usbhid
does not reproduce it for me.
Comment 9 Matt Zimmerman 2009-10-05 04:27:03 UTC
While we don't have a reproduction recipe for this bug yet, we do receive regular confirmation reports from Ubuntu users.  If it would be possible to add some instrumentation to the code to help track down the bug, we could ship such a patch in Ubuntu temporarily in order to harvest the logs when the bug turns up.
Comment 10 njw 2009-10-06 03:12:24 UTC
Does anyone want core dumps?
Comment 11 Peter Hutterer 2009-10-06 03:22:44 UTC
Created attachment 30107 [details] [review]
0001-Finalize-the-middle-button-emulation-when-a-read-err.patch

I think this should fix it. If I read the log right, the necessary steps to reproduce the bug are:

Get a read error (ENODEV) on the device and hope that the reopen timer manages to find the device again when it comes back. Then really remove the device, possibly multiple times. Look out for messages like these in the log:

[32099.630886] (EE) HID 062a:0000: Read error: No such device
[32100.187449] (II) HID 062a:0000: Device reopened after 5 attempts.

If those messages occur, removing the device and re-adding it should lead to a crash. The read error seems to be quite difficult to trigger, which is why I haven't been able to reproduce it myself. It would explain why removing the kernel module seems to work quite well.
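
For anyone sifting through collected logs, the check described above could be automated roughly as follows. This is only a sketch: the function name is hypothetical, it matches on the two message fragments quoted above, and it does not verify that both lines refer to the same device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if some log line containing "Read error: No such device"
 * is followed, later in the array, by a line containing
 * "Device reopened after". Hypothetical helper for log triage only.
 */
bool log_shows_read_error_then_reopen(const char *const lines[], size_t n)
{
    bool saw_read_error = false;

    for (size_t i = 0; i < n; i++) {
        if (strstr(lines[i], "Read error: No such device") != NULL)
            saw_read_error = true;
        else if (saw_read_error &&
                 strstr(lines[i], "Device reopened after") != NULL)
            return true;
    }
    return false;
}
```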

Anyway, in the code, what happens is that in this case the reopen timer isn't removed. When the device is reopened, the timer is overwritten and the old one hangs around, eventually crashing the server.

Please test this patch and let me know if it works.
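
The failure mode described above can be modelled with a toy handler table. This is a sketch under stated assumptions, not the patch and not the server's API: `register_handler`, `remove_handler`, `device_open`, and `device_read_error` are hypothetical names. It shows how skipping the finalize step on a read error leaves a stale registration behind once the device is reopened.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 8

typedef void (*block_handler_fn)(void *data);

/* Hypothetical stand-in for the server's list of registered handlers. */
struct handler_slot {
    block_handler_fn fn;
    void *data;
    bool in_use;
};

static struct handler_slot handlers[MAX_HANDLERS];

/* Register fn/data in the first free slot; false if the table is full. */
bool register_handler(block_handler_fn fn, void *data)
{
    for (size_t i = 0; i < MAX_HANDLERS; i++) {
        if (!handlers[i].in_use) {
            handlers[i] = (struct handler_slot){ fn, data, true };
            return true;
        }
    }
    return false;
}

/* Remove every registration matching fn/data (the "finalize" step). */
void remove_handler(block_handler_fn fn, void *data)
{
    for (size_t i = 0; i < MAX_HANDLERS; i++) {
        if (handlers[i].in_use && handlers[i].fn == fn &&
            handlers[i].data == data)
            handlers[i].in_use = false;
    }
}

/* Number of live registrations. */
size_t count_handlers(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_HANDLERS; i++)
        if (handlers[i].in_use)
            n++;
    return n;
}

/* Hypothetical device with a call counter touched by its handler. */
struct device {
    unsigned calls;
};

void emu_block_handler(void *data)
{
    struct device *dev = data;
    dev->calls++;   /* would fault if dev had been freed or were NULL */
}

/* Opening (or reopening) the device registers its handler. */
void device_open(struct device *dev)
{
    register_handler(emu_block_handler, dev);
}

/* Read-error path: with finalize == false the old handler is left behind. */
void device_read_error(struct device *dev, bool finalize)
{
    if (finalize)
        remove_handler(emu_block_handler, dev);
}
```

In this model, open, read error without finalize, then reopen leaves two registrations for one device; with finalize there is only one. In the real server the leftover registration would eventually run against private data that no longer exists.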
Comment 12 Kiri 2009-10-09 09:04:38 UTC
What version is your patch to?
It does not apply to the latest release.
Comment 13 Peter Hutterer 2009-10-09 18:15:39 UTC
(In reply to comment #12)
> What version is your patch to?
> It does not apply to the latest release.

Seems to apply fine on top of 2.2.99.2 and 2.2.5 here. Either way, it's just one line so if it doesn't apply locally you can just copy/paste the one line into the code manually.

Comment 14 Kiri 2009-10-10 13:52:30 UTC
I am testing the patch now. I had not known about the independent releases of modules. 4 hours running and no crash yet.

Is there a way to disable middle button emulation altogether at the user level?
I know about
Option "Emulate3Buttons"
but I don't know how to apply it or whether it is possible because evdev scans for devices and creates InputDevices at runtime.
Comment 15 Peter Hutterer 2009-10-11 17:07:03 UTC
(In reply to comment #14)
> I am testing the patch now. I had not known about the independent releases of
> modules. 4 hours running and no crash yet.

Please try to reproduce the crash cases described earlier. They all involve some removal of the devices, so if you can trigger it that way, please do so. Normal usage of the desktop will almost certainly not trigger the bug unless you have busted hardware or you suspend/resume a lot.

> Is there a way to disable middle button emulation altogether at the user level?
> I know about
> Option "Emulate3Buttons"
> but I don't know how to apply it or whether it is possible because evdev scans
> for devices and creates InputDevices at runtime.

https://fedoraproject.org/wiki/Input_device_configuration
please don't hijack this bugreport for other issues though.
Comment 16 Kiri 2009-10-13 00:37:11 UTC
Perhaps due to my broken hardware, I must be a great candidate to test this bug. I had crashes about every 3 to 4 hours on average. After a recent update of Xorg, the crashes diminished and GUI performance diminished. Since applying the patch, I've had no crash.

>please don't hijack this bugreport for other issues though.

No need to worry about that in this case. Disabling middle button emulation would be a workaround for those afflicted by this issue.

Congratulations, it looks like you have found and fixed the bug!
Comment 17 Peter Hutterer 2009-10-13 01:39:19 UTC
Thanks for testing, pushed as f2dc0681febd297d95dae7c9e3ae19b771af8420. Will be in 2.3.0 shortly.