Bug 56578 - race condition with active/passive grabs when opening menus with touch
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/Input/Core
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Xorg Project Team
QA Contact: Xorg Project Team
Depends on: 56557
Reported: 2012-10-30 10:01 UTC by Timo Aaltonen
Modified: 2013-08-12 21:13 UTC (History)

Attachments
log output of the "repeated tapping on ubuntu logo launcher icon" use case (479.08 KB, text/plain)
2012-11-14 12:04 UTC, Daniel d'Andrada
log output of use case with patches from bug 56557 applied (28.96 KB, text/plain)
2012-11-22 12:04 UTC, Daniel d'Andrada
log output of use case with patches from Comment 7 also applied (29.09 KB, text/plain)
2012-11-27 12:32 UTC, Daniel d'Andrada
evemu-record from the touchscreen (3.34 KB, text/plain)
2013-02-14 22:24 UTC, Timo Aaltonen
backtrace (32.49 KB, text/plain)
2013-03-06 09:11 UTC, Timo Aaltonen
valgrind spam that occurs when following tjaalton's instructions (7.79 KB, text/plain)
2013-03-06 17:50 UTC, Maarten Lankhorst
backtrace (16.51 KB, text/plain)
2013-03-27 10:37 UTC, Timo Aaltonen
Call ProcessTouchOwnershipEvent directly (3.02 KB, patch)
2013-04-08 17:42 UTC, Maarten Lankhorst
touch-fix.patch (1.99 KB, patch)
2013-04-17 06:54 UTC, Till Kamppeter
touch-fix.patch (2.95 KB, patch)
2013-04-17 08:07 UTC, Till Kamppeter
/etc/X11/X-valgrind (285 bytes, text/plain)
2013-04-25 13:41 UTC, Maarten Lankhorst
Xorg-valgrind.till-twist (1.46 MB, text/plain)
2013-04-25 14:59 UTC, Till Kamppeter
Xorg-valgrind.till-twist (2.59 MB, text/plain)
2013-04-25 15:09 UTC, Till Kamppeter
Xorg-valgrind.till-twist.gz (252.19 KB, application/gzip)
2013-04-25 17:22 UTC, Till Kamppeter
Xorg-valgrind.till-twist.gz (408.90 KB, application/gzip)
2013-04-25 19:28 UTC, Till Kamppeter
Xorg-valgrind.till-twist.gz (527.47 KB, application/gzip)
2013-04-25 19:39 UTC, Till Kamppeter
Xorg-valgrind.till-twist.gz (1.38 MB, text/plain)
2013-04-30 11:25 UTC, Till Kamppeter
Xorg-valgrind.till-twist.gz (1.50 MB, text/plain)
2013-04-30 11:30 UTC, Till Kamppeter
nexus valgrind log for latest attempt (177.94 KB, text/plain)
2013-05-03 06:40 UTC, Maarten Lankhorst

Description Timo Aaltonen 2012-10-30 10:01:15 UTC
I'm able to "lock up" the Unity session by opening menus quickly by using a touchscreen. Seems as if there's a grab active. I can see the tooltips from launcher icons, interact with focused apps, but that's it.

Can't reproduce with plain metacity, because the menus open so quickly with it, whereas with Unity on this hw the effects slow it down so that the race is hit.

Tried several of the recent patches on top of 1.13, but they haven't helped. Now I see there are newer patches available. I'll give them a try. Filed this one for tracking this particular issue.
Comment 1 Timo Aaltonen 2012-10-31 16:11:15 UTC
tried patches from 56558 and 55738, also "Sync TouchListener memory.." from Carlos Garnacho, didn't help.
Comment 2 Daniel d'Andrada 2012-11-14 12:03:34 UTC
I just repeatedly tap on the top-most icon (the one with the Ubuntu logo) of Ubuntu's launcher on a touchscreen. Those taps alternately open and close the dash (a fullscreen window that shows icons for applications, media and other files). Eventually those taps stop having any effect, i.e. the launcher no longer gets ButtonPress and ButtonRelease events out of them.

I've added a wealth of logging (see xorg.log attachment) to try to understand what's happening on the server. From looking at it, I could see the following:
From touches 2 to 26, launcher is the first window in the list of listeners. From touch 27 onwards, the root window is the first one. Problem is, from touch 27 onwards, xserver fails to pass the touch ownership down to the launcher window because there's always an older pointer-emulated touch (touch 26) lying around which it apparently can't get rid of (i.e. properly process).
Comment 3 Daniel d'Andrada 2012-11-14 12:04:35 UTC
Created attachment 70064 [details]
log output of the "repeated tapping on ubuntu logo launcher icon" use case
Comment 4 Peter Hutterer 2012-11-19 06:05:14 UTC
Do cross-check with Bug 56557 as well, this can cause issues if any grabs are activated on the root window and I wonder if that influences the behaviour here
Comment 5 Daniel d'Andrada 2012-11-22 11:28:57 UTC
(In reply to comment #4)
> Do cross-check with Bug 56557 as well, this can cause issues if any grabs
> are activated on the root window and I wonder if that influences the
> behaviour here

Yes, they are at least closely related (most likely have the same cause) as a pointer-emulated touch gets "stuck" because of failed resource lookups in RetrieveTouchDeliveryData() as well.
Comment 6 Daniel d'Andrada 2012-11-22 12:04:17 UTC
Created attachment 70422 [details]
log output of use case with patches from bug 56557 applied

With the 4 patches mentioned in bug 56557 applied (comments 3 and 4), the bug (missing ButtonPress and ButtonRelease events) manifests itself already on the second tap on the touchscreen.

Again, due to a failure in RetrieveTouchDeliveryData()
Comment 7 Peter Hutterer 2012-11-27 01:11:21 UTC
New set of patches, please try those on top of the current set you already tested.

http://patchwork.freedesktop.org/patch/12519/
http://patchwork.freedesktop.org/patch/12520/
http://patchwork.freedesktop.org/patch/12521/
http://patchwork.freedesktop.org/patch/12522/
Comment 8 Daniel d'Andrada 2012-11-27 12:32:11 UTC
Created attachment 70656 [details]
log output of use case with patches from Comment 7 also applied

This is the log output I get with this new set of patches (from Comment 7) applied on top of those mentioned in Comment 6.

Again, the same problem.

The first tap on the icon with the ubuntu logo in the launcher (top left corner of the screen) works fine and displays the dash (a fullscreen window showing application icons, etc). The launcher now has an active pointer grab.

Upon the second tap on the ubuntu icon, xserver fails to deliver events to that listener (launcher's active pointer grab) because the corresponding RetrieveTouchDeliveryData() call fails. A snippet from the log:

"""
(II)     TouchBeginDDXTouch: ddx id 0, touch 2 - returning with emulate pointer == 1
[  2859.473] (II)   ProcessTouchEvent: TouchBegin, master pointer, touch 2
 ...
[  2859.474] (II)       RetrieveTouchDeliveryData: listener(window=launcher, listener=1105199104, type=pointer_grab, state=begin, level=core)
[  2859.474] (II)         dixLookupClient: failed! - rid & SERVER_BIT
[  2859.474] (II)       - Not delivering to listener 1105199104 because his delivery data couldn't be retrieved.
"""
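The `rid & SERVER_BIT` failure in the log above can be checked by hand against the listener ID it prints. The following is a minimal sketch, assuming `SERVER_BIT` is `0x40000000` as in the server's resource.h (verify against your tree); `rid_is_server_owned` is a hypothetical helper, not an xserver function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, modelled on the X server's resource.h; check your tree. */
#define SERVER_BIT UINT32_C(0x40000000)

/* The check below only makes sense if SERVER_BIT is a single bit. */
static_assert((SERVER_BIT & (SERVER_BIT - 1)) == 0, "SERVER_BIT must be one bit");

/* Hypothetical helper: true if a resource ID carries the server-only bit,
 * which is the condition the log reports as "rid & SERVER_BIT". */
bool rid_is_server_owned(uint32_t rid)
{
    return (rid & SERVER_BIT) != 0;
}
```

The listener ID from the log, 1105199104, is 0x41E00000 in hex, so under this assumption `rid_is_server_owned(1105199104u)` is true, which is consistent with the client lookup being refused.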
Comment 9 John Randolph 2013-01-31 16:36:15 UTC
We are also experiencing this bug with other touch screen software, not Unity related. The underlying X problem seems to be identical. Has a solution been found?
Comment 10 Timo Aaltonen 2013-02-11 11:49:15 UTC
Nope, the bug is still there. Rasterman reproduced it with E17 and commented on the downstream bug:

https://bugs.launchpad.net/ubuntu-nexus7/+bug/1068994/comments/24
Comment 11 Peter Hutterer 2013-02-13 23:51:48 UTC
can you test this branch here please? http://cgit.freedesktop.org/~whot/xserver/log/?h=touch-grab-race-condition-56578

Last 5 commits (currently), starting with 2cd9c4f709f105b7a7faf31b8c10993d0949563c
Comment 12 Timo Aaltonen 2013-02-14 13:17:22 UTC
unfortunately still able to reproduce it :/

I needed these commits on top of 1.13.2 to be able to compile with the new patches:
cc79107a5b60d2926e16ddbee04149e8d5acc969
fe59774c55e5d423633405e0869c22f4ce382548
91ab237358c6e33da854914d3de493a9cbea7637
9ad0fdb135a1c336771aee1f6eab75a6ad874aff
Comment 13 Peter Hutterer 2013-02-14 22:08:49 UTC
You'll need all of http://cgit.freedesktop.org/~whot/xserver/log/?h=server-1.13-branch, at the least. I haven't tested this on 1.13.x at all, purely working from git master for now.
Comment 14 Peter Hutterer 2013-02-14 22:09:20 UTC
Sorry, to clarify: you need that 1.13 branch linked above AND the patches from Comment 11
Comment 15 Timo Aaltonen 2013-02-14 22:24:09 UTC
Created attachment 74845 [details]
evemu-record from the touchscreen

attached the evemu dump from reproducing the bug by hitting the unity indicators quickly a couple of times.

I'll try the more complete 1.13 build next.
Comment 16 Peter Hutterer 2013-02-25 05:11:00 UTC
Ok, analysis of the bug as follows. To trigger this bug, we need the following client stack:
* touch client with a passive touch grab
* core client with a passive button grab in GrabModeSync
* optional: core client with button mask on window
The touch client must reject the touch.

As the touch grab activates, all events are sent to the touch client, and stored in the touch event history. When the client rejects, the events are replayed on the next client.

The replayed TouchBegin will trigger the core passive grab, and switch the device's processInputProc to EnqueueEvent().
BUG 1: because touch event history replaying calls DeliverTouchEvents directly, EnqueueEvent is side-stepped and no events end up in the sync'd queue. Later, when the client calls XAllowEvents no events are there for syncing, ComputeFreezes() exits early and the emulated motion/release events are not sent to the client.

Fixing that is possible so that EnqueueEvent is honoured. Tricky though, because it will have a number of side-effects, see below.

BUG 2: because the TouchEnd never ends up in the history (by design) no release event ends up in the queue. So when replaying, the emulated button release is missing. Not sure yet how to fix this.

BUG 3: If there's the optional third client, its implicit passive grab currently does not get released. That's the easiest one to fix.

Side effects of the first bug:
If we use EnqueueEvent() for event history replaying, we will replay touch events into the sync buffer, but not actually process them. If there is at least one touch client below the client with the sync passive core grab, it cannot get touch events until the grabbing client calls XAllowEvents. If that touch client has the ownership mask set, that behaviour is against the protocol spec.

Coincidentally, this bug already exists anyway, it's just gone unnoticed so far because touch clients appear to be generally above the normal clients.

To be compliant with the touch specs, we need to wrap EnqueueEvent to still handle touch events for clients with the ownership mask even if the device is currently synced.
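The mechanism behind BUG 1 can be illustrated with a toy model. This is only a sketch under simplifying assumptions: every name is hypothetical, a "begin" event stands in for the replayed TouchBegin that activates the sync grab, and direct delivery to a frozen device is modelled as a silent drop. It is not xserver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 16

enum toy_event { TOY_BEGIN, TOY_UPDATE };

/* Toy device; every name here is hypothetical, not xserver API. */
struct toy_device {
    bool frozen;                          /* a sync grab has frozen the device */
    enum toy_event queue[TOY_QUEUE_MAX];  /* stand-in for the sync'd queue */
    size_t nqueued;
    int delivered;                        /* events that reached the client */
};

/* Direct delivery: dropped while frozen; a TOY_BEGIN activates the sync grab. */
void toy_deliver(struct toy_device *dev, enum toy_event ev)
{
    assert(dev != NULL);
    if (dev->frozen)
        return;
    dev->delivered++;
    if (ev == TOY_BEGIN)
        dev->frozen = true;
}

/* Normal input path: queue while frozen, deliver otherwise. */
void toy_process(struct toy_device *dev, enum toy_event ev)
{
    assert(dev != NULL);
    if (!dev->frozen) {
        toy_deliver(dev, ev);
        return;
    }
    if (dev->nqueued < TOY_QUEUE_MAX)
        dev->queue[dev->nqueued++] = ev;
}

/* Replay a touch history, either honouring the freeze or side-stepping it. */
void toy_replay(struct toy_device *dev, const enum toy_event *history,
                size_t n, bool honour_freeze)
{
    for (size_t i = 0; i < n; i++) {
        if (honour_freeze)
            toy_process(dev, history[i]);
        else
            toy_deliver(dev, history[i]);
    }
}

/* Thaw (the client allowed events): flush the queue and return how many
 * events were released. */
size_t toy_thaw(struct toy_device *dev)
{
    assert(dev != NULL);
    size_t released = dev->nqueued;
    dev->frozen = false;
    dev->nqueued = 0;
    for (size_t i = 0; i < released; i++)
        toy_deliver(dev, dev->queue[i]);
    return released;
}
```

Replaying a begin followed by two updates with `honour_freeze` set to false leaves the queue empty, so `toy_thaw` has nothing to release. With `honour_freeze` set to true the two updates wait in the queue and are delivered on thaw. The model deliberately ignores the ownership-mask side effect described above.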
Comment 17 Peter Hutterer 2013-03-01 06:51:44 UTC
Branch available for testing here. I think this fixes the issue but I've been unsuccessful getting this backported to a 1.13 ubuntu server.

http://cgit.freedesktop.org/~whot/xserver/log/?h=touch-grab-race-condition-56578-v2

If you can test this, that'd be much appreciated.
Comment 18 John Randolph 2013-03-04 20:35:25 UTC
Hi Peter, I think your recent patches do fix the issue.

I compiled your server and a fresh xinput evdev 2.7.3. I confirmed TouchBegin and TouchEnd were being sent with a brief xinput test-xi2 test.

# xdpyinfo  |grep -E '(vendor|version)'
version number:    11.0
vendor string:    The X.Org Foundation
vendor release number:    11399902
X.Org version: 1.13.99.902

My usual scenario to experience this problem is:
  run Chrome
  xwininfo  [tap root screen, get window id of chrome window]
  xev -id 0x.... [use window id of chrome window]
  tap screen a few times to see xev notify events
  ctrl-C
  on screen, touch a UI button
  the press activates the UI button
  screen switches to new page  <-- ButtonRelease is dropped somewhere from here
  the new UI button underlying where my finger just pressed is stuck down 
  ^--- to here

With these same testing steps above I cannot get a stuck button on your new xserver branch. It seems that the ButtonRelease event arrives correctly.
Comment 20 Timo Aaltonen 2013-03-05 15:12:49 UTC
Ok I've tested them as well by building 1.14rc minus the video abi changing stuff (and commits on top of them), and added the touch branch. This allowed me to test on the nexus7 & tegra3 blob.

Looks like it's much better now, although sometimes the touch appears to get somewhat hung but can recover from it later on, and when this happens also generates messages like

[  5101.196] [Xi] Too many valuators reported for device 'Virtual core pointer'. Ignoring event.

on the logfile. The buffered actions prior to the hang are replayed after waiting for a while. At this stage it's quite easy to crash the server.
Comment 21 Peter Hutterer 2013-03-05 22:11:49 UTC
do you have a good backtrace for the crashes? random, or always the same spot? Is it regular in response to some interaction? can it be caused by the backports?
Comment 22 Timo Aaltonen 2013-03-06 09:11:09 UTC
Created attachment 76003 [details]
backtrace

Here's the backtrace, seems to be the same every time.

Way to reproduce here:

1. open an app, so there's a window around
2. attach an external pointer device
3. tailf the X logfile
4. hit the panel indicators frantically with the touchscreen, until the touch input is locked
5. move the window with the other pointer device
6. see how some "[Xi] ..." messages appear on the logfile
7. repeat the steps until..
8. .. when the touch input is locked the logfile will get these Xi messages after every touch.. when this happens keep hitting the screen until it crashes, can take a couple of minutes :)

so, it's only after using the other pointer device for a grab when the touch input grab is released. Also, while in step 8 I noticed that the multitouch gestures of unity seemed to work, while the panel menus failed to react. Also, Onboard seemed to work as well. So, while locked I can drag a window with a three-touch gesture but not by a single touch drag from the titlebar.

Not sure what backports you mean, this is 1.14 with your branch, but ajax's video abi commits reverted so the blob (and thus unity) work.
Comment 23 Maarten Lankhorst 2013-03-06 17:50:42 UTC
Created attachment 76034 [details]
valgrind spam that occurs when following tjaalton's instructions

I can reproduce this on x1.14 with my macbook pro in the manner tjaalton described. It didn't need the video abi revert.
Comment 24 Peter Hutterer 2013-03-07 06:09:31 UTC
did you rebuild the drivers too? just wondering, because I used to get a similar crash on my backports but only when running against the system drivers, not against the upstream ones.
Comment 25 Maarten Lankhorst 2013-03-08 05:40:41 UTC
I still crashed even if I rebuilt the drivers against the patched xorg-server, so it's not that.
Comment 26 Maarten Lankhorst 2013-03-08 06:41:43 UTC
It seems that the ubuntu patches for synaptics trigger it, most likely not these:

02-do-not-use-synaptics-for-keyboards.patch
- makes synaptics no longer match input.keyboard

101_resolution_detect_option.patch
- Add resolutiondetect atom and config option, to add a way to disable autodetect

115_evdev_only.patch
- uncomment 50-synaptics.conf

118_quell_error_msg.patch
- only affects tools
124_syndaemon_events.patch
- only affects syndaemon


But these change some things around:
103_enable_cornertapping.patch
- sets RTCornerButton default to 2, and RBCornerButton default to 3

104_always_enable_tapping.patch
- always sets up tap buttons in set_default_parameters

106_always_enable_vert_edge_scroll.patch
- guess :-)

128_disable_three_click_action.patch
129_disable_three_touch_tap.patch
- both disable 3 touch actions, to make three-touch gestures work

Presumably one of those default tweaks would cause it. I'll try to nail it down.
Comment 27 Timo Aaltonen 2013-03-08 07:08:09 UTC
well I'm not using synaptics, so it's not the same crasher then?
Comment 28 Maarten Lankhorst 2013-03-08 08:03:38 UTC
crashes unpatched too, after all :)

I guess I didn't hammer enough on the touchpad like a 3 year old
Comment 29 Maarten Lankhorst 2013-03-08 12:56:16 UTC
The crash is in xorg-server by the way, not in the driver, and seems to involve memory freed in xorg-server. It just seems more likely that it involves multitouch handling in xorg-server in general, and is not a bug in a specific driver.

Either that or there are 2 different bugs in evdev and synaptics that both cause a similar backtrace in xorg-server, this somehow seems less likely to me. :)
Comment 30 Peter Hutterer 2013-03-08 23:30:47 UTC
can you bisect the server then? I honestly don't know where it triggers and given that it's 19 patches it'll be easier to bisect than figure it out otherwise. 

fwiw, I've pushed the rebased branch (only a few squashes and reshuffling), please make sure you pull first.
Comment 31 Maarten Lankhorst 2013-03-10 21:14:13 UTC
for reference, 1.13 server branch (at time of writing 1.13.3 release) crashes just as hard.
Comment 32 Daniel Drake 2013-03-20 16:41:30 UTC
Thanks a lot for the hard work here. We see the same issue in Sugar, the UI is basically unusable with touch as we have a "global" grab.

I tested the patches from comment 19 against xserver-1.14.0, they do solve the problem, and I cannot see any new issues introduced by them.

I also tested on 1.13.3. In order to do that I first had to backport a few commits:
* Update the MD's position when a touch event is received
* Don't use GetTouchEvents when replaying events
* Don't use GetTouchEvents in EmitTouchEnd

Then I added the patches from comment #19, and things are now working equally well there.
Comment 33 Timo Aaltonen 2013-03-27 10:37:52 UTC
Created attachment 77097 [details]
backtrace

the backport seems incomplete, since it's trivial to crash the server with unity by switching between opening the dash or indicator menus
Comment 34 Daniel Drake 2013-04-02 14:50:44 UTC
The backported patches on 1.13.3 (from comment #32) have now been in OLPC's development builds for over a week and we haven't seen any adverse effects.

I've also done some testing on 1.14.0. I can make this crash (with no backtrace) simply by going a bit crazy on the touchscreen for a few minutes, both before and after this patch series. A problem for another day.

Based on this I would vote for going ahead with the merge of this patch series into master.

I also found a related bug with both 1.13.3 and 1.14.0 (both before and after these patches), and posted a patch here:
http://lists.x.org/archives/xorg-devel/2013-April/035878.html
Comment 35 Maarten Lankhorst 2013-04-08 17:41:40 UTC
first valgrind error is on 

int emulate_pointer = ! !(ev->device_event.flags & TOUCH_POINTER_EMULATED);

So I guess it's safe to assume that ev is garbage..
Other writes seem to be related to ev too, judging from the valgrind output I guess random stuff gets overwritten.

Looking at SetTapState output:

0 -> 1
1 -> 10
moving state stuff
10 -> 2
2 -> 10
moving state stuff
10 -> 2

and then a few more 2 -> 10 and 10 -> 2 with moves until valgrind starts complaining and xserver starts crashing:

(II) SetTapState - 10 -> 2 (millis:3928387395)
==25788== Invalid read of size 4
==25788==    at 0x24236E: ProcessOtherEvent (exevents.c:1519)
==25788==    by 0x264CAE: ProcessPointerEvent (xkbAccessX.c:751)
==25788==    by 0x166641: PlayReleasedEvents (events.c:1217)
==25788==    by 0x16DED4: ComputeFreezes (events.c:1297)
==25788==    by 0x16E2E3: AllowSome (events.c:1725)
==25788==    by 0x16E495: ProcAllowEvents (events.c:1785)
==25788==    by 0x15DC45: Dispatch (dispatch.c:432)
==25788==    by 0x14C5B9: main (main.c:295)
==25788==  Address 0x122336b0 is 16 bytes before a block of size 152 free'd
==25788==    at 0x4C2BA6C: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25788==    by 0x806D84A: sna_mode_wakeup (sna_display.c:3500)
==25788==    by 0x161F3B: WakeupHandler (dixutils.c:426)
==25788==    by 0x2AF6E3: WaitForSomething (WaitFor.c:224)
==25788==    by 0x15D9A0: Dispatch (dispatch.c:361)
==25788==    by 0x14C5B9: main (main.c:295)

This is with the patches from comment #19 + daniel drake's patch

Digging more, looking up the InternalEvent struct..

    int emulate_pointer = ! !(ev->device_event.flags & TOUCH_POINTER_EMULATED);

Now this is a function that is looking verrrrrrrry suspicious for type == ET_TouchOwnership..

I think it would make sense to have ET_TouchOwnership handled directly by ProcessTouchOwnershipEvent, rather than through ProcessTouchEvent. Patch attached below..
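The general hazard described here, reading a member of a union-typed event that does not match the event's actual type, can be sketched as follows. All types and names are hypothetical stand-ins loosely modelled on the idea of InternalEvent; the real structures differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_type { TOY_TOUCH_BEGIN, TOY_TOUCH_OWNERSHIP };

#define TOY_POINTER_EMULATED 0x1u

/* Hypothetical event layouts; only the leading type field is shared. */
struct toy_any_event       { enum toy_type type; };
struct toy_device_event    { enum toy_type type; uint32_t touchid; uint32_t flags; };
struct toy_ownership_event { enum toy_type type; uint32_t touchid; uint8_t reason; };

union toy_internal_event {
    struct toy_any_event       any;
    struct toy_device_event    device_event;
    struct toy_ownership_event ownership_event;
};

/* Returns 1 if the event is a pointer-emulating touch, 0 if it is a touch
 * without emulation, and -1 if it is an ownership event, for which the
 * device_event.flags field has no meaning and must not be read. */
int toy_emulates_pointer(const union toy_internal_event *ev)
{
    assert(ev != NULL);
    if (ev->any.type == TOY_TOUCH_OWNERSHIP)
        return -1;
    return !!(ev->device_event.flags & TOY_POINTER_EMULATED);
}
```

Checking the type before touching `device_event` is the same idea as dispatching ownership events to their own handler before the generic touch path runs.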
Comment 36 Maarten Lankhorst 2013-04-08 17:42:21 UTC
Created attachment 77621 [details] [review]
Call ProcessTouchOwnershipEvent directly
Comment 37 Maarten Lankhorst 2013-04-11 09:54:42 UTC
Valgrind came up with this complaint on 1.13.3 with the backported patches:

==15921== Invalid read of size 4
==15921==    at 0x1D0A00: DeliverTouchEvents (exevents.c:1297)
==15921==    by 0x1D2589: ProcessOtherEvent (exevents.c:1611)
==15921==    by 0x1567C1: TouchEventHistoryReplay (touch.c:491)
==15921==    by 0x1D0EBB: TouchPuntToNextOwner (exevents.c:1120)
==15921==    by 0x1D11EB: TouchRejected (exevents.c:1196)
==15921==    by 0x1D28B5: ProcessOtherEvent (exevents.c:1223)
==15921==    by 0x1E7DAB: ProcessPointerEvent (xkbAccessX.c:751)
==15921==    by 0x204DC5: mieqProcessDeviceEvent (mieq.c:556)
==15921==    by 0x1570A7: TouchListenerAcceptReject (touch.c:1013)
==15921==    by 0x1D6AD3: ProcXIAllowEvents (xiallowev.c:128)
==15921==    by 0x1D2BD5: ProcIDispatch (extinit.c:406)
==15921==    by 0x13CC0D: Dispatch (dispatch.c:428)
==15921==  Address 0xc6a0bac is 4 bytes inside a block of size 68 free'd
==15921==    at 0x482E5B0: free (vg_replace_malloc.c:446)
==15921==    by 0x14C129: DeletePassiveGrab (grabs.c:336)
==15921==    by 0x1527FD: doFreeResource (resource.c:873)
==15921==    by 0x152F7F: FreeResource (resource.c:903)
==15921==    by 0x14C49F: DeletePassiveGrabFromList (grabs.c:686)
==15921==    by 0x144A7D: ProcUngrabButton (events.c:5640)
==15921==    by 0x13CC0D: Dispatch (dispatch.c:428)
==15921==    by 0x132035: main (main.c:298)

I picked the fixes from 57301 to 1.13 too: Xi: fix touch event selction conflicts (#57301), and the commit before that to make it apply.

This brings 1.13 dix and Xi to the 1.14 equivalent minus pointer barriers, as far as I can tell, but then I was getting the following segfault:

==1748== Invalid read of size 4
==1748==    at 0x4831DCC: memcpy (mc_replace_strmem.c:878)
==1748==    by 0x156959: TouchConvertToPointerEvent (touch.c:637)
==1748==    by 0x1D0FA3: DeliverTouchEmulatedEvent.isra.0.part.1 (exevents.c:1375)
==1748==    by 0x1D0C5F: DeliverTouchEvents (exevents.c:1920)
==1748==    by 0x1D25B1: ProcessOtherEvent (exevents.c:1611)
==1748==    by 0x1E7E03: ProcessPointerEvent (xkbAccessX.c:751)
==1748==    by 0x1423B1: PlayReleasedEvents (events.c:1214)
==1748==    by 0x146D13: ComputeFreezes (events.c:1294)
==1748==    by 0x146F6B: AllowSome (events.c:1722)
==1748==    by 0x1470BF: ProcAllowEvents (events.c:1785)
==1748==    by 0x13CC0D: Dispatch (dispatch.c:428)
==1748==    by 0x132035: main (main.c:298)
==1748==  Address 0xcaa1284 is 156 bytes inside a block of size 280,000 free'd
==1748==    at 0x482E5B0: free (vg_replace_malloc.c:446)
==1748==    by 0x1570E3: TouchListenerAcceptReject (touch.c:1015)
==1748==    by 0x146D6F: ComputeFreezes (events.c:1282)
==1748==    by 0x146F6B: AllowSome (events.c:1722)
==1748==    by 0x1470BF: ProcAllowEvents (events.c:1785)
==1748==    by 0x13CC0D: Dispatch (dispatch.c:428)
==1748==    by 0x132035: main (main.c:298)
Comment 38 Maarten Lankhorst 2013-04-11 11:15:39 UTC
Note: macbook pro (synaptics) seems to work just fine with the 1.13.3 backports, so it looks like it's a separate bug due to different behavior on a true touch device. The valgrind backtraces were on arm/tegra, which also enables a software keyboard.

The easiest way to crash on ubuntu's xserver on the tegra is by making sure valgrind is running with --free-fill=fe so the freed memory is always reset to an invalid value.
Comment 39 Till Kamppeter 2013-04-17 06:54:23 UTC
Created attachment 78124 [details] [review]
touch-fix.patch

I have partial (full?) success (on the Lenovo Thinkpad Twist, an Intel-based convertible, see also https://launchpad.net/bugs/1068994 and https://launchpad.net/bugs/1015183):

I have rebuilt the current Ubuntu Raring package of xorg-server (1.13.3-0ubuntu5) with the following two patches:

1. http://cgit.freedesktop.org/~whot/xserver/commit/?h=touch-grab-race-condition-56578-v2&id=0498a4f0e0b90a850df7022a3356f10adabff855

(found via comment #17)

2. http://lists.x.org/archives/xorg-devel/2013-April/035878.html

and after that clicking via touch screen on the Lenovo Thinkpad Twist works reliably. Only remaining (minor) problems are (but the touch click ability does not get lost by them):

a. In Chromium when you create a new tab, the new tab contains icons for web apps (at least the app store and perhaps some examples). These icons cannot be clicked by touch, only with a mouse. All the rest in Chromium is clickable by touch.

b. Touch clicks do not work in XBMC, but after using and leaving XBMC with an external mouse on the normal desktop touch-clicking works again.

These are probably separate bugs which got revealed by the now working touch click.

Complete patch for xorg-server is attached.
Comment 40 Till Kamppeter 2013-04-17 08:07:25 UTC
Created attachment 78125 [details] [review]
touch-fix.patch

Sorry, patch is not complete. Here is the correct one.
Comment 41 Peter Hutterer 2013-04-17 08:13:08 UTC
ok, thanks to Maarten's debugging we've found the issue. listener->grab is not copied but rather referenced, leaving the grab stale once it was deleted. Reproducible test case is simply:

XGrabButton()
pointer-emulating touch down
XUngrabButton()
trigger touch update/end

This doesn't necessarily crash, but once you run through valgrind to reset memory after freeing it we have a reliable crasher.
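The difference between referencing and copying a grab can be sketched in isolation. This is one common way to avoid this class of stale pointer, shown with hypothetical types; it is not necessarily the approach the eventual xserver fix takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a grab and a touch listener. */
struct toy_grab     { int window; int button; };
struct toy_listener { struct toy_grab *grab; /* privately owned copy */ };

/* Store a private copy of the grab so that deleting the original (as an
 * ungrab request would) cannot leave the listener with a stale pointer.
 * Returns 0 on success, -1 on allocation failure. */
int toy_listener_set_grab(struct toy_listener *l, const struct toy_grab *grab)
{
    assert(l != NULL && grab != NULL);
    struct toy_grab *copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;
    *copy = *grab;
    free(l->grab);   /* drop any previous copy; free(NULL) is a no-op */
    l->grab = copy;
    return 0;
}

/* Release the listener's copy. */
void toy_listener_clear(struct toy_listener *l)
{
    assert(l != NULL);
    free(l->grab);
    l->grab = NULL;
}
```

With this pattern the listener remains valid after the original grab is freed, at the cost of one allocation per listener and the obligation to free the copy when the touch ends.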
Comment 42 Till Kamppeter 2013-04-17 09:17:42 UTC
I have built xorg-server with my patch also on the Nexus 7 (armhf) now and it works perfectly there with the desktop and all applications, too, and on the Nexus 7 XBMC and Chromium's web apps work with touch.

It also seems to fix the Nexus 7.
Comment 43 Peter Hutterer 2013-04-18 04:57:51 UTC
ok, I'll be honest. this is a giant mess where we potentially access dangling pointers and sorting this out is nasty. my attempts to do so today have failed badly. fix will come, but not too soon I'm afraid
Comment 44 Daniel Drake 2013-04-19 18:36:55 UTC
Yes, I can see how time consuming this must be. Thanks for continuing to work on it, at OLPC we can promise you some testing once code is ready.

In the mean time I will add the latest 2 patches to our development builds for further testing:

Xi: Do not handle ET_TouchOwnership in ProcessTouchEvent
dix: copy event in TouchConvertToPointerEvent correctly
Comment 45 Peter Hutterer 2013-04-23 07:32:56 UTC
Please have a test of this branch here:
http://cgit.freedesktop.org/~whot/xserver/log/?h=touch-grab-race-condition-56578-v2

I'm not 100% sure yet if there's a memleak introduced - haven't done the required checks yet. but it fixes the crasher caused by the invalid memory dereference.
Comment 46 Till Kamppeter 2013-04-24 15:48:15 UTC
Peter, I have tested your new branch on the Lenovo Thinkpad Twist now. I do not get any crashes and left clicking by tapping is absolutely reliable for me. Right-clicking via onboard does not work for me though. If I activate the right-click mode and tap, the tap is interpreted as a left click (right-click mode ignored). At least I do not get a stuck-left-button effect from the right click. I do not get any crash nor a stuck-button effect at all, regardless of what I am doing. What is missing now is a fix for the right click.
Comment 47 Daniel Drake 2013-04-24 20:44:22 UTC
Thanks for continuing to work on this.

I believe the touch-grab-race-condition-56578-v2 patch series so far creates a problem with mouse input. In Sugar's Paint application, I can't paint anything by moving the mouse around with the button held down.

Running xev, I can see that clicking and holding the mouse button doesn't actually trigger any events. Only when I release, ButtonPress and ButtonRelease appear in quick succession.

If nobody beats me to it, I'll bisect this later this week. Also, the above test was done on xserver-1.13.3, I should also test on a newer version to make sure there aren't any other factors at play.
Comment 48 Till Kamppeter 2013-04-24 21:18:56 UTC
Daniel, Peter, I am using the full GIT branch touch-grab-race-condition-56578-v2 which is 1.14 and here I have no problem with Sugar's Paint application (rgbPaint, am I right?). I can paint both with an external Bluetooth mouse with the left button held down and with my finger on the touch screen of the Lenovo Thinkpad Twist.
Comment 49 Daniel Drake 2013-04-24 21:28:46 UTC
Thanks for testing. Sugar's paint app is http://activities.sugarlabs.org/en-US/sugar/addon/4082

It is probably more meaningful to do the xev test though. Click the mouse button and hold, you would expect a ButtonPress event to show immediately, but it doesn't. And do that under sugar, in case the global touch grabs are affecting things.
Comment 50 Till Kamppeter 2013-04-24 21:50:56 UTC
Daniel, on my 1.14 I do not see any problem, also when testing with xev. Both with the external mouse and my finger on the touch screen I see ButtonPress events when I press and hold the mouse button or when I put my finger onto the screen and I get ButtonRelease events when I release the mouse button or take my finger from the screen. This works all correctly for me.
Comment 51 Peter Hutterer 2013-04-25 03:05:03 UTC
Till, can you run this under valgrind please to make sure I didn't introduce any memory leaks?
Comment 52 Till Kamppeter 2013-04-25 06:55:28 UTC
Peter, how do I run the xorg server under Valgrind? I have a Ubuntu Raring system.
Comment 53 Till Kamppeter 2013-04-25 11:24:11 UTC
Another touch problem: If I run Chromium browser and try to drag and drop one of the tabs using the touch screen, the left button gets stuck down and it does not even get unstuck if I continue working with the external Bluetooth mouse. I can only kill the session.

It also happens sometimes that X crashes but without any message in /var/log/syslog.
Comment 54 Maarten Lankhorst 2013-04-25 13:41:36 UTC
Created attachment 78472 [details]
/etc/X11/X-valgrind

For valgrinding xserver you want to install the xserver-xorg-core-dbg package from the binary you generated, and also install xserver-xorg-input*dbg and xserver-xorg-video*dbg and valgrind

I enabled auto valgrinding by creating /etc/X11/X-valgrind with the contents of this adjustment, make the file executable and then point the /etc/X11/X symlink to it. It will append the log to /var/log/Xorg-valgrind.HOSTNAME, so if xserver crashes you'll get detailed information why. :-)
Comment 55 Till Kamppeter 2013-04-25 14:41:38 UTC
Also with 1.14 XBMC behaves as in comment #39, not reacting to touch clicks. Looking more deeply into XBMC's behavior, the mouse cursor is put into the lower right corner of the screen when touch-clicking an arbitrary place, perhaps all touch clicks are registered with the coordinates of the lower right corner.
Comment 56 Till Kamppeter 2013-04-25 14:45:51 UTC
I have set up running X under Valgrind now. I have installed

xserver-xorg-core-dbg
valgrind
xserver-xorg-video-intel-dbg
xserver-xorg-video-modesetting-dbg
xserver-xorg-input-evdev-dbg
xserver-xorg-input-synaptics-dbg
libdrm2-dbg
libdrm-intel1-dbg

Then I have installed Maarten's script, made it executable, and linked it. After that I have restarted X via

sudo restart lightdm

X is much slower now, probably due to Valgrind's work.
Comment 57 Till Kamppeter 2013-04-25 14:49:25 UTC
First observation under Valgrind:

onboard pops up when touch-clicking an input field, but onboard is non-functional. Regardless of whether I touch-click the keys or use my external mouse, the keys do not react. No change of the key's color, no character appearing in the input field. Also right-clicking does not work as one cannot operate the right-click button.
Comment 58 Till Kamppeter 2013-04-25 14:59:21 UTC
Created attachment 78475 [details]
Xorg-valgrind.till-twist

My Valgrind log as of now.
Comment 59 Till Kamppeter 2013-04-25 15:07:45 UTC
Installed libunwind8-dbg to improve Valgrind log, then restarted lightdm, logged in, and now onboard works.
Comment 60 Till Kamppeter 2013-04-25 15:09:11 UTC
Created attachment 78476 [details]
Xorg-valgrind.till-twist

Update of Valgrind log.
Comment 61 Till Kamppeter 2013-04-25 15:31:59 UTC
I have more experience with the onboard-aided right click (the same whether running under Valgrind or without Valgrind):

Touch-clicking the right-click key on onboard makes it turn grey. After that, one touch click on the desktop background does nothing. A second touch click on the background makes the right-click menu open and onboard disappear.

Right-clicking in Chromium does not work. The second click only makes onboard disappear but does not pop up the right-click menu of Chromium.
Comment 62 Till Kamppeter 2013-04-25 15:34:20 UTC
Same with the double-click emulation button of onboard: It also executes the double-click only on the second touch click (tested with Nautilus).
Comment 63 Till Kamppeter 2013-04-25 17:22:39 UTC
Created attachment 78483 [details]
Xorg-valgrind.till-twist.gz

Finally I succeeded in making X crash again: I opened several programs (Firefox, Chromium, Thunderbird, Calculator, digikam), did some clicks in them, and closed them again. Then I opened LibreOffice Writer via the Launcher and got a window asking to recover a previous document which was not correctly closed. I rejected and when I answered the question whether I really want to reject with "Yes", X crashed.

Valgrind log attached.
Comment 64 Till Kamppeter 2013-04-25 19:28:29 UTC
Created attachment 78484 [details]
Xorg-valgrind.till-twist.gz

With LibreOffice Writer I can reproduce the crash reliably. Right after login I touch-click its icon in the Launcher, get the dialog to recover the document of the previous session, I reject, and as soon as I click "Yes" to confirm, X crashes, and X crashes fast enough so that LibreOffice does not clean up the document which I have rejected. In the next session I will get asked again.

If you cannot reproduce the crash as you do not have a broken document, try starting a new document and then "kill -9" LibreOffice. On the next session it should ask you for recovering your document.
Comment 65 Till Kamppeter 2013-04-25 19:31:45 UTC
Note: In the last two comments (and also in my other tests), I did all operations by touch clicking (if not otherwise stated).
Comment 66 Till Kamppeter 2013-04-25 19:39:10 UTC
Created attachment 78485 [details]
Xorg-valgrind.till-twist.gz

X crashes as well if I do the described steps with LibreOffice using my external Bluetooth mouse for all clicks and not the touch screen.

Valgrind log attached.
Comment 67 Daniel Drake 2013-04-26 20:10:33 UTC
(In reply to comment #47)
> I believe the touch-grab-race-condition-56578-v2 patch series so far creates
> a problem with mouse input. In Sugar's Paint application, I can't paint
> anything by moving the mouse around with the button held down.
> 
> Running xev, I can see that clicking and holding the mouse button doesn't
> actually trigger any events. Only when I release, ButtonPress and
> ButtonRelease appear in quick succession.

I have reproduced this by checking out the git branch in question and building it directly, so it was not a side effect of my earlier attempt (above) where I had backported this to 1.13.3.

The problem can be reproduced very easily: xinit /usr/bin/xev (running over ssh from another machine, to be able to see stdout)

Move the mouse cursor to the top left (where the xev window is). Click and hold the mouse button, and keep holding. No output from xev. Now release the mouse button, ButtonPress and ButtonRelease arrive at the same time. No touch input is needed to see this problem.

A few churns of "git bisect" later I have tracked this down to:

3e1515898545b0ed9e1f0794800c07061c8c8039 is the first bad commit
commit 3e1515898545b0ed9e1f0794800c07061c8c8039
Author: Peter Hutterer <peter.hutterer@who-t.net>
Date:   Thu Apr 18 10:32:11 2013 +1000

    dix: drop DeviceIntRec's activeGrab struct
Comment 68 Till Kamppeter 2013-04-30 11:25:22 UTC
Created attachment 78643 [details]
Xorg-valgrind.till-twist.gz

Another crash, this time I was visiting http://www.tagesspiegel.de/ with the Chrome browser. As usual, Valgrind log attached.
Comment 69 Till Kamppeter 2013-04-30 11:30:34 UTC
Created attachment 78644 [details]
Xorg-valgrind.till-twist.gz

Another crash: Still visiting http://m.tagesspiegel.org/, watching one of the videos, tried to maximize the Chrome window -> X crashed. Valgrind log attached again.
Comment 70 Peter Hutterer 2013-05-03 06:23:37 UTC
the libreoffice hint helped a lot tracking this down. New branch posted (top commit b8a2de82e36dd922843618f15703113dd556b164 dix: fix cursor refcounting). Please give this a test. looks like my test box here is happy and valgrind doesn't see any leaks (yet)
Comment 71 Maarten Lankhorst 2013-05-03 06:40:05 UTC
Created attachment 78801 [details]
nexus valgrind log for latest attempt

Still a bit buggy. On the nexus7 I can still cause it to drop events in the same way.

What I do is touch the ubuntu dash icon in upper left, then release finger and make a dragging motion with the dash icon. I'm not 100% sure if the touch was fully released, or it just stopped registering my finger. But this (still) results in the following spam from xserver:

[Xi] Virtual core pointer: Failed to get event 8 for touchpoint 1
[Xi] Virtual core pointer: Failed to get event 8 for touchpoint 1
[Xi] Virtual core pointer: Failed to get event 8 for touchpoint 2
source device 7: history size 100 overflowing for touch 12
(history size overflowing repeated a lot, for touch 12 and 13)

Stopping lightdm doesn't crash any more and shows no leak. Only thing that may or may not be relevant is a still reachable warning:

==3663== 16,384 bytes in 4 blocks are still reachable in loss record 245 of 246
==3663==    at 0x482D4B8: calloc (vg_replace_malloc.c:593)
==3663==    by 0x216F23: WriteToClient (io.c:1017)
==3663==    by 0x142667: WriteEventsToClient (events.c:5982)
==3663==    by 0x142747: TryClientEvents (events.c:1968)
==3663==    by 0x144905: DeliverEventToInputClients (events.c:2116)
==3663==    by 0x144A99: DeliverEventsToWindow (events.c:2151)
==3663==    by 0x144D51: ProcSendEvent (events.c:5411)
==3663==    by 0x13B9D5: Dispatch (dispatch.c:432)
==3663==    by 0x130D2F: main (main.c:295)

Full log for the session is attached as vg.nexus
Comment 72 Till Kamppeter 2013-05-03 10:13:33 UTC
Peter, I have tried your new snapshot (comment #70) and so far I did not get crashes. Touch operation without right-clicking works well for me now. The right-click emulation via Onboard is still broken, though.
Comment 73 Daniel Drake 2013-05-04 02:25:13 UTC
(In reply to comment #70)
> the libreoffice hint helped a lot tracking this down. New branch posted (top
> commit b8a2de82e36dd922843618f15703113dd556b164 dix: fix cursor refcounting
> ). Please give this a test. looks like my test box here is happy and
> valgrind doesn't see any leaks (yet)

I would like OLPC to help with this testing, but the xev problem in comment #67 is getting in our way. Have you had a chance to investigate this yet?
Comment 74 Peter Hutterer 2013-05-05 22:46:24 UTC
daniel - xev behaves normally for me in the last branch. is it still misbehaving for you?
Comment 75 Daniel Drake 2013-05-06 14:24:22 UTC
Yep, reproduced with HEAD b8a2de82e3, bisection identifies the first bad commit as 3e15158985.
Comment 76 Peter Hutterer 2013-05-08 01:01:25 UTC
tried to bisect this, but I can't see any difference in the xev output before or after that commit. Tested several revisions after (and 3e15158985) and xev works as expected.

fwiw, my test box here is Ubuntu 12.10 with the server branch above, rest as-is. mouse used is a trackpoint, which for all purposes looks like a mouse.

test case was xinit /usr/bin/xev -- /opt/xorg/bin/Xorg -retro, then clicking+dragging into the xev window. events as expected.
Comment 77 Peter Hutterer 2013-05-08 02:48:42 UTC
> ==3663== 16,384 bytes in 4 blocks are still reachable in loss record 245 of
> 246
> ==3663==    at 0x482D4B8: calloc (vg_replace_malloc.c:593)
> ==3663==    by 0x216F23: WriteToClient (io.c:1017)
> ==3663==    by 0x142667: WriteEventsToClient (events.c:5982)
> ==3663==    by 0x142747: TryClientEvents (events.c:1968)
> ==3663==    by 0x144905: DeliverEventToInputClients (events.c:2116)
> ==3663==    by 0x144A99: DeliverEventsToWindow (events.c:2151)
> ==3663==    by 0x144D51: ProcSendEvent (events.c:5411)
> ==3663==    by 0x13B9D5: Dispatch (dispatch.c:432)
> ==3663==    by 0x130D2F: main (main.c:295)

This appears to be present in 1.14.0, not introduced by this series.

I raise a white flag on the other issue though: like the bug Daniel sees, I cannot reproduce it here.
Comment 78 Daniel Drake 2013-05-08 22:00:17 UTC
(In reply to comment #76)
> tried to bisect this, but I can't see any difference in the xev output
> before or after that commit. Tested several revisions after (and 3e15158985)
> and xev works as expected.

Thanks for testing - I have now looked closer.

The patch removes a field from struct _GrabInfoRec. That is an ABI change, what does it affect? It does seem to break stuff outside of the xserver according to my initial test.

If I readd the field, even though it is now unused, xev works again.
Comment 79 Peter Hutterer 2013-05-08 22:21:54 UTC
oh, right. sorry, I forgot to mention this - it is indeed an ABI break so you have to recompile the drivers (or add the now-unused field back in). Maarten, this could also be the reason for your bug?
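
[Editor's illustration] Why removing a struct field breaks already-compiled drivers can be shown with a small standalone C sketch. The struct and field names below are hypothetical stand-ins, not the real _GrabInfoRec definition; the only point is that dropping a member shifts the offsets of everything after it, while keeping the member as unused padding (the workaround from comment #78) leaves the offsets unchanged.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical "old" layout: what an already-compiled driver expects. */
struct grab_info_old {
    void *grab;          /* pointer to the current grab */
    int   active_grab;   /* stand-in for the field the commit dropped */
    int   from_passive;  /* some field that lives after it */
};

/* Hypothetical "new" layout: the field is removed, only the server is rebuilt. */
struct grab_info_new {
    void *grab;
    int   from_passive;
};

/* Hypothetical ABI-compatible layout: the field stays as unused padding. */
struct grab_info_compat {
    void *grab;
    int   unused_pad;    /* kept only so later members do not move */
    int   from_passive;
};

/* Returns 1 if from_passive moved between the old and new layouts. */
int from_passive_offset_changed(void)
{
    return offsetof(struct grab_info_old, from_passive) !=
           offsetof(struct grab_info_new, from_passive);
}

/* Returns 1 if the padded layout keeps from_passive where old code expects it. */
int compat_layout_matches_old(void)
{
    return offsetof(struct grab_info_old, from_passive) ==
           offsetof(struct grab_info_compat, from_passive);
}

/* Prints the three offsets so the shift is visible. */
void print_layouts(void)
{
    printf("old:    from_passive at %zu\n",
           offsetof(struct grab_info_old, from_passive));
    printf("new:    from_passive at %zu\n",
           offsetof(struct grab_info_new, from_passive));
    printf("compat: from_passive at %zu\n",
           offsetof(struct grab_info_compat, from_passive));
}
```

Calling print_layouts() from a small main() shows the offset of from_passive differing between the old and new layouts and matching between old and compat. A driver built against the old header would read the wrong bytes from a server using the new layout, which is consistent with the symptom disappearing once the field is re-added or the drivers are recompiled.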
Comment 80 Peter Hutterer 2013-05-08 23:22:23 UTC
Pushed the branch with a fix to keep the ABI, please test de12ce91d8e44ab9398e730b457e5abc8d1acbe6
Comment 81 Timo Aaltonen 2013-05-09 05:25:08 UTC
I built it, and changing between dash and indicators soon hangs with this on the log:

[  3110.957] (EE) BUG: triggered 'if (!pGrab)'
[  3110.957] (EE) BUG: ../../dix/grabs.c:258 in FreeGrab()
[  3110.957] (EE) 
[  3110.957] (EE) Backtrace:
[  3110.957] (EE) 

gdb doesn't give anything, just the usual WaitForSomething etc
Comment 82 Daniel Drake 2013-05-09 17:50:29 UTC
(In reply to comment #80)
> Pushed the branch with a fix to keep the ABI, please test
> de12ce91d8e44ab9398e730b457e5abc8d1acbe6

Built this and can't see any problems after a quick test. I'll ship this in upcoming OLPC development builds for wider testing.
Comment 83 paul 2013-05-10 10:19:03 UTC
I have a lenovo S10-3t with full keyboard, synaptics touchpad and cando 2 touch screen that I'd like to try this on. I have ubuntu 13.04 on it. What are the git commands to access de12ce91d8e44ab9398e730b457e5abc8d1acbe6 and does it just replace the xserver-xorg or do I have to rebuild the other xorg parts too?
Comment 84 paul 2013-05-10 18:03:58 UTC
Sorry, I found the files on the pages referenced above, so don't need any reply.
Comment 85 Maarten Lankhorst 2013-05-13 09:17:18 UTC
Is make check failing for anyone else with v3?

(EE) test device: not enough space for touch events (max 5 touchpoints). Dropping this event.
(EE) test device: not enough space for touch events (max 5 touchpoints). Dropping this event.
(EE) test device: not enough space for touch events (max 5 touchpoints). Dropping this event.
/bin/bash: line 5: 26164 Segmentation fault      MALLOC_PERTURB_=15 ${dir}$tst
FAIL: touch

Program received signal SIGSEGV, Segmentation fault.
TouchInitTouchPoint (t=t@entry=0x4196e950, v=0x0, index=index@entry=0) at ../../dix/touch.c:243
243         ti->valuators = valuator_mask_new(v->numAxes);
Comment 86 paul 2013-05-15 00:05:03 UTC
I still need help compiling the test branch of xserver on ubuntu 13.04. If I try to compile it, detailed here http://www.x.org/wiki/CompileXserverManually it fails with complaints of wrong versions of x11proto. But, I have verified that the correct packages are actually installed on my system. So, I tried using jhbuild which builds everything in your home directory, details here http://www.x.org/wiki/JhBuildInstructions But, when I launch the jhbuild version, it crashes because it doesn't include my synaptics touchpad or cando touch screen. It won't run without input devices. So, can you provide some insight as to how I can build and test this xserver? Since some of you are using ubuntu, perhaps more specific instructions would work for me.

TIA
Comment 87 Peter Hutterer 2013-05-15 00:07:50 UTC
(In reply to comment #85)
> Is make check failing for anyone else with v3?

caused by a patch merged into master (and thus picked up on v3), fix is here:
http://patchwork.freedesktop.org/patch/13687/
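
[Editor's illustration] The backtrace in comment #85 shows TouchInitTouchPoint being called with v=0x0 and then crashing on v->numAxes. The sketch below is not the patch linked above and does not use the server's real types; touch_point, valuator_class, touch_point_init and touch_point_free are simplified, hypothetical stand-ins. It only shows the general shape of the failure (dereferencing a NULL class pointer) and one defensive way to turn it into an error return instead of a segfault.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the server's structures (not the real definitions). */
struct valuator_class {
    int num_axes;
};

struct touch_point {
    int *valuators;      /* one slot per axis */
    int  num_valuators;
};

/*
 * Initialise a touch point from a valuator class.
 * Returns false instead of crashing when the device has no valuator
 * class (v == NULL), when the axis count is not positive, or when
 * allocation fails.
 */
bool touch_point_init(struct touch_point *tp, const struct valuator_class *v)
{
    if (tp == NULL || v == NULL)
        return false;            /* the unguarded v->num_axes is the crash site */
    if (v->num_axes <= 0)
        return false;

    tp->valuators = calloc((size_t)v->num_axes, sizeof *tp->valuators);
    if (tp->valuators == NULL)
        return false;

    tp->num_valuators = v->num_axes;
    return true;
}

/* Release what touch_point_init allocated and reset the fields. */
void touch_point_free(struct touch_point *tp)
{
    if (tp == NULL)
        return;
    free(tp->valuators);
    tp->valuators = NULL;
    tp->num_valuators = 0;
}
```

Whether the real fix guards the pointer or makes sure the test device always has a valuator class is not stated in this thread; the sketch takes the guarding approach only because it is the simplest to show in isolation.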
Comment 88 Maarten Lankhorst 2013-05-15 06:43:36 UTC
paul:

add-apt-repository ppa:canonical-x/x-staging
apt-get update
apt-get dist-upgrade
apt-get build-dep xorg-server

will get you 1.14 + necessary build dependencies. Copy the debian directory from xserver 1.14, and comment out each patch that fails to apply in debian/patches/series
Comment 89 John Faulkner 2013-05-15 11:54:15 UTC
(In reply to comment #88)

Hello.

I would like to provide testing for this bug if possible but I'm not exactly clued up on compiling xorg-server from scratch. I figure it could be useful to have a non-standard (i.e. not a laptop or tablet device) low-end hardware test case but if it's unlikely to be useful then please let me know. 

One thing I've noticed is that once this bug has triggered (rendering most GTK applications and the unity dash unusable), Nautilus continues to function normally with the touch screen. Can anyone else confirm this on a standard Ubuntu 13.04 installation?

Anyway, I can see the branch you're talking about and can clone the git repository no problem.

> Copy the debian directory from xserver 1.14, and comment out each patch that fails to apply in debian/patches/series

I'm not certain which directory / patches you're referring to here, could you point me in the right direction?

I can duplicate this bug every time with a custom application which uses a GtkToolPalette. It appears to trigger every time I tap a category which produces a smooth roll-out animation - the hardware is pretty low end so I suppose this additional load triggers a race condition? I can trigger the bug in other normal uses but this one is guaranteed every time.
Comment 90 Maarten Lankhorst 2013-05-15 12:30:28 UTC
grab http://people.canonical.com/~mlankhorst/xorg-server_1.14.1.orig.tar.gz
and xorg-server_1.14.1-0ubuntu0.3+1.15rc1+touch.diff.gz
Comment 91 John Faulkner 2013-05-15 14:07:53 UTC
(In reply to comment #90)

Thank you, Maarten. I can patch and compile that copy but for some reason I'm getting a compilation error with the de12ce91d8e44ab9398e730b457e5abc8d1acbe6 branch in /dix/window.c line 421-425:

> REGION_INIT(pScreen, &pWin->clipList, &box, 1);
> REGION_INIT(pScreen, &pWin->winSize, &box, 1);
> REGION_INIT(pScreen, &pWin->borderSize, &box, 1);
> REGION_INIT(pScreen, &pWin->borderClip, &box, 1);


> window.c:421:5: error: the comparison will always evaluate as ‘true’ for the
> address of ‘box’ will never be NULL [-Werror=address]

Any ideas?
Comment 92 Peter Hutterer 2013-05-15 22:01:45 UTC
sorry guys, please take the compilation errors to the list. This bug is confusing enough with >90 comments and I'd like to keep off-topic stuff to a minimum.

pushed a new version of the branch after fixing a cursor refcounting issue that crashed my server when dragging an email in thunderbird. new branch tip is 9a5ad65330693b3273972b63d10f2907d9ab954a. This one also includes the fix Daniel wrote originally to avoid stuck buttons (http://lists.x.org/archives/xorg-devel/2013-April/035878.html)
Comment 93 Maarten Lankhorst 2013-05-16 09:35:36 UTC
That fixed up the background corruptions and hangs on armhf/nexus 7, but I'm still seeing a stuck mouse button, and [ 77305.765] [Xi] Virtual core pointer: Failed to get event 8 for touchpoint 1.
Comment 94 Maarten Lankhorst 2013-05-16 09:39:28 UTC
nm, bg is still corrupt when running in valgrind :(
Comment 95 Peter Hutterer 2013-05-31 05:20:41 UTC
fwiw, the latest branch got merged into master. It's still buggy but an improvement over the previous state.

commit c76a1b343d6a56aa9529e87f0eda8d61355d562b
Merge: 891123c 9a5ad65
Author: Keith Packard <keithp@keithp.com>
Date:   Thu May 23 19:58:36 2013 -0600

    Merge remote-tracking branch 'whot/touch-grab-race-condition-56578-v3'
Comment 96 Daniel Drake 2013-05-31 20:00:56 UTC
Thanks for all your work on this. At OLPC we've been testing the branch but have been a couple of commits behind the tip. Anyway, I think it's still worth contributing the test result: no problems seen.
Comment 97 Jesse Renyard 2013-06-07 19:00:46 UTC
I would like to be any help I can with this bug fix. I am able to test on an 18.5" Winmate M185D as well as a 10" Winmate device (W10ID3S-PCH1). I am currently running Unity 13.04 and can make any necessary changes to the system. Please let me know what I can do to test and how to do it. I feel a bit over my head, but am willing to learn in order to be helpful.
Comment 98 Cody Swanson 2013-07-17 04:41:59 UTC
Wondering if anything has been happening in a while...
Comment 99 Daniel Drake 2013-07-17 13:05:16 UTC
Peter fixed a load of stuff and it got merged in xserver master. Unfortunately there have not been any development releases of xserver master since that happened, but that will come in time.

If you are still seeing problems, and are definitely using xserver master, then I suggest explaining your problem here (if you are sure that you are seeing the same issue), or opening a new bug report (if it seems like your issue might be unrelated).
Comment 100 Maarten Lankhorst 2013-07-17 13:27:42 UTC
The fix for bug #66720 looks relevant, commit 8eeaa74bc241acb41f1d upstream, it seems something broke for me though, so I can't test it right now.
Comment 101 Maarten Lankhorst 2013-07-17 15:54:15 UTC
Nope, and I noticed a BUG on !pGrab in FreeGrab, I'll try it a bit more on monday.
Comment 102 Cody Swanson 2013-07-17 17:43:04 UTC
(In reply to comment #99)
> Peter fixed a load of stuff and it got merged in xserver master.
> Unfortunately there have not been any development releases of xserver master
> since that happened, but that will come in time.
> 
> If you are still seeing problems, and are definitely using xserver master,
> then I suggest explaining your problem here (if you are sure that you are
> seeing the same issue), or opening a new bug report (if it seems like your
> issue might be unrelated).

Thanks for the summary!  I was just wondering.  Thanks!
Comment 103 Peter Hutterer 2013-07-17 21:59:21 UTC
(In reply to comment #101)
> Nope, and I noticed a BUG on !pGrab in FreeGrab, I'll try it a bit more on
> monday.

merged as 0e3be0b25fcfeff386bad132526352c2e45f1932 yesterday.


as for the rest, I really need something that's reproducible.
Comment 104 Maarten Lankhorst 2013-07-25 11:42:21 UTC
I think the changes to onboard to use xinput2 directly may have fixed the remaining issue I was having. When I checked out onboard from trunk and used it on my nexus7 things worked, and nothing got stuck.
Comment 105 Till Kamppeter 2013-07-29 10:23:50 UTC
Maarten, I have checked with the new onboard ("bzr branch lp:onboard") now on an up-to-date Saucy with xserver packages from the x-staging PPA and I do not get a stuck-mouse-button effect any more.
Comment 106 Till Kamppeter 2013-08-10 13:16:51 UTC
I did further testing over a longer time and saw no stuck button. It seems that with the current X from the x-staging PPA and the current Onboard from the onboard PPA the problem is solved.
Comment 107 Peter Hutterer 2013-08-12 21:13:37 UTC
Thanks for testing. I'm going to close this one as fixed since we definitely fixed quite a few bugs in this patch set. If there's something left please file a new bug so we can narrow down the new (old? :) issues.

