Bug 19086

Summary: Enter/Leave model sends wrong events
Product: xorg Reporter: Ari Entlich <lmage11>
Component: Server/Input/CoreAssignee: Peter Hutterer <peter.hutterer>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: brian, colin, daniel, keithp, lmage11, me, peter.saaf
Version: git   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 17452    
Attachments:
Description Flags
0001-dix-don-t-set-the-child-window-for-non-virtual-Ente.patch
none
0001-dix-re-implement-enter-leave-model.patch none

Description Ari Entlich 2008-12-14 19:39:09 UTC
Sorry for the terrible summary, but I really don't know how else to describe this problem. It's manifesting in two ways that I can see:

1. In Firefox, sometimes moving the mouse over a link only changes the cursor for the fraction of a second in which the cursor is actually moving. When the cursor is not moving and it's on a link, it just uses the normal cursor image. This doesn't always happen, and I don't know what triggers it.

2. My fluxbox toolbar (which is set up to auto-hide) only shows up if the mouse is all the way at the bottom. If I move the mouse to the bottom so that the toolbar shows up and then move it up a little so as to click something on the toolbar, the toolbar disappears. This seems to always happen. If I move my mouse to a certain place (very close to the bottom of the screen, but not right at the bottom), I can get the toolbar to rapidly show and hide itself without moving the cursor.

I am using server 1.6 rc3. Neither of these problems existed when I was using 1.5.x. I do not actually have any proof other than that principle (can't remember who it's named after) that says that many problems are usually caused by just a few coding errors. And also they both have to do with input.
Comment 1 Eric Anholt 2008-12-14 19:41:25 UTC
I see 1) as well, and have been for about 2-3 weeks I'd say.  Meant to bisect, but I suck.
Comment 2 Ari Entlich 2008-12-14 19:45:16 UTC
Yeah, I suck too. :-/

I wish it was possible to integrate git-bisect with portage. I'd be much more willing to do it. As it is, however, if you want a bisect from me I think you'll have to wait until I have some free time to figure out how the hell I would even do a bisect without overriding portage in an overly damaging way.
Comment 3 Ari Entlich 2008-12-14 19:52:16 UTC
Errm, I meant "I don't actually have any proof that these two problems are related other than that principle (can't remember who it's named after) that says that many problems are usually caused by just a few coding errors". Sorry for the noise.
Comment 4 Peter Hutterer 2008-12-14 22:00:00 UTC
On Sun, Dec 14, 2008 at 07:39:10PM -0800, bugzilla-daemon@freedesktop.org wrote:
> 1. In Firefox, sometimes moving the mouse over a link only changes the cursor
> for the fraction of a second in which the cursor is actually moving. When the
> cursor is not moving and it's on a link, it just uses the normal cursor image.
> This doesn't always happen, and I don't know what triggers it.

verified, reproducable. running firefox through xmond reveals that it's
usually some order of:

<- MotionEvent [event=0xABCD]
-> ChangeWindowAttributes [link cursor]
-> TranslateCoordinates [0xABCD, 0, 0]
some more of the TC request, then some create pixmap/copy area/free pixmap 
-> ChangeWindowAttributes [normal cursor]
-> TranslateCoordinates [0xABCD, 0, 0]

I have no idea why or what that is supposed to do. First one to bisect wins.
Comment 5 Peter Hutterer 2008-12-14 23:15:10 UTC
Still bisecting, last BAD version tested:

commit f3edc1fb0210149f35eab4e413700b5c4ac48214
Author: Keith Packard <keithp@keithp.com>
Date:   Tue Nov 25 23:15:35 2008 -0800

    New version of dolt

I haven't found a good version yet. So if you want to start bisecting, start
with a commit earlier than that.
Comment 6 Peter Hutterer 2008-12-15 16:40:44 UTC
Seems to be caused by the new enter/leave model added with the patches 5e48f5e2dd2dec7cfd1fa40b61e25123dfca515e through to 3e6da1636093d7dc98baac40544c0b0fb7fd8aec.

Making this a 1.6 blocker.
Comment 7 Peter Hutterer 2008-12-15 19:31:44 UTC
Created attachment 21191 [details] [review]
0001-dix-don-t-set-the-child-window-for-non-virtual-Ente.patch

This patch should fix the issue. Problem was a wrong implementation of the protocol spec. The thing that triggered it was a LeaveNotify event with detail NotifyInferior that had the child window set to a value other than None. This again triggered something internal in firefox and it would keep changing the cursor window back to the normal one instead of the link cursor.

Reproduceble testcase: open firefox, move into main window, then move up into menu bar, back into main window and hover over a link. The cursor immediately changes back to the normal cursor.

The patch sets the "child" field to None for all events with NotifyInferior. Only NotifyVirtual may have the "child" field set to a window.
Comment 8 Ari Entlich 2008-12-15 20:05:57 UTC
From the (admittedly limited) testing I've done with this patch, I think I can confirm that it fixes both of my issues. Thanks!
Comment 9 Keith Packard 2008-12-15 22:47:55 UTC
Reading through the protocol spec reminds me that all LeaveNotify events are generated before any EnterNotify events for the same action. The comments in enterleave.c don't make it clear if this holds for the new model.
Comment 10 Peter Hutterer 2008-12-15 22:55:08 UTC
It holds for the new model, but only in a localised fashion. The extreme case
for a nonlinear move from A-B may cause three different Enter/Leave runs.
The pointer simultaneously appears to move from A to child(A), Ancestor(A,B)
to Parent(B) and child(B) to B.

For each of the three, the LeaveNotifies are sent before the EnterNotifies.
Comment 11 Peter Hutterer 2008-12-16 20:33:39 UTC
See also: http://lists.freedesktop.org/archives/xorg/2008-December/041559.html
Comment 12 Peter Hutterer 2008-12-22 03:24:14 UTC
Changing title to describe the bug better.
Comment 13 Colin Guthrie 2008-12-26 07:12:16 UTC
ACK on the patch fixing things for me.
Comment 14 Ari Entlich 2008-12-29 15:51:55 UTC
Is a fix for this going to be committed anytime soon...?
Comment 15 Peter Hutterer 2008-12-30 15:41:59 UTC
(In reply to comment #14)
> Is a fix for this going to be committed anytime soon...?

We're still waiting for the focus model to be implemented fully. It's mostly there, but I'm missing a few corner cases. This patch is just a quickfix that doesn't fix the root cause of the problem.
Comment 16 Eric Appleman 2009-01-04 10:29:08 UTC
This bug also affects the autocomplete function of the Firefox address bar.
Comment 17 Peter Hutterer 2009-01-04 20:31:30 UTC
Created attachment 21665 [details] [review]
0001-dix-re-implement-enter-leave-model.patch

This is the reimplemented enter/leave model that fixes this bug. I haven't found any issues with it yet, so please give this a good test and report back. I intend to push this ASAP.
Comment 18 Colin Guthrie 2009-01-05 01:36:54 UTC
I presume this patch is for master rather than the 1.6 branch? It certainly doesn't apply cleanly on my 1.6 clone. Should this be ported to 1.6 for inclusion there? (I need the previous patch so I presume this is to be the case?) Should I just try and merge it manually or should I also cherry-pick the master changes to enterleave?

Essentially this patch does not apply to 1.6. due to the changes in:

commit aba1cbaadcde50a7a25f8aee06b66eec67a9145e
Author: Peter Hutterer <peter.hutterer@redhat.com>
Date:   Fri Nov 28 09:19:49 2008 +1000

    dix: No DeviceEnterLeave events in server 1.6

Reverting the enterleave.c changes of this commit makes this patch apply cleanly, but the whole commit itself does not revert cleanly (and I have no idea of the consequences of reverting it anyway!!).

Thanks for any info you can supply to allow me to test this further :)
Comment 19 Peter Hutterer 2009-01-05 03:20:00 UTC
> I presume this patch is for master rather than the 1.6 branch? 

correct. I haven't started the backport to 1.6 yet, but it should be 'trivial'.
Basically anything that does DeviceEnterLeaveEvents needs to be removed.
You're probably best of applying the patch to master and then cherry-picking
it over, git-am will refuse to apply it.
Comment 20 Peter Hutterer 2009-01-07 20:52:52 UTC
Pushed to master, see the patches leading up to http://cgit.freedesktop.org/xorg/xserver/commit?id=eb2d7b3d700952ba88c77deacf687b251300e660

Comment 21 Eric Appleman 2009-01-10 12:52:30 UTC
I think I may have found a related bug regarding autocomplete:

https://bugs.launchpad.net/ubuntu/+source/firefox/+bug/315907
Comment 22 Colin Guthrie 2009-01-10 13:09:48 UTC
(In reply to comment #21)
> I think I may have found a related bug regarding autocomplete:
> 
> https://bugs.launchpad.net/ubuntu/+source/firefox/+bug/315907
> 

While the description is a bit light and I'm not 100% sure I'd doing it right, I cannot seem to replicate this issue using the xserver-1.6-enterleave branch. I startup firefox and I can use the bar to search in my history without any problems.
Comment 23 Peter Hutterer 2009-01-11 23:03:32 UTC
Patches available for 1.6 now, see
http://lists.freedesktop.org/archives/xorg/2009-January/042308.html

And I'm so bold to claim this is fixed now.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.