Bug 24986 - X server crashes with Xinerama and "LeftOf" option
Status: RESOLVED FIXED
Product: xorg
Classification: Unclassified
Component: Server/General
Version: unspecified
Hardware: Other All
Importance: medium normal
Assigned To: Peter Hutterer
QA Contact: Xorg Project Team
Keywords: patch, regression
Reported: 2009-11-08 07:13 UTC by Jochen Keil
Modified: 2010-05-13 17:43 UTC (History)

Attachments
Patch for issue (517 bytes, patch)
2009-12-29 10:58 UTC, Tim Yamin
0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch (1.08 KB, patch)
2010-03-07 22:34 UTC, Peter Hutterer
0001-mi-ignore-fromDIX-argument-in-mieqSwitchScreen.patch (2.29 KB, patch)
2010-03-09 21:13 UTC, Peter Hutterer
0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch (1.13 KB, patch)
2010-03-10 23:02 UTC, Tim Yamin
dix: make DeviceEvent coordinates signed for Xinerama (1.46 KB, patch)
2010-05-02 23:48 UTC, Chris Humbert

Description Jochen Keil 2009-11-08 07:13:19 UTC
This happens with xorg-server > 1.7.1-1 when Xinerama is enabled and a screen has its position set with LeftOf, or with absolute coordinates that are not in ascending order, e.g.

Screen 0 "Screen0" 0 0
Screen 1 "Screen1" LeftOf "Screen0"

or

Screen 0 "Screen0" 1200 0
Screen 1 "Screen1" 0 0

Moving the mouse to Screen1 makes the cursor jump around, and after a while X crashes.
Downgrading to this X server helps:

X.Org X Server 1.6.3.901 (1.6.4 RC 1)
Release Date: 2009-8-25

# uname -a
Linux monolith 2.6.31-ARCH #1 SMP PREEMPT Fri Oct 23 10:03:24 CEST 2009 x86_64 Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux

I don't think this is nvidia-, platform-, or distribution-specific, since it only happens with Xinerama enabled. With Xinerama disabled, screen offsets like the ones above work.

You might find some more information here:
http://www.nvnews.net/vbulletin/showthread.php?p=2119559#post2119559
Comment 1 Lewis Cawthorne 2009-11-18 16:54:10 UTC
The person that submitted this report seems to have covered everything fine.  I am one of the participants in the nvidia forum post he linked to.  

I would like to add that I was able to recreate the issue with both the nvidia and Nouveau drivers.  

I am currently dealing with this simply by masking package updates to xorg and using xorg 1.6.3.901 on my main tower.
Comment 2 Jason Frisvold 2009-12-22 13:17:16 UTC
I am also experiencing this problem.  I was also able to reproduce this by placing my second monitor above or below the primary.
Comment 3 Jason Frisvold 2009-12-22 13:36:21 UTC
Did some quick testing.  This also exhibits the same behavior on my system:

    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" Above "Screen0"

or

    Screen      1  "Screen1" 0 0
    Screen      0  "Screen0" Below "Screen1"


Disabling Xinerama removes the symptoms.
Comment 4 Tim Yamin 2009-12-29 10:58:04 UTC
Created attachment 32354
Patch for issue

Here's a description of what happens for me along with a patch that seems to fix my problem... My Xinerama setup is as follows:

    Screen      0  "Screen0"
    Screen      1  "Screen0 (2nd)" RightOf "Screen0"
    Screen      2  "Screen1" Below "Screen0"
    Screen      3  "Screen1 (2nd)" RightOf "Screen1"
    Screen      4  "Screen2" Below "Screen1"

Occasionally, X locks up during normal use and you see the mouse cursor "jumping around" the different screens. X is in a tight 100% CPU loop and you can't do anything other than kill it. The easiest way to reproduce the bug in about 30 seconds is to drag a window around between the screens like crazy...

The problem seems to be that there's some sort of nasty feedback loop that causes events to be fed into the event queue while the event queue is being processed:

(gdb) bt
#0  mieqEnqueue (pDev=0x85441c0, e=0x85787a8) at mieq.c:148
#1  0x0809dfc0 in miPointerWarpCursor (pDev=0x85441c0, pScreen=0x82d1a10, x=241, y=7) at mipointer.c:587
#2  0x08154faa in xf86WarpCursor (pDev=0x85441c0, pScreen=0x82d1a10, x=241, y=7) at xf86Cursor.c:473
#3  0x0809dcc3 in miPointerSetCursorPosition (pDev=0x85441c0, pScreen=0x82d1a10, x=241, y=7, generateEvent=1) at mipointer.c:239
#4  0x08109e21 in AnimCurSetCursorPosition (pDev=0x85441c0, pScreen=0x82d1a10, x=241, y=7, generateEvent=1) at animcur.c:266
#5  0x0807a4bc in XineramaSetCursorPosition (pDev=0x85441c0, x=<value optimized out>, y=<value optimized out>, generateEvent=1) at events.c:555
#6  0x0807a72d in CheckPhysLimits (pDev=0x85441c0, cursor=<value optimized out>, generateEvents=1, confineToScreen=0, pScreen=0x0) at events.c:772
#7  0x0807a8e5 in XineramaConfineCursorToWindow (pDev=0x85441c0, pWin=0x83359e8, generateEvents=1) at events.c:650
#8  0x0807ab9e in NewCurrentScreen (pDev=0x85441c0, newScreen=0x82d1a10, x=241, y=7) at events.c:3167
#9  0x080dea32 in mieqProcessDeviceEvent (dev=0x85441c0, event=0x85795b8, screen=0x82d1a10) at mieq.c:388
#10 0x080deb7f in mieqProcessInputEvents () at mieq.c:471
#11 0x080b7cff in ProcessInputEvents () at xf86Events.c:165
#12 0x0809756a in Dispatch () at dispatch.c:407
#13 0x08064d8d in main () at main.c:285
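
The shape of that loop can be sketched in isolation. This is only an illustration under assumptions: `EventQueue`, `drain`, and both handlers are hypothetical names, not the server's mieq code. It shows that a handler which enqueues a new event for every event it processes keeps the queue from ever draining, so only an artificial step limit ends the loop.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_SIZE 8

/* Hypothetical miniature event queue; not the server's mieq structures. */
typedef struct {
    int events[QUEUE_SIZE];
    size_t head;   /* index of the next event to process */
    size_t count;  /* number of pending events */
} EventQueue;

typedef void (*Handler)(EventQueue *q, int event);

/* Append an event; returns 0 if the queue is full, 1 otherwise. */
static int enqueue(EventQueue *q, int event)
{
    assert(q != NULL);
    if (q->count == QUEUE_SIZE)
        return 0;
    q->events[(q->head + q->count) % QUEUE_SIZE] = event;
    q->count++;
    return 1;
}

/* Process events until the queue is empty or max_steps is reached.
 * Returns the number of events handled. */
static size_t drain(EventQueue *q, Handler handle, size_t max_steps)
{
    size_t steps = 0;
    while (q->count > 0 && steps < max_steps) {
        int event = q->events[q->head];
        q->head = (q->head + 1) % QUEUE_SIZE;
        q->count--;
        handle(q, event);   /* the handler may enqueue again */
        steps++;
    }
    return steps;
}

/* Well-behaved handler: consumes the event and adds nothing. */
static void handle_plain(EventQueue *q, int event)
{
    (void)q;
    (void)event;
}

/* Misbehaving handler: every event causes a "warp" that queues another. */
static void handle_with_warp(EventQueue *q, int event)
{
    enqueue(q, event + 1);
}

/* Seed one event and report how many steps drain() used. */
static size_t steps_used(Handler handle, size_t max_steps)
{
    EventQueue q = { {0}, 0, 0 };
    enqueue(&q, 1);
    return drain(&q, handle, max_steps);
}
```

With `handle_plain` a single seeded event is handled in one step; with `handle_with_warp` the loop runs until the step limit, which is the toy equivalent of the 100% CPU loop.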

I compared the code in 1.6.5 (which doesn't have this issue) against 1.7.3.902 (which does). It looks like the problem is a regression introduced when XineramaCheckPhysLimits and CheckPhysLimits were merged into one single function:

== OLD (dix/events.c; XineramaCheckPhysLimits used for Xinerama)

    if((new.x != pSprite->hotPhys.x) || (new.y != pSprite->hotPhys.y))
    {
    ...
    }

== OLD (dix/events.c; CheckPhysLimits used for non-Xinerama)

    if ((pScreen != pSprite->hotPhys.pScreen) ||
        (new.x != pSprite->hotPhys.x) || (new.y != pSprite->hotPhys.y))
    {
    ...
    }

== NEW (dix/events.c; merged CheckPhysLimits used for Xinerama & non-Xinerama)

    if ((pScreen != pSprite->hotPhys.pScreen) ||
        (new.x != pSprite->hotPhys.x) || (new.y != pSprite->hotPhys.y))
    {
    ...
    }

This patch reverts to the original behaviour (the pScreen check should only do something for the non-Xinerama code path), and this fixes the problem for me. If you're CC'd on this bug, can you please test this patch?
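
The difference between the two conditions can be shown with a small standalone sketch. It is not the attached patch: `HotSpot`, `needs_update_merged`, and `needs_update_gated` are hypothetical names, and the sprite state is reduced to a screen index plus coordinates. The point is only that, with Xinerama on, a change of screen with unchanged coordinates no longer counts as a reason to reposition the cursor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sprite's physical hot spot;
 * not the server's actual structures. */
typedef struct {
    int screen;  /* index of the screen the hot spot is on */
    int x, y;    /* hot spot position */
} HotSpot;

/* Merged check as quoted above: a screen change alone forces an update. */
static bool needs_update_merged(const HotSpot *cur, const HotSpot *new_pos)
{
    return new_pos->screen != cur->screen ||
           new_pos->x != cur->x || new_pos->y != cur->y;
}

/* Same check with the screen comparison limited to the non-Xinerama path. */
static bool needs_update_gated(bool xinerama, const HotSpot *cur,
                               const HotSpot *new_pos)
{
    bool screen_changed = !xinerama && new_pos->screen != cur->screen;
    return screen_changed ||
           new_pos->x != cur->x || new_pos->y != cur->y;
}
```

For a hot spot at (241, 7) that is reported on a different screen with the same coordinates, the merged check asks for an update while the gated check with Xinerama enabled does not.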

Cheers,

Tim
Comment 5 Julien Cristau 2009-12-29 11:24:59 UTC
the de-duplication of CheckPhysLimits was commit 942eae6868b8b0f343b6aa921ddf77e8bb70798a.  Assigning to Peter.
Comment 6 Jochen Keil 2009-12-30 13:07:29 UTC
Hi,

I built xorg-xserver-1.7.3-902 with the above patch applied.
X doesn't crash anymore but now it stays in "uninterruptible sleep", i.e. it's not usable/killable anymore.
Please tell me if you need additional information on this issue (like gdb or a bisect or so..)

Dec 30 21:41:01 monolith kernel: INFO: task Xorg:4594 blocked for more than 120 seconds.
Dec 30 21:41:01 monolith kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 30 21:41:01 monolith kernel: Xorg          D 0000000000000000     0  4594   4574 0x00400004
Dec 30 21:41:01 monolith kernel: ffff88014bf2f860 0000000000000086 0000000000000000 ffff88014e9b143f
Dec 30 21:41:01 monolith kernel: ffffffff814177f9 ffffffff814177f7 ffffffff81494880 ffff88014a990000
Dec 30 21:41:01 monolith kernel: ffff88014bf2fb00 000000000000f888 ffff88014bf2fb00 ffff88014a990000
Dec 30 21:41:01 monolith kernel: Call Trace:
Dec 30 21:41:01 monolith kernel: [<ffffffff8124ebd3>] ? vga_get+0x113/0x170
Dec 30 21:41:01 monolith kernel: [<ffffffff8104c290>] ? default_wake_function+0x0/0x20
Dec 30 21:41:01 monolith kernel: [<ffffffff8124ee60>] ? vga_arb_write+0x230/0x510
Dec 30 21:41:01 monolith kernel: [<ffffffff8110f9d8>] ? vfs_write+0xb8/0x1a0
Dec 30 21:41:01 monolith kernel: [<ffffffff8110fbae>] ? sys_write+0x4e/0x90
Dec 30 21:41:01 monolith kernel: [<ffffffff81012082>] ? system_call_fastpath+0x16/0x1b
Dec 30 21:43:01 monolith kernel: INFO: task Xorg:4594 blocked for more than 120 seconds.
Comment 7 Tim Yamin 2009-12-30 13:21:41 UTC
Jochen: hmm, that's a different issue. I'm running 2.6.31 so no VGA arbiter enabled here and I don't see this. See this URL for a patch which should resolve it for you: http://www.nvnews.net/vbulletin/showthread.php?t=142656
Comment 8 Linus Arver 2010-01-04 00:37:10 UTC
Hi, I've applied Tim's patch (on Arch Linux's xorg server 1.7.3.902-1 package, using Xinerama), but with even stranger results. I have two 1680x1050 widescreens.

My setup is:
                 S
                 c
                 r
Screen1          e
                 e
                 n
                 0


Screen0 is rotated to portrait mode. Because Screen0 is on the right (physically), my mouse cursor spawns there on boot. However, when I move my mouse over to Screen1, it wraps back around Screen0's rightmost edge.

So the patch does not work for dual heads with only one monitor rotated.

I disabled the rotation on Screen0, but the same wrapping behavior persists.
Comment 9 Christian Babeux 2010-02-09 18:52:46 UTC
(In reply to comment #8)
> Hi, I've applied Tim's patch (on Arch Linux's xorg server 1.7.3.902-1 package,
> using Xinerama), but with even stranger results. I have two 1680x1050
> widescreens.
>
> ...
>
> Screen0 is rotated to portrait mode. Because Screen0 is on the right
> (physically), my mouse cursor spawns there on boot. However, when I move my
> mouse over to Screen1, it wraps back around Screen0's rightmost edge.

I also applied Tim's patch on Arch Linux xorg-server 1.7.4.901, with no success.
I observed the same weird wrapping behavior. The 3 screens configured on my machine are as follows:

    Screen      0  "Screen0" RightOf "Screen1"
    Screen      1  "Screen1" 0 0
    Screen      2  "Screen2" RightOf "Screen0"

So simply moving the mouse onto the third monitor makes it warp to the first monitor (and not onto the third monitor as expected...).


Comment 10 Christian Babeux 2010-02-09 19:52:52 UTC
After some tinkering, I found that if I put the screens in ascending order in the xorg configuration, the patch seems to work properly!

I ordered my screens like this:
 
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" RightOf "Screen0"
    Screen      2  "Screen2" RightOf "Screen1"
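
One possible reading of the reports in this thread is that the failing layouts all place some screen to the left of or above Screen 0. The sketch below checks for that condition. It is an assumption-laden illustration, not server code: `ScreenOrigin` and `screen0_is_top_left` are hypothetical names, and the origins are assumed to be already resolved from the relative keywords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resolved origin of one screen in the layout. */
typedef struct {
    int x, y;  /* top-left corner in layout coordinates */
} ScreenOrigin;

/* Returns true when no screen lies left of or above screen 0,
 * i.e. every origin relative to screen 0 is non-negative. */
static bool screen0_is_top_left(const ScreenOrigin *screens, size_t n)
{
    assert(n > 0);
    for (size_t i = 1; i < n; i++) {
        if (screens[i].x - screens[0].x < 0 ||
            screens[i].y - screens[0].y < 0)
            return false;
    }
    return true;
}
```

Assuming three screens of equal width W side by side, the layout from comment 9 resolves to Screen0 at x = W, Screen1 at x = 0, Screen2 at x = 2W, and the check returns false. The reordered layout resolves to 0, W, 2W, and the check returns true.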

Comment 11 james meyer 2010-03-06 09:45:27 UTC
Applying this patch also worked for me.
As stated by Christian, the screens need to be arranged in a specific order; otherwise the mouse wraps around instead of moving to the next screen.
In my case I also had to flip the definitions of screen0 and screen1.


Section "ServerLayout"
    Identifier     "X.org Configured"
    Screen      0  "Screen0" 0 0 
    Screen      1  "Screen1" below "Screen0" 
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
    Option         "Xinerama" "1"
EndSection

Comment 12 Henning Schmiedehausen 2010-03-07 18:32:17 UTC
This bug has now been open for ~3 months, it is a serious regression for many X users, and no working bug fix is available yet. Isn't it time to bump it up in priority a bit? Like, a lot?
Comment 13 Peter Hutterer 2010-03-07 22:34:52 UTC
Created attachment 33851
0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch

Tim, can I please have your signed-off-by for this patch (it's yours with another ifdef and a commit message). I want to get this patch upstream, but I can reproduce the issue even with the patch applied. So I don't think that's it just yet.
Comment 14 Peter Hutterer 2010-03-07 23:29:54 UTC
hmm. on more testing - looks like this line should just be removed, unconditionally. even if that's not the same code path as we had before. also, I just found out that my test method was broken (wacom was good to reproduce but it'd also screw with the screens). 
not sure about this yet, will update when I know more.
Comment 15 james meyer 2010-03-08 07:23:03 UTC
Peter,
Should we test 0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch 
or do you have nothing else in the works
-james
Comment 16 Peter Hutterer 2010-03-08 15:06:18 UTC
> Should we test 0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch 
> or do you have nothing else in the works

don't worry about testing it for now, I think it's still broken. thanks for
asking though.
Comment 17 Peter Hutterer 2010-03-09 21:13:06 UTC
Created attachment 33908
0001-mi-ignore-fromDIX-argument-in-mieqSwitchScreen.patch

Please give this one a test. It seems to fix the problem for me though I'm not sure if there are hidden side-effects.
Comment 18 Peter Hutterer 2010-03-10 14:59:06 UTC
(In reply to comment #17)
> Created an attachment (id=33908) [details]
> 0001-mi-ignore-fromDIX-argument-in-mieqSwitchScreen.patch
> 
> Please give this one a test. It seems to fix the problem for me though I'm not
> sure if there are hidden side-effects.

sigh. no, this one isn't correct either and it looks like there's a race condition in the wacom driver caused by xf86XInputSetScreen that triggers more-or-less the same bug though through a different codepath.

So for now I'd go with Tim's patch (Tim - I still need your signed-off-by for the patch) and my reproducer needs to be fixed elsewhere.
Comment 19 Tim Yamin 2010-03-10 23:02:47 UTC
Created attachment 33939
0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch

Understood -- here's the patch with the Signed-off-by added. I guess something is still not quite perfect as people are reporting that Screens need to be added in a particular order in your xorg.conf but at least this takes it back to the previous state of things.

Cheers,

Tim
Comment 20 Tim Yamin 2010-04-15 09:58:39 UTC
Peter,

Any chance of getting the current patch merged so it can at least make it into the next 1.8.x release?

Cheers,

Tim
Comment 21 Peter Hutterer 2010-04-15 18:31:27 UTC
commit 5f31e2196179f8db3170d65a17d8ad40da1acb0d
Author: Tim Yamin <plasm@roo.me.uk>
Date:   Mon Mar 8 12:45:15 2010 +1000

    dix: fix cursor screen check for xinerama setups.
Comment 22 Arnaud Fortier 2010-05-02 11:27:13 UTC
(In reply to comment #21)
> commit 5f31e2196179f8db3170d65a17d8ad40da1acb0d
> Author: Tim Yamin <plasm@roo.me.uk>
> Date:   Mon Mar 8 12:45:15 2010 +1000
> 
>     dix: fix cursor screen check for xinerama setups.

Is it working now?
I'm running:
X.Org X Server 1.8.0.901 (1.8.1 RC 1)
Release Date: 2010-04-27
X Protocol Version 11, Revision 0
Build Operating System: Linux 2.6.33-ARCH x86_64 
Current Operating System: Linux myhost 2.6.33-ARCH #1 SMP PREEMPT Mon Apr 26 19:31:00 CEST 2010 x86_64

And there is no way to get my 3-monitor setup working (this has gone on for ages; I have had problems since 1.6.*).
Can you please solve this problem quickly or revert to a version that worked?
Comment 23 Chris Humbert 2010-05-02 23:48:17 UTC
Created attachment 35383
dix: make DeviceEvent coordinates signed for Xinerama

This patch along with Tim's committed patch makes my screens above and to the left of Screen 0 usable again. Please test.
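
Why signedness matters here can be illustrated with plain arithmetic. The sketch assumes the event coordinates were previously held in an unsigned 16-bit field, which the patch title suggests but this thread does not show. The helper names are hypothetical. A point to the left of Screen 0 has a negative x relative to Screen 0, and storing that in an unsigned field wraps it to a large positive value.

```c
#include <assert.h>
#include <stdint.h>

/* x position of a point relative to screen 0's origin. */
static int relative_x(int point_x, int screen0_x)
{
    return point_x - screen0_x;
}

/* Store in an unsigned 16-bit field: negative values wrap modulo 65536. */
static uint16_t store_unsigned(int v)
{
    return (uint16_t)v;
}

/* Store in a signed 16-bit field: small negative values are preserved. */
static int16_t store_signed(int v)
{
    return (int16_t)v;
}
```

Using the absolute layout from the description (Screen0 at x = 1200, Screen1 at x = 0), a point at layout x = 1100 is at -100 relative to Screen 0. The unsigned field holds 65436 for it, while the signed field holds -100.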
Comment 24 Arnaud Fortier 2010-05-03 12:37:38 UTC
(In reply to comment #23)
> Created an attachment (id=35383) [details]
> dix: make DeviceEvent coordinates signed for Xinerama
> 
> This patch along with Tim's committed patch makes my screens above and to the
> left of Screen 0 usable again. Please test.

I have recompiled xorg-server with your patch and it works :)
I applied only the last one, since 0001-dix-fix-cursor-screen-check-for-xinerama-setups.patch was already present in xorg-server 1.8.0.901.

Many thanks
Comment 25 Peter Hutterer 2010-05-13 17:43:43 UTC
Committed as 21ed660f30a3f96c787ab00a16499e0fb034b2ad. Thanks for the patch!