Bug 25551 - X crashes trying to restore a bogus pointer when entering the VT with multiple screens
X crashes trying to restore a bogus pointer when entering the VT with multipl...
Status: NEW
Product: xorg
Classification: Unclassified
Component: Server/General
7.5 (2009.10)
Other All
: medium major
Assigned To: Xorg Project Team
Xorg Project Team
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-12-09 16:33 UTC by Aaron Plattner
Modified: 2010-06-15 07:21 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
log file (49.70 KB, text/x-log)
2009-12-09 16:33 UTC, Aaron Plattner
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Aaron Plattner 2009-12-09 16:33:31 UTC
Created attachment 31905 [details]
log file

I ran into this bug with Ubuntu 9.10 using xserver 1.6.4, but it also happens with server-1.7-branch from commit aea5ace1ee331fab0b72885ce0d5d3fc235e0708.

Start X in a multi-screen configuration.  For example, configure screen 0 with xf86-video-dummy and screen 1 with xf86-video-vesa.  Start X and then start gnome-session.  Move the mouse to the right screen and run gnome-terminal.  Run xlogo from inside the terminal, then VT switch away and back.  The server crashes with bogus pointer data:

==7666== Invalid read of size 8
==7666==    at 0x51DD9A: xf86CursorSetCursor (xf86Cursor.c:324)
==7666==    by 0x51E151: xf86CursorEnableDisableFBAccess (xf86Cursor.c:230)
==7666==    by 0x46A6EE: xf86Wakeup (xf86Events.c:546)
==7666==    by 0x44237A: WakeupHandler (dixutils.c:413)
==7666==    by 0x464D54: WaitForSomething (WaitFor.c:232)
==7666==    by 0x451FE1: Dispatch (dispatch.c:381)
==7666==    by 0x425E3B: main (main.c:285)
==7666==  Address 0xb3d5b40 is not stack'd, malloc'd or (recently) free'd
==7666== 
==7666== Invalid read of size 2
==7666==    at 0x51DD9E: xf86CursorSetCursor (xf86Cursor.c:324)
==7666==    by 0x51E151: xf86CursorEnableDisableFBAccess (xf86Cursor.c:230)
==7666==    by 0x46A6EE: xf86Wakeup (xf86Events.c:546)
==7666==    by 0x44237A: WakeupHandler (dixutils.c:413)
==7666==    by 0x464D54: WaitForSomething (WaitFor.c:232)
==7666==    by 0x451FE1: Dispatch (dispatch.c:381)
==7666==    by 0x425E3B: main (main.c:285)
==7666==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
==7666== 

Backtrace:
0: /X/bin/Xorg (xorg_backtrace+0x28) [0x464498]
1: /X/bin/Xorg (0x400000+0x642f9) [0x4642f9]
2: /lib/libpthread.so.0 (0x5d21000+0xf190) [0x5d30190]
3: /X/bin/Xorg (0x400000+0x11dd9e) [0x51dd9e]
4: /X/bin/Xorg (0x400000+0x11e152) [0x51e152]
5: /X/bin/Xorg (xf86Wakeup+0x41f) [0x46a6ef]
6: /X/bin/Xorg (WakeupHandler+0x4b) [0x44237b]
7: /X/bin/Xorg (WaitForSomething+0x1d5) [0x464d55]
8: /X/bin/Xorg (0x400000+0x51fe2) [0x451fe2]
9: /X/bin/Xorg (0x400000+0x25e3c) [0x425e3c]
10: /lib/libc.so.6 (__libc_start_main+0xfd) [0x6bc1abd]
11: /X/bin/Xorg (0x400000+0x259e9) [0x4259e9]
Segmentation fault at address 0x18

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting


Program received signal SIGSEGV, Segmentation fault.
0x000000000051dd9e in xf86CursorSetCursor (pDev=0x925950, pScreen=0x7fadb0, pCurs=0x930980, x=<value optimized out>, y=310) at xf86Cursor.c:324
324		ScreenPriv->HotX = pCurs->bits->xhot;
(gdb) bt
#0  0x000000000051dd9e in xf86CursorSetCursor (pDev=0x925950, pScreen=0x7fadb0, pCurs=0x930980, x=<value optimized out>, y=310) at xf86Cursor.c:324
#1  0x000000000051e152 in xf86CursorEnableDisableFBAccess (index=0, enable=1) at xf86Cursor.c:230
#2  0x000000000046a6ef in xf86VTSwitch (blockData=<value optimized out>, err=<value optimized out>, pReadmask=<value optimized out>) at xf86Events.c:546
#3  xf86Wakeup (blockData=<value optimized out>, err=<value optimized out>, pReadmask=<value optimized out>) at xf86Events.c:291
#4  0x000000000044237b in WakeupHandler (result=-1, pReadmask=0x7d41c0) at dixutils.c:413
#5  0x0000000000464d55 in WaitForSomething (pClientsReady=<value optimized out>) at WaitFor.c:232
#6  0x0000000000451fe2 in Dispatch () at dispatch.c:381
#7  0x0000000000425e3c in main (argc=4, argv=0x7d2fb0, envp=<value optimized out>) at main.c:285
(gdb) p *pCurs
$1 = {bits = 0x8000800000000, foreRed = 0, foreGreen = 0, foreBlue = 0, backRed = 0, backGreen = 7, backBlue = 25, refcnt = 1704567, devPrivates = 0x21, 
  id = 0, serialNumber = 524294, name = 0}
(gdb) up
#1  0x000000000051e152 in xf86CursorEnableDisableFBAccess (index=0, enable=1) at xf86Cursor.c:230
230		xf86CursorSetCursor(pDev, pScreen, ScreenPriv->SavedCursor,
(gdb) p *ScreenPriv
$2 = {SWCursor = 1, isUp = 0, showTransparent = 0, HotX = 5, HotY = 0, x = 624, y = 310, CurrentCursor = 0x930980, CursorToRestore = 0x0, 
  CursorInfoPtr = 0x7fb8f0, CloseScreen = 0x54bb90 <miSpriteCloseScreen>, RecolorCursor = 0x45b890 <miRecolorCursor>, InstallColormap = 0, 
  QueryBestSize = 0x7ffff45fa940 <fbQueryBestSize>, spriteFuncs = 0x7c54e0, PalettedCursor = 0, pInstalledMap = 0x0, 
  SwitchMode = 0x7ffff4a0bbe0 <DUMMYSwitchMode>, EnableDisableFBAccess = 0x473c70 <xf86EnableDisableFBAccess>, SavedCursor = 0x930980, 
  ForceHWCursorCount = 0, HWCursorForced = 0, transparentData = 0x0}
Comment 1 Aaron Plattner 2009-12-09 16:48:22 UTC
This also occurs with master.
Comment 2 IanElliott 2010-06-14 07:01:18 UTC
Based on the description, this seems like the bug that I chased down recently.  The problem I found is that when xf86CursorSetCursor() is given the NullCursor as a parameter, it doesn't set ScreenPriv->CurrentCursor to pCurs/NullCursor.  I'll give more details.

The "xf86Cursor.c" code uses a cached pointer to the current cursor for a given screen, ScreenPriv->CurrentCursor.  When xf86CursorSetCursor() is given a normal cursor, it updates its pointer.  When the pointer is moved from one screen to another (at least with Xinerama enabled), xf86CursorSetCursor() is called with NullPointer for the old screen, to turn off the cursor for that old screen.  In this case however, xf86CursorSetCursor() doesn't update ScreenPriv->CurrentCursor (to NullCursor).  As a result, the pointer continues to point to the cursor that used to be the current cursor, but may be deleted (and whose memory may be reused before the cursor code is again called for that screen).  When the VT switches occur, the cursor code will eventually try to access the private pointers for the old cursor.

The way that I duplicated this defect was to move a window from one screen to another (with Xinerama enabled on Fedora 11, with Gnome), and then to immediately do a VT switch to the console, and back again to X11.  This results in a segmentation violation when the cursor code tries to access ScreenPriv->CurrentCursor in ways that are no longer valid with the memory that it's pointing to.  For me, the offending code was dixLookupPrivate(), which didn't find the requested private, and tried to reallocate memory.

Looks like the solution is to simply add a line of code to xf86CursorSetCursor() to set ScreenPriv->CurrentCursor to NullCursor inside of the "if (pCurs == NullCursor) {" code that turns off the cursor.
Comment 3 Michel Dänzer 2010-06-15 07:21:07 UTC
(In reply to comment #2)
> The "xf86Cursor.c" code uses a cached pointer to the current cursor for a given
> screen, ScreenPriv->CurrentCursor.  When xf86CursorSetCursor() is given a
> normal cursor, it updates its pointer.  When the pointer is moved from one
> screen to another (at least with Xinerama enabled), xf86CursorSetCursor() is
> called with NullPointer for the old screen, to turn off the cursor for that old
> screen.  In this case however, xf86CursorSetCursor() doesn't update
> ScreenPriv->CurrentCursor (to NullCursor).  As a result, the pointer continues
> to point to the cursor that used to be the current cursor, but may be deleted
> (and whose memory may be reused before the cursor code is again called for that
> screen).

I don't think that by itself can explain the problem since commit 67a8c659f25218904bae64aac6e98e326c90330b ('hw/xfree86: move reference counting out of the UseHWCursor[ARGB] functions'), as the cursor reference count is always increased before assigning ScreenPriv->CurrentCursor. I suspect there may be a cursor reference counting bug elsewhere, causing the cursor's reference count to go to 0 prematurely. Maybe you can figure out where this is happening with appropriate gdb breakpoints and/or debugging output or so.