|Summary:||Xv + Silkenmouse = Lethal race condition|
|Product:||xorg||Reporter:||Thomas Winischhofer <thomas>|
|Component:||Server/DDX/Xorg||Assignee:||Xorg Project Team <xorg-team>|
|Status:||RESOLVED FIXED||QA Contact:||Xorg Project Team <xorg-team>|
|i915 platform:||i915 features:|
|Bug Depends on:|
Description Thomas Winischhofer 2005-09-30 12:27:30 UTC
[From my mail to the xorg list, slightly edited] Silkenmouse means that the cursor update code (positioning, view port panning) is executed asynchronously, ie it can interrupt on-going other operations of the server. xf86PointerMoved (pScrn->PointerMoved) is there to eventually pan the viewport. So it is likely that pScrn->AdjustFrame is executed, which is wrapped by Xv. Xv's wrapper for AdjustFrame, if the driver has not installed a ReputImage function, stops the video (by calling the driver's stopvideo routine) and removes the port from the window (which is done in RemovePortFromWindow()). This involves NULLifying the pointer to the drawable in the Xv private structure. However, XVAdjustFrame does so only if the video is currently on, indicated by pPriv->IsOn == XV_ON. This is correct. But what is not correct is that XvAdjustFrame(), before calling RemovePortFromWindow, does not check if the pPriv->Draw is set. So, the condition under which XVAdjustFrame calls RemovePortFromWindow is only that pPriv->IsOn = XV_ON. PutImage, on the other hand, removes the port from the window every time it is called. BUT: It does not set pPriv->IsOn to XV_OFF or XV_PENDING (whatever would be correct, haven't thought about this yet) at this point. PutImage removes the port from the window (ie NULLifies the pointer to the drawable), and proceeds doing its business, without touching pPriv->IsOn until the very end, after the driver's PutImage() has been called. During the time between the RemovePortFromWindow() call and the near end of PutImage the situation is that - the portPriv->pDraw is NULL - pPriv->isOn is eventually XV_ON from a previous call to PutImage(). If the "silken mouse", ie AdjustFrame interrupts PutImage during this time, it will see IsOn == XV_ON and therefore eventually call RemovePortFromWindow with a NULL pDraw (= pWin). RemovePortFromWindow does no NULL-check on its pWin argument, so X gets a sig 11. This discovery frightens me. Sure, the easy way is to check for a NULL pointer in XVAdjustFrame, or - perhaps better - to add a pPriv->XV_xxx after or before PutImage's call to RemovePortFromWindow. But this is only a work-around for a more deeper problem. Mildly put, I am not sure that letting the driver's stopvideo() interrupt an on-going PutImage() is safe in any case. An easy and quick fix would be inserting portPriv->IsOn = XV_OFF; before the call to xf86XVRemovePortFromWindow() in xf86XVPutImage. This could also prevent a stopvideo interrupting an on-going Putimage(). But this assumingly makes StopVideo choke, and I am hardly convinced that this is a long-term solution. The long term solution is to make AdjustFrame be executed in the normal event queue (as also Keith suggested). It does way too much nowadays to safely assume it won't interphere with normal server operations. I solved this by deferring AdjustFrame into the block handler in the sis driver (not yet in CVS, and won't in time for 6.9/7.0), but this really needs a global fix.
Comment 1 Daniel Stone 2007-02-27 01:28:15 UTC
Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 2 Daniel Stone 2009-08-31 18:08:57 UTC
ugh. deferring to 7.6, but i'd be shocked to see it even make that.
Comment 3 Adam Jackson 2010-08-09 11:02:30 UTC
Created attachment 37738 [details] [review] 4652.patch Untested, but probably works. Basically all of the xf86XVAdjustFrame work needs to move to BlockHandler since we're going to call free().
Comment 4 Jeremy Huddleston Sequoia 2011-04-11 14:05:41 UTC
ajax, what is the status of this? You came up with a patch back in October... has anyone been living on it? Even still, this bug has been around for many releases and does not seem to have a large impact on users. Because of this, it does not meet the requirements for inclusion in the stable branch (1.10). Punting to 1.11.
Comment 5 Jeremy Huddleston Sequoia 2011-09-24 16:20:00 UTC
See http://lists.freedesktop.org/archives/xorg/2011-September/053561.html We think that this should be fixed with recent changes. Please test xorg-server-1.11.1 and reopen if there are remaining issues.