Bug 29584 - Server in compute loop
Status: RESOLVED FIXED
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 7.5 (2009.10)
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assigned To: Chris Wilson
QA Contact: Xorg Project Team
Depends on:
Blocks: xserver-1.9
Reported: 2010-08-15 08:32 UTC by Marty Jack
Modified: 2010-08-20 13:03 UTC

Attachments
Move registration to ScreenInit (3.18 KB, patch)
2010-08-18 02:29 UTC, Chris Wilson

Description Marty Jack 2010-08-15 08:32:13 UTC
Transferred from xorg mailing list, so as not to be lost.  May in fact be a duplicate of some of the server freezes that aren't caused by GPU lockup.  Note Keith's suggested change is actually in the video drivers, but I've reported it here against the server because it potentially affects every video driver:

Fundamentally, the server enters a compute loop when running an OpenGL program such as Compiz, glxgears, or fullscreen Flash, but only sporadically.

I found the cause, or at least one cause, of the problem.

The DRM FD becomes ready in select, but nothing handles it, so the server loops around the select.  It is stuck because the RegisterBlockAndWakeupHandlers registration that was established in drmmode_pre_init of the Intel driver was lost.  Under normal circumstances that handler services the ready DRM FD.  I guess because of KMS, the registration is done early now.  It is never redone.  One would assume that the other drivers that are exhibiting the problem might have similar logic.

If there is a server reset, InitBlockAndWakeupHandlers is called inside the loop in server main, which reinitializes the handler vector and loses the DRM handler.
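
To make the failure mode concrete, here is a minimal toy model in C. It is only a sketch: every name in it (init_handlers, register_handler, simulate_preinit_registration, and so on) is hypothetical and merely plays the role of the corresponding server routine. It assumes, as described above, that the handler table is cleared at the start of each generation and that the driver registers only once, in pre-init.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the real X server API. */
#define MAX_HANDLERS 8

typedef void (*handler_fn)(void *data);

struct handler {
    handler_fn fn;
    void *data;
};

static struct handler handlers[MAX_HANDLERS];
static size_t num_handlers;

/* Plays the role of InitBlockAndWakeupHandlers: empties the table. */
static void init_handlers(void)
{
    num_handlers = 0;
}

/* Plays the role of RegisterBlockAndWakeupHandlers: appends an entry. */
static int register_handler(handler_fn fn, void *data)
{
    if (num_handlers == MAX_HANDLERS)
        return 0;
    handlers[num_handlers].fn = fn;
    handlers[num_handlers].data = data;
    num_handlers++;
    return 1;
}

/* Stand-in for the driver's DRM handler: counts how often it ran. */
static void drm_wakeup(void *data)
{
    ++*(int *)data;
}

/* Runs every registered handler once, as after select() returns. */
static void run_handlers(void)
{
    for (size_t i = 0; i < num_handlers; i++)
        handlers[i].fn(handlers[i].data);
}

/* Returns the number of generations in which the DRM handler ran. */
static int simulate_preinit_registration(int generations)
{
    int serviced = 0;
    int preinit_done = 0;

    for (int gen = 0; gen < generations; gen++) {
        init_handlers();            /* inside the loop, as in server main */
        if (!preinit_done) {        /* pre-init happens only once */
            register_handler(drm_wakeup, &serviced);
            preinit_done = 1;
        }
        run_handlers();
    }
    return serviced;
}
```

In this model, simulate_preinit_registration(3) services the DRM FD in the first generation only; after the first reset the table is empty and the ready FD would go unhandled.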

As a temporary expedient I moved the InitBlockAndWakeupHandlers call outside the loop.  This makes the cases that were failing work.  But it might cause the handler array to fill with duplicate descriptors if the same handler were reinstalled over and over.

It's really up to the server architects to decide how to fix this "properly".  I might suggest that the Init stay outside the loop, and then Register be changed so that if the handler is already registered, it is a no-op.  That seems like it would be the least fragile solution.
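
A rough sketch of the "Register is a no-op if already registered" idea, again as a toy model in C rather than the server's actual code. The names register_once, handler_count, and noop_handler are hypothetical, and treating a matching function pointer plus data pointer as "the same handler" is an assumption of this sketch.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the real X server API. */
#define MAX_HANDLERS 8

typedef void (*handler_fn)(void *data);

struct handler {
    handler_fn fn;
    void *data;
};

static struct handler table[MAX_HANDLERS];
static size_t count;

/* Registers (fn, data) unless that exact pair is already present.
 * Returns 1 on success or if already registered, 0 if the table is full. */
static int register_once(handler_fn fn, void *data)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].fn == fn && table[i].data == data)
            return 1;               /* duplicate: nothing to do */
    }
    if (count == MAX_HANDLERS)
        return 0;
    table[count].fn = fn;
    table[count].data = data;
    count++;
    return 1;
}

/* Number of distinct handlers currently registered. */
static size_t handler_count(void)
{
    return count;
}

/* Placeholder handler used only to exercise the table. */
static void noop_handler(void *data)
{
    (void)data;
}
```

With this check, calling register_once(noop_handler, &x) repeatedly leaves handler_count() at 1, so repeated registration cannot fill the table with copies of one handler.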

Keith's response:

Looks like the call to AddGeneralSocket and
RegisterBlockAndWakeupHandlers just needs to wait until ScreenInit
happens. Should be easy to fix.
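
The effect of that suggestion can be sketched with a small toy model in C. It assumes that the driver's ScreenInit hook runs in every server generation and that only the handler table is cleared on reset. The struct and function names (toy_server, toy_reset, toy_screen_init, toy_run) are hypothetical and are not the driver's real code.

```c
#include <stdbool.h>

/* Hypothetical toy state, not the X server's real data structures. */
struct toy_server {
    bool fd_watched;        /* stands in for AddGeneralSocket(fd) */
    bool handler_present;   /* stands in for RegisterBlockAndWakeupHandlers */
};

/* Per-generation reset: in this model only the handler is dropped. */
static void toy_reset(struct toy_server *s)
{
    s->handler_present = false;
}

/* Stand-in for the driver's ScreenInit hook, run every generation. */
static void toy_screen_init(struct toy_server *s)
{
    s->fd_watched = true;
    s->handler_present = true;
}

/* Returns how many generations end with the fd both watched and handled. */
static int toy_run(int generations)
{
    struct toy_server s = { false, false };
    int healthy = 0;

    for (int gen = 0; gen < generations; gen++) {
        toy_reset(&s);
        toy_screen_init(&s);        /* registration redone after each reset */
        if (s.fd_watched && s.handler_present)
            healthy++;
    }
    return healthy;
}
```

Because the registration is redone after each reset, every generation in this model ends with a handler in place for the watched FD.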
Comment 1 ajax at nwnk dot net 2010-08-17 11:31:48 UTC
Sounds like an intel ddx bug to me.
Comment 2 Chris Wilson 2010-08-18 02:29:37 UTC
Created attachment 37945
Move registration to ScreenInit
Comment 3 Chris Wilson 2010-08-19 12:09:16 UTC
No objections, so pushed.