Transferred from xorg mailing list, so as not to be lost. May in fact be a duplicate of some of the server freezes that aren't caused by GPU lockup. Note Keith's suggested change is actually in the video drivers, but I've reported it here against the server because it potentially affects every video driver:
Fundamentally, this is a server busy loop triggered by running an OpenGL program such as Compiz, glxgears, or fullscreen Flash, but it happens only sporadically.
I found the cause of the problem, or at least one cause.
The DRM FD is reported ready by select, but nothing handles it, so the server loops around the select. It is stuck because the handler registered with RegisterBlockAndWakeupHandlers in drmmode_pre_init of the Intel driver was lost. Under normal circumstances that handler services the ready DRM FD. I guess because of KMS, the registration is done early now. It is never redone. One would assume that the other drivers that are exhibiting the problem might have similar logic.
If there is a server reset, InitBlockAndWakeupHandlers is called inside the loop in server main, which reinitializes the handler vector and loses the DRM handler.
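The sequence described above can be sketched with a small toy model. This is not X server code: every name below (toy_init_handlers, toy_register_handler, toy_drm_wakeup, toy_simulate) is a hypothetical stand-in, and the model assumes the driver registers only during the first generation, as the description implies.

```c
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not the X server API. */
#define TOY_MAX_HANDLERS 8

typedef void (*toy_handler_fn)(void *data);

static toy_handler_fn toy_handlers[TOY_MAX_HANDLERS];
static void *toy_handler_data[TOY_MAX_HANDLERS];
static size_t toy_num_handlers;
static int toy_drm_events_handled;

/* Stand-in for InitBlockAndWakeupHandlers: empties the handler vector. */
static void toy_init_handlers(void)
{
    toy_num_handlers = 0;
}

/* Stand-in for RegisterBlockAndWakeupHandlers: appends one entry. */
static int toy_register_handler(toy_handler_fn fn, void *data)
{
    if (toy_num_handlers == TOY_MAX_HANDLERS)
        return 0;
    toy_handlers[toy_num_handlers] = fn;
    toy_handler_data[toy_num_handlers] = data;
    toy_num_handlers++;
    return 1;
}

/* Stand-in for the driver's handler that services the ready DRM FD. */
static void toy_drm_wakeup(void *data)
{
    (void)data;
    toy_drm_events_handled++;
}

/*
 * Simulate `generations` server generations. The handler vector is
 * reinitialized at the top of every generation, but the driver registers
 * only in the first one (the pre-init step). Returns the number of
 * generations in which the DRM handler actually ran.
 */
static int toy_simulate(int generations)
{
    toy_drm_events_handled = 0;
    for (int gen = 1; gen <= generations; gen++) {
        toy_init_handlers();                 /* inside the main loop */
        if (gen == 1)
            toy_register_handler(toy_drm_wakeup, NULL); /* pre-init only */
        /* One dispatch pass: run whatever handlers are still registered. */
        for (size_t i = 0; i < toy_num_handlers; i++)
            toy_handlers[i](toy_handler_data[i]);
    }
    return toy_drm_events_handled;
}
```

In this model toy_simulate(3) returns 1: the handler runs in the first generation and is gone after the first reset, which corresponds to the ready DRM FD never being serviced again.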
As a temporary expedient I moved the InitBlockAndWakeupHandlers outside the loop. This makes the cases that were failing work. But this might cause the handler array to fill with handler descriptors for the same handler that is being reinstalled over and over, if such a thing were to happen.
It's really up to the server architects to decide how to fix this "properly". I might suggest that the Init stay outside the loop, and then Register be changed so that if the handler is already registered, it is a no-op. That seems like it would be the least fragile solution.
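A minimal sketch of the "Register is a no-op if already registered" idea, again as a toy model rather than the server's real data structures. The names idem_init, idem_register and idem_simulate are hypothetical, and matching on the (function, data) pair is an assumption about what "already registered" should mean.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the X server API. */
#define IDEM_MAX_HANDLERS 8

typedef void (*idem_fn)(void *data);

struct idem_entry {
    idem_fn fn;
    void *data;
};

static struct idem_entry idem_table[IDEM_MAX_HANDLERS];
static size_t idem_count;

/* Called once, outside the server loop in this variant. */
static void idem_init(void)
{
    idem_count = 0;
}

/*
 * Register a handler unless the same (fn, data) pair is already present.
 * Returns 1 on success (including the no-op case), 0 if the table is full.
 */
static int idem_register(idem_fn fn, void *data)
{
    for (size_t i = 0; i < idem_count; i++) {
        if (idem_table[i].fn == fn && idem_table[i].data == data)
            return 1; /* already registered: nothing to do */
    }
    if (idem_count == IDEM_MAX_HANDLERS)
        return 0;
    idem_table[idem_count].fn = fn;
    idem_table[idem_count].data = data;
    idem_count++;
    return 1;
}

static void idem_dummy_handler(void *data)
{
    (void)data;
}

/* Re-register the same handler once per generation; return the table size. */
static size_t idem_simulate(int generations)
{
    idem_init();
    for (int gen = 0; gen < generations; gen++)
        idem_register(idem_dummy_handler, NULL);
    return idem_count;
}
```

With the duplicate check in place, idem_simulate(100) returns 1, so repeated registration across resets cannot fill the handler array.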
Looks like the call to AddGeneralSocket and RegisterBlockAndWakeupHandlers just needs to wait until ScreenInit happens. Should be easy to fix.
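As a rough illustration of why deferring registration to ScreenInit helps, here is a toy model in which the registration is repeated in every server generation, after the handler vector has been reinitialized. All names (si_init_handlers, si_register, si_screen_init, si_simulate) are hypothetical and this is not the actual driver patch.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the Intel driver or server code. */
#define SI_MAX_HANDLERS 8

typedef void (*si_fn)(void *data);

static si_fn si_handlers[SI_MAX_HANDLERS];
static size_t si_count;
static int si_handled;

/* Stand-in for the per-generation reinitialization of the handler vector. */
static void si_init_handlers(void)
{
    si_count = 0;
}

/* Append a handler; returns 0 if the vector is full. */
static int si_register(si_fn fn)
{
    if (si_count == SI_MAX_HANDLERS)
        return 0;
    si_handlers[si_count++] = fn;
    return 1;
}

/* Stand-in for the handler that services the ready DRM FD. */
static void si_drm_wakeup(void *data)
{
    (void)data;
    si_handled++;
}

/* Stand-in for the driver's screen init step, which runs every generation. */
static void si_screen_init(void)
{
    si_register(si_drm_wakeup);
}

/* Returns the number of generations in which the DRM handler ran. */
static int si_simulate(int generations)
{
    si_handled = 0;
    for (int gen = 1; gen <= generations; gen++) {
        si_init_handlers();  /* vector wiped at the start of the generation */
        si_screen_init();    /* registration redone afterwards */
        for (size_t i = 0; i < si_count; i++)
            si_handlers[i](NULL);
    }
    return si_handled;
}
```

Here si_simulate(3) returns 3 and the vector never holds more than one entry, because the registration always follows the reinitialization within the same generation.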
Sounds like an intel ddx bug to me.
Created attachment 37945
Move registration to ScreenInit
No objections, so pushed.