Created attachment 46416
Various logs hopefully helping to solve the issue
I'm using Ubuntu 11.04 64-bit. The crash usually happens immediately or shortly after login, and I have to log in again. Once it has crashed, it doesn't seem to happen again until the next reboot. It has been discussed on Ubuntu Launchpad here: https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/768159 and here: https://bugs.launchpad.net/ubuntu/+source/gnome-session/+bug/763313. I'm not an expert at reporting Linux bugs, so please forgive any mistakes. The instructions in the former bug report led me here.
From Xorg.0.log.old I can see:
[ 34.123] 0: /usr/bin/X (xorg_backtrace+0x26) [0x4a2626]
[ 34.123] 1: /usr/bin/X (0x400000+0x6219a) [0x46219a]
[ 34.123] 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fc6179c0000+0xfc60) [0x7fc6179cfc60]
[ 34.123] 3: /usr/lib/xorg/modules/extensions/librecord.so (0x7fc615375000+0x2920) [0x7fc615377920]
[ 34.123] 4: /usr/bin/X (_CallCallbacks+0x34) [0x432af4]
[ 34.123] 5: /usr/bin/X (WriteToClient+0x21a) [0x461c9a]
[ 34.123] 6: /usr/lib/xorg/modules/extensions/libdri2.so (ProcDRI2WaitMSCReply+0x52) [0x7fc614d5cd82]
[ 34.123] 7: /usr/lib/xorg/modules/extensions/libdri2.so (DRI2WaitMSCComplete+0x59) [0x7fc614d5b479]
[ 34.123] 8: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fc614b08000+0x25030) [0x7fc614b2d030]
[ 34.123] 9: /lib/x86_64-linux-gnu/libdrm.so.2 (drmHandleEvent+0x108) [0x7fc614f66478]
[ 34.123] 10: /usr/bin/X (WakeupHandler+0x4b) [0x4322fb]
[ 34.123] 11: /usr/bin/X (WaitForSomething+0x1b6) [0x45c786]
[ 34.123] 12: /usr/bin/X (0x400000+0x2e032) [0x42e032]
[ 34.123] 13: /usr/bin/X (0x400000+0x21a7e) [0x421a7e]
[ 34.123] 14: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xff) [0x7fc616909eff]
[ 34.123] 15: /usr/bin/X (0x400000+0x21629) [0x421629]
[ 34.124] Segmentation fault at address 0x7fc6185a2010
Caught signal 11 (Segmentation fault). Server aborting
Please consult the The X.Org Foundation support
RecordAReply(CallbackListPtr *pcbl, pointer nulldata, pointer calldata)
ReplyInfoRec *pri = (ReplyInfoRec *)calldata;
ClientPtr client = pri->client;
REQUEST(xReq); // <- gets stuff from freed pointer
majorop = stuff->reqType; // <- crash
Same in asm:
2820: 41 57 push %r15
2822: 41 56 push %r14
2824: 41 55 push %r13
2826: 49 89 d5 mov %rdx,%r13
2829: 41 54 push %r12
282b: 55 push %rbp
282c: 53 push %rbx
282d: 48 83 ec 28 sub $0x28,%rsp
2831: 4c 8b 3a mov (%rdx),%r15
// client = pri->client
2834: 49 8b 47 08 mov 0x8(%r15),%rax
// stuff = client->requestBuffer
2838: 44 0f b6 30 movzbl (%rax),%r14d
// majorop = stuff->reqType
client->requestBuffer no longer holds the original request, because the
reply is only sent from the WakeupHandler that handles the drmWaitVBlank
event for DRI2WaitMSC.
I don't know the record and callback system well enough to figure out how
to fix the crash quickly. It feels like it needs a larger refactoring.
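To make the failure mode above concrete, here is a minimal sketch in C. It is only a model: ModelReq, ModelClient and both functions are hypothetical stand-ins, not the server's real xReq/ClientRec or RecordAReply. The storage in the model is merely reused, so the reply hook reads the wrong opcode; in the server the storage may already be freed, which is where the segfault comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real xReq / ClientRec layouts. */
typedef struct {
    unsigned char reqType;    /* major opcode */
    unsigned char data;
    unsigned short length;
} ModelReq;

typedef struct {
    const ModelReq *requestBuffer;  /* request that was being dispatched */
} ModelClient;

/* Mimics the pattern in RecordAReply: the opcode is read back from
 * whatever the client's request buffer holds when the reply is written. */
static unsigned char opcode_seen_at_reply(const ModelClient *client)
{
    assert(client != NULL && client->requestBuffer != NULL);
    return client->requestBuffer->reqType;
}

/* Returns the opcode the reply hook attributes to a reply that is sent
 * late, after the input storage has been reused for another request. */
static unsigned char stale_opcode_for_deferred_reply(unsigned char original_op,
                                                     unsigned char later_op)
{
    ModelReq shared_input = { original_op, 0, 1 };
    ModelClient client = { &shared_input };

    /* The handler defers its reply (as DRI2WaitMSC does) and returns. */

    /* Meanwhile the input storage is reused for an unrelated request. */
    shared_input.reqType = later_op;

    /* The deferred reply is finally written; the hook reads the buffer. */
    return opcode_seen_at_reply(&client);
}
```

Calling stale_opcode_for_deferred_reply(10, 20) yields 20 rather than 10: the hook attributes the reply to whatever happens to be in the buffer.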
Is this a regression?
*** Bug 42475 has been marked as a duplicate of this bug. ***
(In reply to comment #2)
> Is this a regression?
No. It is an old bug that was exposed by loosely related changes.
The bug is in the record callback, which happens to be hit by DRI2 because DRI2 uses IgnoreClient and later sends the reply from WakeupHandler.
I guess a simple fix for the DRI2-caused crash would be to change WaitMSC to reset the current request. Later on, the wakeup handler would simply attend the client so that it handles the same WaitMSC again.
But that would still leave record crashing for any other asynchronous reply.
Created attachment 53208
Failed attempt to write piglit test case for the crash
(In reply to comment #5)
> But that would still leave record crashing for any other asynchronous reply.
In my opinion option 1 from http://lists.x.org/archives/xorg-devel/2011-October/026017.html is the best way to fix this problem.
+ very easy to implement (just store the opcodes and length in ClientRec and use them in RecordAReply)
+ protects all problematic requests automatically (RecordEnableContext, ListFontsWithInfo, DRI2WaitMSC, ...)
- ClientRec is amended with data that is only needed by Record extension
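A rough sketch of what option 1 amounts to, as a self-contained C model. This is not the actual patch: SketchReq, SketchClient, the saved* fields and all function names are assumptions made up for illustration. The idea is that the opcodes and length are copied into the client record at dispatch time, so the reply hook never has to touch the request buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real xReq / ClientRec layouts. */
typedef struct {
    unsigned char reqType;    /* major opcode */
    unsigned char data;       /* by convention the minor opcode for extensions */
    unsigned short length;
} SketchReq;

typedef struct {
    const SketchReq *requestBuffer;
    /* Assumed extra fields, needed only by the reply hook. */
    unsigned char savedMajorOp;
    unsigned char savedMinorOp;
    unsigned short savedReqLen;
} SketchClient;

/* At dispatch time the request is still valid, so copy what is needed. */
static void sketch_dispatch(SketchClient *client, const SketchReq *req)
{
    assert(client != NULL && req != NULL);
    client->requestBuffer = req;
    client->savedMajorOp = req->reqType;
    client->savedMinorOp = req->data;
    client->savedReqLen = req->length;
}

/* The reply hook uses the saved copy and never dereferences requestBuffer. */
static unsigned char sketch_opcode_at_reply(const SketchClient *client)
{
    assert(client != NULL);
    return client->savedMajorOp;
}

/* Same scenario as a deferred reply: the input storage is reused before
 * the reply is written, yet the hook still reports the original opcode. */
static unsigned char saved_opcode_for_deferred_reply(unsigned char original_op,
                                                     unsigned char later_op)
{
    SketchReq shared_input = { original_op, 0, 1 };
    SketchClient client = { NULL, 0, 0, 0 };

    sketch_dispatch(&client, &shared_input);
    shared_input.reqType = later_op;   /* storage reused in the meantime */
    return sketch_opcode_at_reply(&client);
}
```

Here saved_opcode_for_deferred_reply(10, 20) gives 10. The cost is the one listed as the minus point above: the client record carries three fields that only Record reads.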
I have spent some time thinking of alternative approaches, but they are usually clumsy and require a lot more code to fix the problem. I haven't provided a v2 patch yet, because I've been busy with other tasks lately.
Based on an email from Julien, this should now be fixed in master and server-1.11-branch with fb22a408c69a84f81905147de9e82cf66ffb6eb2