Summary: | "glresize" causes server segfault with single buffering. | |
---|---|---|---
Product: | xorg | Reporter: | Nick Bowler <nbowler>
Component: | Driver/intel | Assignee: | Kristian Høgsberg <krh>
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team>
Severity: | normal | |
Priority: | medium | CC: | chris
Version: | unspecified | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Attachments: | | |
Description
Nick Bowler
2010-06-02 11:57:03 UTC
* renames glresize to crashme.

Not seeing a GPU hang from simply running glresize --single. Are you sure that it was the trigger, and not another application?

Created attachment 36022
Full server log, crash #2

OK, I'm less sure about what the problem is (or was) now. I can't reproduce
some of the things I remember seeing at all anymore. However, it seems like
the hang which caused the original log is actually my fault by accidentally
running with the wrong mesa. The bonus is that your commit, 6db1e523 ("dri:
Protect against NULL dereference following GPU hang."), has fixed this segfault
anyway.
But while we're at it, I now can produce a different segfault, this time I'm
*definitely* using the right mesa git master, by just repeatedly running and
ctrl+C'ing glresize enough times (when it finally goes, X crashes the moment
I press ctrl+C). Occurs with both server 1.8.1 and git master...
Backtrace:
[ 653.281] 0: /usr/bin/X (xorg_backtrace+0x28) [0x4675e8]
[ 653.282] 1: /usr/bin/X (0x400000+0x67549) [0x467549]
[ 653.282] 2: /lib/libpthread.so.0 (0x7fc0d9f6e000+0xedf0) [0x7fc0d9f7cdf0]
[ 653.282] 3: /usr/bin/X (0x400000+0x5c6ac) [0x45c6ac]
[ 653.282] 4: /usr/bin/X (LocalClient+0x2d) [0x46848d]
[ 653.282] 5: /usr/lib/xorg/modules/extensions/libdri2.so (0x7fc0d735d000+0x3@
[ 653.282] 6: /usr/bin/X (0x400000+0x522c9) [0x4522c9]
[ 653.282] 7: /usr/bin/X (0x400000+0x24bf5) [0x424bf5]
[ 653.282] 8: /lib/libc.so.6 (__libc_start_main+0xe6) [0x7fc0d8f05a26]
[ 653.283] 9: /usr/bin/X (0x400000+0x247b9) [0x4247b9]
[ 653.283] Segmentation fault at address 0x28
[ 653.283]
Fatal server error:
[ 653.283] Caught signal 11 (Segmentation fault). Server aborting

Perhaps more useful than the X log backtrace, here's the trace from the core dump, featuring actual debugging symbols. Taken from latest X git. The fault occurs because ciptr (which is 0) is dereferenced.

[snip]
#8 <signal handler called>
#9 0x00000000004c523b in _XSERVTransGetPeerAddr (ciptr=0x0, familyp=0x7fff222d9294, addrlenp=0x7fff222d9298, addrp=0x7fff222d9288) at /usr/include/X11/Xtrans/Xtrans.c:987
#10 0x0000000000482032 in LocalClient (client=0x3bbcad0) at access.c:1126
#11 0x00007fa498b1ce62 in ProcDRI2Dispatch (client=0x3bbcad0) at dri2ext.c:559
#12 0x000000000042d0aa in Dispatch () at dispatch.c:432
#13 0x0000000000424ca6 in main (argc=3, argv=0x7fff222d9458, envp=0x7fff222d9478) at main.c:283

Hmm, this looks like another racy termination condition. I suspect that this is sufficient to fix up this instance:

diff --git a/os/access.c b/os/access.c
index 36e1b81..ed20e07 100644
--- a/os/access.c
+++ b/os/access.c
@@ -1123,6 +1123,9 @@ Bool LocalClient(ClientPtr client)
     pointer addr;
     register HOST *host;
 
+    if (client->clientGone)
+        return FALSE;
+
     if (!_XSERVTransGetPeerAddr (((OsCommPtr)client->osPrivate)->trans_conn,
                                  &notused, &alen, &from))
     {

Nick, can you try this and, if it happens again, p *client?

Kristian, smells like more dri2 fun, over to you. ;-)

I applied that patch on top of xserver git master, and the server still crashes in exactly the same place with an identical trace (modulo line number changes). In case it's helpful, here's the client structure at the call site of _XSERVTransGetPeerAddr (frame 10 in the backtrace). Note that clientGone is zero.

(gdb) print *client
$1 = {index = 9, clientAsMask = 18874368, requestBuffer = 0x2d7a2d4, osPrivate = 0x2bd12b0, swapped = 0, pSwapReplyFunc = 0, errorValue = 18874370, sequence = 43, closeDownMode = 0, clientGone = 0, noClientException = -1, saveSet = 0x0, numSaved = 0, requestVector = 0x862f80, req_len = 5, big_requests = 1, priority = 0, clientState = ClientStateRunning, devPrivates = 0x2cf2eb0, xkbClientFlags = 32768, mapNotifyMask = 0, newKeyboardNotifyMask = 0, vMajor = 1, vMinor = 0, minKC = 8 '\b', maxKC = 255 '\377', replyBytesRemaining = 0, smart_priority = 0, smart_start_tick = 3920, smart_stop_tick = 3920, smart_check_tick = 3920, clientPtr = 0x2a7e480}

Also, here's the osPrivate structure, of which the trans_conn member is passed to _XSERVTransGetPeerAddr.

(gdb) print *(OsCommPtr)client->osPrivate
$4 = {fd = 22, input = 0x2cf4e30, output = 0x2ca97b0, auth_id = 0, conn_time = 0, trans_conn = 0x0}

I think the second crash is fixed by xserver commit 660f6ab5494a72 ("Don't crash when asked if a client that has disconnected was local"), so I'm closing this.
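
For readers following the thread: the gdb output shows clientGone = 0 while trans_conn = 0x0, which would explain why a guard on clientGone alone did not help. Below is a minimal sketch of guarding on the pointer itself. All type and function names here (ConnSketch, OsCommSketch, ClientSketch, local_client_sketch) are hypothetical stand-ins modelling only the fields seen above; this is an illustration under those assumptions, not the content of commit 660f6ab5494a72.

```c
#include <stddef.h>

/* Hypothetical stand-in for the transport connection (ciptr). */
typedef struct {
    int is_local;
} ConnSketch;

/* Hypothetical stand-in for the OsComm record; only two fields modelled. */
typedef struct {
    int fd;
    ConnSketch *trans_conn;  /* NULL once the transport is torn down */
} OsCommSketch;

/* Hypothetical stand-in for the client record. */
typedef struct {
    int clientGone;
    OsCommSketch *osPrivate;
} ClientSketch;

/* Returns 1 if the client is local, 0 otherwise. Each pointer on the
 * path to the connection is checked before it is dereferenced, so a
 * client whose connection has already gone away is reported as
 * non-local instead of faulting. */
static int local_client_sketch(const ClientSketch *client)
{
    if (client == NULL || client->clientGone)
        return 0;
    if (client->osPrivate == NULL)
        return 0;
    if (client->osPrivate->trans_conn == NULL)
        return 0;  /* the state seen in the gdb dump above */
    return client->osPrivate->trans_conn->is_local;
}
```

With the state from the dump (clientGone = 0, trans_conn = NULL), the sketch returns 0 rather than dereferencing a null connection.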