Bug 12906 - reproducible crash in _mesa_set_viewport
Summary: reproducible crash in _mesa_set_viewport
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: unspecified
Hardware: x86 (IA32) All
Importance: medium normal
Assignee: mesa-dev
Reported: 2007-10-23 20:47 UTC by Deomid Ryabkov
Modified: 2007-11-01 07:13 UTC

Attachments
patch to matrix.c (1.23 KB, patch)
2007-10-30 10:00 UTC, Brian Paul
experimental patch to context.c (472 bytes, patch)
2007-10-31 07:29 UTC, Brian Paul
actual patch that was applied. (426 bytes, patch)
2007-10-31 10:42 UTC, Deomid Ryabkov

Description Deomid Ryabkov 2007-10-23 20:47:48 UTC
Xorg 7.3 / VESA on FreeBSD 6.2-STABLE crashes when a particular page is visited in Firefox with the gnash [GNU flash replacement] plugin installed.
In addition, there is a double fault in XkbEnableDisableControls that happens in the SIGSEGV handler path and actually causes Xorg to loop forever.
After fixing it (with a small patch that adds a NULL check for xkbi->desc and returns False), Xorg at least started to crash and dump core properly.

Now, on to the actual cause of the crash. It turns out to be a NULL dereference in _mesa_set_viewport when accessing ctx->DrawBuffer.

I have no idea why this happens, so this is where I'd like to hand this over to someone more knowledgeable.
Attached is the archive of an html page and 2 flash movies that cause Xorg to crash in 100% of cases on my machine.
If you need additional information and cannot reproduce the crash, I'm willing to assist.

Logs of GDB sessions showing both crash in XkbEnableDisableControls and in _mesa_set_viewport are available here: http://www.rojer.pp.ru/misc/gnash_crash/backtraces.txt

I have reduced the page that causes the crash to just 2 lines with 2 flash movies - see archive here: http://www.rojer.pp.ru/misc/gnash_crash/gnash_crash1.tar.bz2

Thanks!
Comment 1 Brice Goglin 2007-10-27 13:45:07 UTC
Marcus Better reported a similar crash on Linux in
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=443794
The bug is apparently in GLcore, so I am reassigning to Mesa.

He got the crash while loading http://www.di.se in Firefox with DRI disabled. Here are 2 corresponding backtraces:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2adb175b5240 (LWP 6629)]
0x00002adb29c1c27e in _mesa_set_viewport (ctx=0x225c3e0, x=0, y=0, width=140, height=175) at matrix.c:597
(gdb) bt full
#0  0x00002adb29c1c27e in _mesa_set_viewport (ctx=0x225c3e0, x=0, y=0, width=140, height=175) at matrix.c:597
No locals.
#1  0x00002adb181e892c in DoRender (cl=<value optimized out>, pc=0x3073ab8 "\024", do_swap=0) at ../../../GL/glx/glxcmds.c:1851
        entry = {bytes = 20, varsize = 0}
        extra = 0
        proc = (__GLXdispatchRenderProcPtr) 0x2adb181ef570 <__glXDisp_Viewport>
        err = 0
        client = (ClientPtr) 0x1534490
        left = 15328
        cmdlen = 20
        error = 0
        commandsDone = 0
        glxc = (__GLXcontext *) 0x225c300
        sw = <value optimized out>
#2  0x00002adb181ec72c in __glXDispatch (client=0x1534490) at ../../../GL/glx/glxext.c:561
        stuff = (xGLXSingleReq *) 0x3073ab0
        opcode = <value optimized out>
        proc = (__GLXdispatchSingleProcPtr) 0x2adb181e8a40 <__glXDisp_Render>
        cl = (__GLXclientState *) 0x21d6310
        retval = 1
#3  0x000000000044e210 in Dispatch () at ../../dix/dispatch.c:502
        clientReady = <value optimized out>
        result = <value optimized out>
        client = (ClientPtr) 0x1534490
        nready = 0
        start_tick = 26000
#4  0x0000000000436a1c in main (argc=8, argv=0x7fff953a86e8, envp=<value optimized out>) at ../../dix/main.c:452
        pScreen = <value optimized out>
        i = 1
        error = 0
        xauthfile = <value optimized out>
        alwaysCheckForInput = {0, 1}




Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2ab972f48240 (LWP 7185)]
_mesa_PushAttrib (mask=1048575) at attrib.c:306
(gdb) bt full
#0  _mesa_PushAttrib (mask=1048575) at attrib.c:306
        attr = <value optimized out>
        newnode = <value optimized out>
        head = (struct gl_attrib_node *) 0x2957d30
        ctx = (GLcontext *) 0x17611d0
#1  0x00002ab973b7b92c in DoRender (cl=<value optimized out>, pc=0xb61e7c "\b", do_swap=0)
    at ../../../GL/glx/glxcmds.c:1851
        entry = {bytes = 8, varsize = 0}
        extra = 0
        proc = (__GLXdispatchRenderProcPtr) 0x2ab973b82060 <__glXDisp_PushAttrib>
        err = 0
        client = (ClientPtr) 0x16bf910
        left = 0
        cmdlen = 8
        error = 32767
        commandsDone = 9
        glxc = (__GLXcontext *) 0x17610f0
        sw = <value optimized out>
#2  0x00002ab973b7f72c in __glXDispatch (client=0x16bf910) at ../../../GL/glx/glxext.c:561
        stuff = (xGLXSingleReq *) 0xb61dfc
        opcode = <value optimized out>
        proc = (__GLXdispatchSingleProcPtr) 0x2ab973b7ba40 <__glXDisp_Render>
        cl = (__GLXclientState *) 0xb11e20
        retval = 1
#3  0x000000000044e210 in Dispatch () at ../../dix/dispatch.c:502
        clientReady = <value optimized out>
        result = <value optimized out>
        client = (ClientPtr) 0x16bf910
        nready = 0
        start_tick = 28140
#4  0x0000000000436a1c in main (argc=8, argv=0x7fff39a15718, envp=<value optimized out>)
    at ../../dix/main.c:452
        pScreen = <value optimized out>
        i = 1
        error = 0
        xauthfile = <value optimized out>
        alwaysCheckForInput = {0, 1}


Comment 2 Brian Paul 2007-10-30 10:00:47 UTC
Created attachment 12263
patch to matrix.c

Can you try this patch?
Comment 3 Deomid Ryabkov 2007-10-30 19:01:33 UTC
there's progress: now it's crashing in a different place, this time dereferencing ReadBuffer, which is also NULL:

Program received signal SIGSEGV, Segmentation fault.
0x3079d4f3 in _mesa_PushAttrib (mask=1048575) at attrib.c:280
280           attr->ReadBuffer = ctx->ReadBuffer->ColorReadBuffer;
(gdb) bt
#0  0x3079d4f3 in _mesa_PushAttrib (mask=1048575) at attrib.c:280
#1  0x28648438 in __glXDisp_PushAttrib (pc=0x0) at indirect_dispatch.c:1450
#2  0x28641849 in DoRender (cl=0x0, pc=0x87390bc "\b", do_swap=0) at glxcmds.c:1851
#3  0x2864197a in __glXDisp_Render (cl=0x0, pc=0x0) at glxcmds.c:1865
#4  0x28644938 in __glXDispatch (client=0x8691200) at glxext.c:561
#5  0x0814656d in XaceCatchExtProc (client=0x8691200) at xace.c:299
#6  0x08087024 in Dispatch () at dispatch.c:502
#7  0x0806df1c in main (argc=7, argv=0xbfbfeebc, envp=0x0) at main.c:452
(gdb) print ctx
$1 = (GLcontext *) 0x8783000
(gdb) print ctx->ReadBuffer
$2 = (GLframebuffer *) 0x0
Comment 4 Brian Paul 2007-10-31 07:29:06 UTC
Created attachment 12273
experimental patch to context.c

There are lots of places where we dereference ctx->Draw/ReadBuffer without checking if it's NULL.  The assumption is that the rendering context is bound to a drawable.  Let's put a test in _mesa_make_current() to see if it's being called with a context, but no drawable.

Please apply this patch and get a stack trace if/when the assertion fails.
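
For illustration only, the kind of test described above could be sketched as follows. This is not the attached patch: the toy_* types and the toy_make_current_checked() function are hypothetical stand-ins for Mesa's context and framebuffer structures, and the check reports the condition through a return code rather than an assertion.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for Mesa's framebuffer and context types. */
struct toy_framebuffer {
    int id;
};

struct toy_context {
    struct toy_framebuffer *DrawBuffer;
    struct toy_framebuffer *ReadBuffer;
};

/*
 * Binds draw/read to new_ctx.
 * Returns 0 on success, or -1 (leaving new_ctx untouched) when a context
 * is supplied without a draw or read buffer.
 */
int toy_make_current_checked(struct toy_context *new_ctx,
                             struct toy_framebuffer *draw,
                             struct toy_framebuffer *read)
{
    if (new_ctx && (!draw || !read)) {
        /* Log the parameters so the offending caller can be identified. */
        fprintf(stderr, "make_current: ctx=%p draw=%p read=%p (no drawable)\n",
                (void *) new_ctx, (void *) draw, (void *) read);
        return -1;
    }
    if (new_ctx) {
        new_ctx->DrawBuffer = draw;
        new_ctx->ReadBuffer = read;
    }
    return 0;
}
```

Unbinding (a NULL context with NULL buffers) is treated as valid in this sketch; only a non-NULL context with a missing buffer is rejected.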
Comment 5 Deomid Ryabkov 2007-10-31 10:42:27 UTC
Created attachment 12276
actual patch that was applied.

i think we're looking at slightly different source trees, because the patch to context.c did not apply cleanly, but i found the _mesa_make_current function and added the assertions.

well, it didn't crash on any of these, but probably because asserts get undef'ed or something.
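
One possible explanation, assuming the added checks used the standard C assert() macro (the thread does not say which macro was used or how the server was built): when NDEBUG is defined before <assert.h> is included, assert() expands to nothing and its argument is never evaluated. A small sketch that detects this at run time, with a hypothetical function name:

```c
#include <assert.h>

/*
 * Returns 1 if assert() is active in this translation unit,
 * or 0 if NDEBUG turned it into a no-op.
 */
int asserts_are_active(void)
{
    int evaluated = 0;
    /* With NDEBUG defined, this expression is never evaluated. */
    assert((evaluated = 1) != 0);
    return evaluated;
}
```

If this returns 0 in a given build, a failed assertion in that build would pass silently, which would match the behaviour seen here.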
i did actually attach gdb to running server, set breakpoint on _mesa_make_current and here's what i got:

(gdb) break _mesa_make_current
Breakpoint 1 at 0x307a7437: file context.c, line 1440.
(gdb) cont
Continuing.

Breakpoint 1, _mesa_make_current (newCtx=0x8809000, drawBuffer=0x9095000, readBuffer=0x9095000) at context.c:1440
1440       GET_CURRENT_CONTEXT(oldCtx);
(gdb) cont
Continuing.

Breakpoint 1, _mesa_make_current (newCtx=0x90f0000, drawBuffer=0x968d000, readBuffer=0x968d000) at context.c:1440
1440       GET_CURRENT_CONTEXT(oldCtx);
(gdb) cont
Continuing.

Breakpoint 1, _mesa_make_current (newCtx=0x8809000, drawBuffer=0x0, readBuffer=0x0) at context.c:1440
1440       GET_CURRENT_CONTEXT(oldCtx);
(gdb) bt
#0  _mesa_make_current (newCtx=0x8809000, drawBuffer=0x0, readBuffer=0x0) at context.c:1440
#1  0x308ef65e in XMesaForceCurrent (c=0x8809000) at xm_api.c:1879
#2  0x28644d40 in __glXMesaContextForceCurrent (baseContext=0x0) at glxglcore.c:209
#3  0x286446b6 in __glXForceCurrent (cl=0x873b240, tag=677967912, error=0xbfbfe904) at glxext.c:408
#4  0x286643d1 in DoGetString (cl=0x873b240, pc=0x867f000 "\220\201\003", need_swap=0 '\0') at single2.c:334
#5  0x286645da in __glXDisp_GetString (cl=0x0, pc=0x0) at single2.c:393
#6  0x28644938 in __glXDispatch (client=0x87f1e00) at glxext.c:561
#7  0x0814656d in XaceCatchExtProc (client=0x87f1e00) at xace.c:299
#8  0x08087024 in Dispatch () at dispatch.c:502
#9  0x0806df1c in main (argc=7, argv=0xbfbfeebc, envp=0x0) at main.c:452
(gdb) frame 6
#6  0x28644938 in __glXDispatch (client=0x87f1e00) at glxext.c:561
561             retval = (*proc)(cl, (GLbyte *) stuff);
(gdb) print *client
$1 = {index = 16, clientAsMask = 33554432, requestBuffer = 0x867f000, osPrivate = 0x8786b60, swapped = 0, pSwapReplyFunc = 0x81ae100 <WriteToClient>, errorValue = 33554495,
  sequence = 365, closeDownMode = 0, clientGone = 0, noClientException = 0, lastDrawable = 0x878ca00, lastDrawableID = 33554495, lastGC = 0x0, lastGCID = 0, saveSet = 0x0, numSaved = 0,
  screenPrivate = {0x80015, 0x8000c, 0x90015, 0x9000b, 0xa0015, 0xa000a, 0xb0015, 0xb0009, 0xc0015, 0xc0008, 0xd0015, 0xd0007, 0xe0015, 0xe0006, 0xf0015, 0xf0005},
  requestVector = 0x81fcec0, req_len = 3, big_requests = 1, priority = 0, clientState = ClientStateRunning, devPrivates = 0x87f1ec4, xkbClientFlags = 32771, mapNotifyMask = 255,
  newKeyboardNotifyMask = 65535, vMajor = 1, vMinor = 0, minKC = 8 '\b', maxKC = 255 'ÿ', replyBytesRemaining = 0, appgroup = 0x0, fontResFunc = 0, smart_priority = 0,
  smart_start_tick = 67620, smart_stop_tick = 67500, smart_check_tick = 67620}
(gdb) print cl
$2 = (__GLXclientState *) 0x873b240
(gdb) print *cl
$3 = {inUse = 1, returnBuf = 0x0, returnBufSize = 0, largeCmdBytesSoFar = 0, largeCmdBytesTotal = 0, largeCmdRequestsSoFar = 0, largeCmdRequestsTotal = 0, largeCmdBuf = 0x0,
  largeCmdBufSize = 0, currentContexts = 0x87efa20, numCurrentContexts = 1, client = 0x87f1e00, GLClientmajorVersion = 1, GLClientminorVersion = 4,
  GLClientextensions = 0x8782000 "GL_ARB_depth_texture GL_ARB_draw_buffers GL_ARB_fragment_program GL_ARB_fragment_program_shadow GL_ARB_imaging GL_ARB_multisample GL_ARB_multitexture GL_ARB_occlusion_query GL_ARB_point_parameters GL_"...}
(gdb) frame 5
#5  0x286645da in __glXDisp_GetString (cl=0x0, pc=0x0) at single2.c:393
393         return DoGetString(cl, pc, GL_FALSE);
(gdb) print cl
$4 = (__GLXclientState *) 0x0
(gdb) print pc
$5 = (GLbyte *) 0x0


i don't know what to make of it, but as you see i poked around some pointers and printed a bunch of structures. hope it gives you an idea of what's going on.
Comment 6 Brian Paul 2007-10-31 12:37:42 UTC
OK, that info helps a bit, but I think I need a little more.

In _mesa_make_current(), could you add a line to print the params for every call, like this:

  ErrorF("_mesa_make_current %p %p %p\n", newCtx, drawBuffer, readBuffer);

The message should either go to stderr or your Xorg.0.log file.

Comment 7 Deomid Ryabkov 2007-10-31 12:43:56 UTC
but you already have this info, here it is:

Breakpoint 1, _mesa_make_current (newCtx=0x8809000, drawBuffer=0x9095000,
readBuffer=0x9095000) at context.c:1440
Breakpoint 1, _mesa_make_current (newCtx=0x90f0000, drawBuffer=0x968d000,
readBuffer=0x968d000) at context.c:1440
Breakpoint 1, _mesa_make_current (newCtx=0x8809000, drawBuffer=0x0,
readBuffer=0x0) at context.c:1440

i point my browser at the test page, the flash movies start to load, their windows are filled with black and at this point the first breakpoint is hit. then, without any visible changes, two more invocations of _mesa_make_current follow, the second one with both drawBuffer and readBuffer set to NULL. then X server crashes.
Comment 8 Brian Paul 2007-10-31 14:52:47 UTC
OK, I didn't realize there were only 3 calls to _mesa_make_current().

After the crash, can you go up to XMesaForceCurrent() and print c->mesa.WinSysDrawBuffer and c->mesa.WinSysReadBuffer?

If those are NULL, the question is when do those values get cleared between the first and third _mesa_make_current() calls?  Setting a breakpoint in _mesa_unreference_framebuffer() might answer that.

Thanks for debugging this, BTW...
Comment 9 Deomid Ryabkov 2007-10-31 16:58:13 UTC
c->mesa.WinSysDrawBuffer and c->mesa.WinSysReadBuffer are indeed NULL.
and they are unref'd from within _mesa_make_current itself, in this fragment:

   if (oldCtx) {
      _mesa_unreference_framebuffer(&oldCtx->WinSysDrawBuffer);
      _mesa_unreference_framebuffer(&oldCtx->WinSysReadBuffer);
      _mesa_unreference_framebuffer(&oldCtx->DrawBuffer);
      _mesa_unreference_framebuffer(&oldCtx->ReadBuffer);
   }

here's the debugging session:

(gdb) break _mesa_make_current
Breakpoint 1 at 0x307a7437: file context.c, line 1440.
(gdb) break _mesa_unreference_framebuffer
Breakpoint 2 at 0x307cb3a7: file framebuffer.c, line 244.
(gdb) cont
Continuing.

Breakpoint 1, _mesa_make_current (newCtx=0x87e2000, drawBuffer=0x9606000, readBuffer=0x9606000) at context.c:1440
1440       GET_CURRENT_CONTEXT(oldCtx);

[this is first call to _mesa_make_current with newCtx=0x87e2000]

(gdb) cont
Continuing.

Breakpoint 1, _mesa_make_current (newCtx=0x906c000, drawBuffer=0x9665000, readBuffer=0x9665000) at context.c:1440
1440       GET_CURRENT_CONTEXT(oldCtx);

[this is second call to _mesa_make_current, this time with newCtx=0x906c000]

(gdb) cont
Continuing.

Breakpoint 2, _mesa_unreference_framebuffer (fb=0x87e2000) at framebuffer.c:244
244     {
(gdb) bt
#0  _mesa_unreference_framebuffer (fb=0x87e2000) at framebuffer.c:244
#1  0x307a74b6 in _mesa_make_current (newCtx=0x906c000, drawBuffer=0x9665000, readBuffer=0x9665000) at context.c:1467
#2  0x308ef457 in XMesaMakeCurrent2 (c=0x906c000, drawBuffer=0x9665000, readBuffer=0x9665000) at xm_api.c:1789
#3  0x28644ca4 in __glXMesaContextMakeCurrent (baseContext=0x906c000) at glxglcore.c:178
#4  0x2863fc27 in DoMakeCurrent (cl=0x87b5980, drawId=33554495, readId=33554495, contextId=33554496, tag=0) at glxcmds.c:650
#5  0x2863fe69 in __glXDisp_MakeCurrent (cl=0x87e20e8, pc=0x906c000 "") at glxcmds.c:395
#6  0x28644938 in __glXDispatch (client=0x87bf600) at glxext.c:561
#7  0x0814656d in XaceCatchExtProc (client=0x87bf600) at xace.c:299
#8  0x08087024 in Dispatch () at dispatch.c:502
#9  0x0806df1c in main (argc=7, argv=0xbfbfeebc, envp=0x87e20e8) at main.c:452
(gdb) frame 1
#1  0x307a74e0 in _mesa_make_current (newCtx=0x906c000, drawBuffer=0x9665000, readBuffer=0x9665000) at context.c:1470
1470          _mesa_unreference_framebuffer(&oldCtx->ReadBuffer);
(gdb) print oldCtx
$1 = (GLcontext *) 0x87e2000
(gdb) print &oldCtx->ReadBuffer
$2 = (GLframebuffer **) 0x87e20e4
(gdb) frame 0
#0  _mesa_unreference_framebuffer (fb=0x87e2000) at framebuffer.c:244
244     {
(gdb) step
246        if (*fb) {
(gdb) print *fb
$9 = (struct gl_framebuffer *) 0x9606000
(gdb) print fb
$10 = (struct gl_framebuffer **) 0x87e20e4
(gdb) cont
Continuing.

Breakpoint 1, _mesa_make_current (newCtx=0x87e2000, drawBuffer=0x0, readBuffer=0x0) at context.c:1440
1440       GET_CURRENT_CONTEXT(oldCtx);
(gdb) bt
#0  _mesa_make_current (newCtx=0x87e2000, drawBuffer=0x0, readBuffer=0x0) at context.c:1440
#1  0x308ef65e in XMesaForceCurrent (c=0x87e2000) at xm_api.c:1879
#2  0x28644d40 in __glXMesaContextForceCurrent (baseContext=0x0) at glxglcore.c:209
#3  0x286446b6 in __glXForceCurrent (cl=0x87b5a80, tag=677967912, error=0xbfbfe904) at glxext.c:408
#4  0x286643d1 in DoGetString (cl=0x87b5a80, pc=0x8792000 "\220\201\003", need_swap=0 '\0') at single2.c:334
#5  0x286645da in __glXDisp_GetString (cl=0x0, pc=0x0) at single2.c:393
#6  0x28644938 in __glXDispatch (client=0x8674a00) at glxext.c:561
#7  0x0814656d in XaceCatchExtProc (client=0x8674a00) at xace.c:299
#8  0x08087024 in Dispatch () at dispatch.c:502
#9  0x0806df1c in main (argc=7, argv=0xbfbfeebc, envp=0x0) at main.c:452


in summary, i found the gdb backtrace to be misleading (fb=0x87e2000 ?), but it seems that the general sequence of events is: _mesa_make_current(ctx1) -> _mesa_make_current(ctx2), which unrefs the buffers in ctx1 -> _mesa_make_current(ctx1) again, with its buffers now NULL.
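
The sequence above can be replayed in a small self-contained model. This is a sketch under stated assumptions, not Mesa code: the toy_* names are hypothetical, reference counting is reduced to a counter, and toy_force_current() is assumed to rebind a context using its saved WinSys buffers, as the XMesaForceCurrent backtraces in this thread suggest.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Mesa's framebuffer and context types. */
struct toy_framebuffer {
    int ref_count;
};

struct toy_context {
    struct toy_framebuffer *WinSysDrawBuffer;
    struct toy_framebuffer *WinSysReadBuffer;
    struct toy_framebuffer *DrawBuffer;
    struct toy_framebuffer *ReadBuffer;
};

static struct toy_context *current_ctx = NULL;

/* Drop one reference and clear the caller's pointer. */
static void toy_unreference(struct toy_framebuffer **fb)
{
    if (*fb) {
        (*fb)->ref_count--;
        *fb = NULL;
    }
}

/* Point *slot at fb, taking a reference when fb is non-NULL. */
static void toy_reference(struct toy_framebuffer **slot,
                          struct toy_framebuffer *fb)
{
    toy_unreference(slot);
    if (fb)
        fb->ref_count++;
    *slot = fb;
}

static void toy_make_current(struct toy_context *new_ctx,
                             struct toy_framebuffer *draw,
                             struct toy_framebuffer *read)
{
    struct toy_context *old_ctx = current_ctx;

    /* Mirrors the fragment quoted above: the old context loses all buffers. */
    if (old_ctx) {
        toy_unreference(&old_ctx->WinSysDrawBuffer);
        toy_unreference(&old_ctx->WinSysReadBuffer);
        toy_unreference(&old_ctx->DrawBuffer);
        toy_unreference(&old_ctx->ReadBuffer);
    }

    current_ctx = new_ctx;
    if (new_ctx) {
        toy_reference(&new_ctx->WinSysDrawBuffer, draw);
        toy_reference(&new_ctx->WinSysReadBuffer, read);
        toy_reference(&new_ctx->DrawBuffer, draw);
        toy_reference(&new_ctx->ReadBuffer, read);
    }
}

/* Assumption: rebinding reuses the context's saved window-system buffers. */
static void toy_force_current(struct toy_context *ctx)
{
    toy_make_current(ctx, ctx->WinSysDrawBuffer, ctx->WinSysReadBuffer);
}

/* Replays the three calls; returns 1 if ctx1 ends up with NULL buffers. */
int replay_leaves_ctx1_unbound(void)
{
    struct toy_framebuffer fb1 = { 0 }, fb2 = { 0 };
    struct toy_context ctx1 = { 0 }, ctx2 = { 0 };
    int unbound;

    current_ctx = NULL;
    toy_make_current(&ctx1, &fb1, &fb1);  /* first call: ctx1 bound to fb1 */
    toy_make_current(&ctx2, &fb2, &fb2);  /* second call: ctx1 buffers unref'd */
    toy_force_current(&ctx1);             /* third call: draw/read are NULL */

    unbound = (ctx1.DrawBuffer == NULL && ctx1.ReadBuffer == NULL);
    current_ctx = NULL;                   /* do not keep a pointer to a local */
    return unbound;
}
```

In this model any later code that dereferences ctx1.DrawBuffer or ctx1.ReadBuffer without a check would fault, which is consistent with the crashes in _mesa_set_viewport and _mesa_PushAttrib reported above.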
Comment 10 Brian Paul 2007-10-31 17:07:45 UTC
Could you try the Mesa 7.0.2 release candidate at www.mesa3d.org/beta/ ?

I made some changes in the oldCtx/_mesa_unreference_framebuffer() code a while back that may fix this.
Comment 11 Deomid Ryabkov 2007-10-31 18:42:11 UTC
yes, i recompiled xorg-server with MesaLib-7.0.2-rc1 and it fixed the crash.
thanks a lot, Brian!
Comment 12 Deomid Ryabkov 2007-10-31 18:53:23 UTC
oh, but please have a look at the double fault bug as well (segfault in the XkbEnableDisableControls in the SIGSEGV handler path).
Comment 13 Brian Paul 2007-11-01 07:13:03 UTC
The xkb issue should be reported as an xorg/xserver bug.  It's not a Mesa thing.
I'm closing this issue.

