Bug 7205 - xorg-server 7.1 segfaults in swrast_Triangle (/usr/lib/xorg/modules/extensions/libGLcore.so)
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: 6.5
Hardware: x86 (IA32) Linux (All)
Importance: high critical
Assignee: mesa-dev
QA Contact:
URL: https://bugs.gentoo.org/show_bug.cgi?...
Whiteboard:
Keywords:
Duplicates: 7608 11485
Depends on:
Blocks:
 
Reported: 2006-06-13 02:33 UTC by mrsteven
Modified: 2007-07-06 08:10 UTC (History)

Attachments
A log of the crash (49.63 KB, text/plain)
2006-06-13 02:35 UTC, mrsteven
Log from blender lock axis crash (44.61 KB, text/plain)
2007-01-01 04:25 UTC, mikleh
Log from blender rotate crash (43.61 KB, text/plain)
2007-01-01 04:26 UTC, mikleh
Patch for Mesa's resize() function (1.26 KB, patch)
2007-01-23 11:45 UTC, Matthias Hopf
Clean stray framebuffer and glxPriv pointers in reused contexts (6.01 KB, patch)
2007-01-23 11:49 UTC, Matthias Hopf
Fixed patch for cleaning stray pointers (6.15 KB, patch)
2007-01-25 08:13 UTC, Matthias Hopf
Short test case for this issue (5.16 KB, text/plain)
2007-02-27 07:45 UTC, Matthias Hopf

Description mrsteven 2006-06-13 02:33:08 UTC
This bug was posted to https://bugs.gentoo.org already: 
https://bugs.gentoo.org/show_bug.cgi?id=134395

The X server tends to crash in swrast_write_rgba_span, especially when you 
start glxgears and play around with the window (i.e. resize it). You might 
have to switch to console and back before that...

This happens on both computers with the radeon driver and without DRI. It 
works perfectly with X.org 7.0, but it segfaults on X.org 7.1.

Here's my backtrace:
0: /usr/bin/X(xf86SigHandler+0xa2) [0x80cb642]
1: [0xffffe420]
2: /usr/lib/xorg/modules/extensions/libGLcore.so(_swrast_Triangle+0x2a) 
[0xb38501fa]
3: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb386acba]
4: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb38897cd]
5: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb388a519]
6: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_run_pipeline+0x13d) 
[0xb387f2ad]
7: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_playback_vertex_list+0x2a3) 
[0xb388cc13]
8: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb37e8161]
9: /usr/lib/xorg/modules/extensions/libGLcore.so(_mesa_CallList+0x4e) 
[0xb37e863e]
10: /usr/lib/xorg/modules/extensions/libglx.so [0xb7bd3e46]
11: /usr/lib/xorg/modules/extensions/libglx.so(__glXRender+0xb0) [0xb7bcf780]
12: /usr/lib/xorg/modules/extensions/libglx.so [0xb7bd2c03]
13: /usr/bin/X(Dispatch+0x1c6) [0x80874c6]
14: /usr/bin/X(main+0x498) [0x806e3f8]
15: /lib/tls/libc.so.6(__libc_start_main+0xd4) [0xb7c9cf14]
16: /usr/bin/X(FontFileCompleteXLFD+0x81) [0x806d901]
Comment 1 mrsteven 2006-06-13 02:35:50 UTC
Created attachment 5893
A log of the crash

This is the log of a crash (/var/log/Xorg.0.log)
Comment 2 mrsteven 2006-06-13 02:39:58 UTC
Note that these crashes occur in different swrast* functions, but it is 
possible to reproduce them...

Sorry about the Summary confusion...
Comment 3 Donnie Berkholz 2006-06-13 09:03:06 UTC
This would probably do better assigned to Mesa.
Comment 4 mrsteven 2006-07-14 08:10:44 UTC
Since this is a reproducible crash, I'd better mark this as "critical" 
(according to the bugzilla help page)...
Comment 5 mrsteven 2006-07-25 15:19:03 UTC
*** Bug 7608 has been marked as a duplicate of this bug. ***
Comment 6 Alexey Spiridonov 2006-10-29 13:35:22 UTC
I have exactly the same problem -- Gentoo + xorg 7.1.  If I turn off DRI, this
invariably happens (the driver doesn't seem to matter -- both fglrx and radeon
do this). With DRI, I don't have that problem with either driver (although they
have other, unrelated issues which make DRI not always desirable).

The crash isn't always in swrast_* (for me, it's often in swrast_Triangle); I
think that there is some stack corruption happening. The bottom function
(FontFileCompleteXLFD) is always the same, and the backtrace is almost always
the same up to _tnl_run_pipeline. 

Can I help with diagnosis somehow? I'm capable of compiling a debug version of X
and running gdb on it, but I'd need some tips on the right set-up.

Alexey 
Comment 7 mikleh 2007-01-01 04:02:43 UTC
This is probably the same bug as Debian bug #404087.
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=404087) 
The bug makes Blender (http://www.blender.org/) difficult to use on my laptop
that has a graphics card without DRI. Blender crashes the X-server when I do
some actions. The first action is when I lock grab or resize to one axis. The
second action is when I rotate the view with the mouse. The first crash is
easily reproducible - it happens every time. The second crash only happens
sometimes.

Backtrace from locking grab or resize to one axis in Blender:
0: /usr/bin/X(xf86SigHandler+0x84) [0x80c4354]
1: [0xb7f6a420]
2: /usr/lib/xorg/modules/extensions/libGLcore.so (_swrast_write_rgba_span+0x89c)
[0xb70f0b1c]
3: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb70e523d]
4: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb70da7ec]
5: /usr/lib/xorg/modules/extensions/libGLcore.so(_swrast_Line+0x23) [0xb70d9c13]
6: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb711b2f1]
7: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_RenderClippedLine+0x23)
[0xb713c4b3]
8: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71350f3]
9: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71388da]
10: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb713c5b5]
11: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_run_pipeline+0x13f)
[0xb7122d8f]
12: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_flush_vtx+0x2c5) [0xb714b855]
13: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_FlushVertices+0x7c)
[0xb714891c]
14: /usr/lib/xorg/modules/extensions/libGLcore.so(_mesa_LoadMatrixf+0xa2)
[0xb707caa2]
15: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c33a86]
16: /usr/lib/xorg/modules/extensions/libglx.so(__glXRender+0xf3) [0xb7c2afd3]
17: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c2ff6a]
18: /usr/bin/X(Dispatch+0x19b) [0x8086cab]
19: /usr/bin/X(main+0x489) [0x806e699]
20: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xc8) [0xb7d6cea8]
21: /usr/bin/X(FontFileCompleteXLFD+0xa9) [0x806d9d1]

Backtrace from crash when rotating a view in blender:
0: /usr/bin/X(xf86SigHandler+0x84) [0x80c4354]
1: [0xb7fe7420]
2: /usr/lib/xorg/modules/extensions/libGLcore.so(_swrast_write_rgba_span+0x89c)
[0xb716db1c]
3: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb716223d]
4: /usr/lib/xorg/modules/extensions/libGLcore.so(_swrast_Line+0x23) [0xb7156c13]
5: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71982f1]
6: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_RenderClippedLine+0x23)
[0xb71b94b3]
7: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71b20f3]
8: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71b58da]
9: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71b95b5]
10: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_run_pipeline+0x13f)
[0xb719fd8f]
11: /usr/lib/xorg/modules/extensions/libGLcore.so(_tnl_flush_vtx+0x2c5) [0xb71c8855]
12: /usr/lib/xorg/modules/extensions/libGLcore.so [0xb71c5528]
13: /usr/lib/xorg/modules/extensions/libglx.so [0xb7cae720]
14: /usr/lib/xorg/modules/extensions/libglx.so(__glXRender+0xf3) [0xb7ca7fd3]
15: /usr/lib/xorg/modules/extensions/libglx.so [0xb7cacf6a]
16: /usr/bin/X(Dispatch+0x19b) [0x8086cab]
17: /usr/bin/X(main+0x489) [0x806e699]
18: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xc8) [0xb7de9ea8]
19: /usr/bin/X(FontFileCompleteXLFD+0xa9) [0x806d9d1]
Comment 8 mikleh 2007-01-01 04:25:59 UTC
Created attachment 8257
Log from blender lock axis crash
Comment 9 mikleh 2007-01-01 04:26:42 UTC
Created attachment 8258
Log from blender rotate crash
Comment 10 Matthias Hopf 2007-01-23 11:41:38 UTC
We at Novell had several bug reports concerning software rendering in the
Xserver. The one into which we consolidated them all is

  https://bugzilla.novell.com/show_bug.cgi?id=211314

I have been debugging this for a long time and was finally able to fix these
issues.  This bug report reads very much as if it describes similar problems,
so I'll attach my findings here for evaluation.


Kevin, I'm adding you here because one of the patches is to be applied to the
Xgl part of the Xserver and you can probably comment on that much better than I can.
Comment 11 Matthias Hopf 2007-01-23 11:45:04 UTC
Created attachment 8487
Patch for Mesa's resize() function

The Mesa resize() function is called while the current context _glapi_Context
is NULL. As this isn't checked in the function, I assume that this should never
happen.

Actually, it seems that while the context is correctly lost, the dispatch
functions aren't cleared correctly. Resizing windows was the only (well, almost
only) function that didn't check the validity of the glx context.


This patch is against Mesa git (applies to 6.5.2 or later).
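
For readers following along, the kind of guard described above can be sketched roughly as follows. This is not the attached patch: the struct names, `sketch_current_ctx` (standing in for the current-context pointer) and `sketch_resize_buffer()` are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real Mesa/xlib driver types. */
struct sketch_context { int width, height; };
struct sketch_buffer  { int width, height; };

/* Stand-in for the "current context" pointer; NULL means no context is bound. */
struct sketch_context *sketch_current_ctx = NULL;

/* Resize hook that may be invoked by the server while no context is current.
 * Returns 1 if the buffer was resized, 0 if the call was skipped. */
int sketch_resize_buffer(struct sketch_buffer *b, int w, int h)
{
    struct sketch_context *ctx = sketch_current_ctx;

    if (ctx == NULL || b == NULL)
        return 0;               /* nothing bound: do not dereference ctx */

    b->width = w;
    b->height = h;
    ctx->width = w;             /* context-side bookkeeping */
    ctx->height = h;
    return 1;
}
```

With no context bound the call is a no-op; once a context is current the resize goes through as before.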
Comment 12 Matthias Hopf 2007-01-23 11:49:57 UTC
Created attachment 8488
Clean stray framebuffer and glxPriv pointers in reused contexts

Programs apparently crash if a context is reused for a window that changed in
the meantime. Crashes apparently only occur with software rendering.

The basic problem is that a buffer is freed when the corresponding window is
closed, but the old context (which is no longer referenced in _glapi_Context
or in the corresponding glxPriv structures) still has a reference to this
buffer, and is reactivated later. As the buffer points to a freed memory
location, it is no longer recognized as an old buffer (because buffer->Name -
by accident - contains 0x00cccccc) and is not replaced.


About this patch:

The correct way to remove the reference would be to nuke it when a context is
disassociated from a drawable - because after that only the Xserver itself
(and not the Mesa subsystem) knows about the context which still has the stray
buffer and glxPriv pointers.
Unfortunately, if you do that, the server crashes in glxSwapBuffers() - I
don't know enough about that code to decide whether this is a broken
implementation, a bug in glxSwapBuffers() (there might be other calls), broken
by design, or deliberately chosen this way and actually correct.

So the only way is to scan all contexts known for this client for references
to this buffer (and the corresponding private structure) when the buffer is
actually destroyed. This is done in __glXMesaDrawableDestroy(), which has
access to all needed data - all but one.

Scanning all contexts only needs the client id (which is known), but the API
call FindClientResourcesByType() actually needs a ClientPtr - just to extract
the client id! The ClientPtr is only known in glxext.c in a static array
__glXClients. I don't know of a different API to get the ClientPtr from, so I
exported it. One other possibility would be to add a ClientPtr fetch call to
glxext.c, another one to create a new FindClientIDResourcesByType(), another
one to create a Client structure with just the id included (ugh!) just for
this API call. Making the array extern was the simplest thing to do, but is
not necessarily the right solution.

There is probably a better / right possibility to remove all pointers to
the buffer (MakeCurrent2?) - but I don't have a better idea ATM.

As I consider this a workaround / safeguard for some deeply involved buffer
handling bug, I added a lot of ErrorF()s that indicate these failures. If
removing these references at this point is the only valid solution, I'd gladly
remove these. As the scanning is only invoked on buffer destroys and doesn't
do a lot of scanning, performance is not an issue here.


This patch is against xserver git (applies to older versions as well).
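
The scan-on-destroy idea described in this comment can be illustrated with a small sketch. It assumes a flat array of contexts instead of the server's resource lookup (FindClientResourcesByType() is not used here), and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a drawable's buffer and a GLX context. */
struct sketch_buffer { int id; };

struct sketch_context {
    struct sketch_buffer *draw_buffer;
    struct sketch_buffer *read_buffer;
};

/* Clear every pointer to 'doomed' held by the given contexts.
 * Returns the number of pointers that had to be cleaned up. */
int sketch_clear_stale_refs(struct sketch_context *ctxs, size_t n,
                            const struct sketch_buffer *doomed)
{
    int cleared = 0;

    for (size_t i = 0; i < n; i++) {
        if (ctxs[i].draw_buffer == doomed) {
            fprintf(stderr, "Cleaning up stray draw pointer in context %zu\n", i);
            ctxs[i].draw_buffer = NULL;
            cleared++;
        }
        if (ctxs[i].read_buffer == doomed) {
            fprintf(stderr, "Cleaning up stray read pointer in context %zu\n", i);
            ctxs[i].read_buffer = NULL;
            cleared++;
        }
    }
    return cleared;
}

/* Destroy path: drop all references first, then free the buffer. */
void sketch_destroy_buffer(struct sketch_context *ctxs, size_t n,
                           struct sketch_buffer *buf)
{
    sketch_clear_stale_refs(ctxs, n, buf);
    free(buf);
}
```

The diagnostics play the same role as the ErrorF()s mentioned above: if they ever print, some other layer forgot to drop its reference.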
Comment 13 Matthias Hopf 2007-01-23 11:51:41 UTC
Please test whether one of the patches helps in your cases.
Comment 14 Matthias Hopf 2007-01-25 08:13:53 UTC
Created attachment 8501
Fixed patch for cleaning stray pointers

Forgot to check for potential NULL pointers. I think they shouldn't occur, but they do...
Comment 15 Matthias Hopf 2007-02-20 08:20:52 UTC
Brian, could you take a short (ahem!) look at this, as we discussed at XDC? Thanks!
Comment 16 Brian Paul 2007-02-21 07:22:29 UTC
I'll try to test this myself within the next few days.

I see that your first patch for "resize()" touches the FXgetImage() function which is only used with 3Dfx/glide (probably not relevant).

In the second patch, replace "Destory" with "Destroy".  But I haven't otherwise tested the patch.  Mesa's gl_framebuffer objects have a refcount field that should prevent dangling pointers, but perhaps that's failing somehow.

Finally, is Blender the best test case or are there other (smaller/simpler) ways of reproducing the bug?
Comment 17 mikleh 2007-02-22 12:56:22 UTC
I tried this (http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=commit;h=a1a0a29a5ad93be00989881055931e78941304a5) patch against mesa 6.5.1 and the Blender crashes disappeared. Now I'm using mesa 6.5.2 from Debian experimental and Blender works fine.
Comment 18 Matthias Hopf 2007-02-23 03:53:26 UTC
(In reply to comment #16)
> I'll try to test this myself within the next few days.

Thanks a lot, Brian!

> I see that your first patch for "resize()" touches the FXgetImage() function
> which is only used with 3Dfx/glide (probably not relevant).

Probably not, it just accessed ctx without checking for NULL (just like XMesaResizeBuffers in the same patch) so I updated it as well.

> 
> In the second patch, replace "Destory" with "Destroy".  But I haven't otherwise

Eeek :)

> tested the patch.  Mesa's gl_framebuffer objects have a refcount field that
> should prevent dangling pointers, but perhaps that's failing somehow.

Probably. This is a patch for the symptoms only, not for the real reason...

> Finally, is Blender the best test case or are there other (smaller/simpler)
> ways of reproducing the bug?

I tested with Matlab, because that reproduced the crash most reliably for me, but that is closed source software...
We have a ton of similar bug reports, with various applications, you can find the closed duplicate messages in our bugzilla.
Comment 19 Matthias Hopf 2007-02-26 07:01:00 UTC
The patch in the git repository doesn't help with our issues, so they're probably unrelated. I'll try to create a simple test case.
Comment 20 Matthias Hopf 2007-02-27 06:32:23 UTC
Brian mailed me that he rewrote the buffer deletion code, but couldn't submit a message to bugzilla due to X.org bugzilla server issues.

AFAICS this is about git commits e6a9381f78605072cab52447fce35eaa98c1e75c and 928a70e4354d4884e2918ec67ddc6d8baf942c8a.

I'm just testing these patches against the Mesa version we use in our products.
Comment 21 Matthias Hopf 2007-02-27 07:42:04 UTC
Unfortunately, the patches do *not* fix the problem (at least when applied to 6.5.1). Still evaluating.
Comment 22 Matthias Hopf 2007-02-27 07:45:04 UTC
Created attachment 8884
Short test case for this issue

This is the smallest test program I found so far that exhibits this problem, written by Robert Schweikert.

It doesn't crash on my system (it does on Robert's), but when I apply my stray pointer cleanup patch, it tells me that a *lot* of pointers aren't cleaned up correctly.
Comment 23 Brian Paul 2007-02-27 08:57:40 UTC
I guess I'm not 100% clear on the issue at this point.  Is Blender still crashing sometimes?  Do any apps besides MatLab trigger the bug?

Are people using the Mesa git head, or a patched version of 6.5.1?

The 8884 test program works for me here, btw.
Comment 24 Matthias Hopf 2007-02-27 09:39:37 UTC
matlab still crashes with git commits a1a0a29a5ad93be00989881055931e78941304a5, e6a9381f78605072cab52447fce35eaa98c1e75c, 928a70e4354d4884e2918ec67ddc6d8baf942c8a, and my patch in attachment #8487 applied. The crash looks like before: a framebuffer object is addressed which has obviously already been freed (because it is filled with obviously wrong pointers and values).

The attached test case never crashed for me (only for the contributor), but exhibits the same stray pointers when I use my patch from attachment #8501.

I never tested blender, but at least one comment here indicates that git commit a1a0a29a5ad93be00989881055931e78941304a5 fixed this issue. I'm waiting for verification by one of our customers who had the same issue.


This is all for Mesa 6.5.1. I'm currently building an Xserver with current git Mesa (which is not trivial because it has to run on SuSE 10.2 ;-) to verify whether, together with some additional changes, it fixes the problem. I also wanted to verify the test case with the server running under valgrind.
Comment 25 Brian Paul 2007-02-27 10:15:42 UTC
OK, If we're deleting a framebuffer object while someone is still pointing to it, this assertion might help catch it:

--- a/src/mesa/main/framebuffer.c
+++ b/src/mesa/main/framebuffer.c
@@ -216,6 +216,7 @@ _mesa_free_framebuffer_data(struct gl_fr
    GLuint i;
 
    assert(fb);
+   assert(fb->RefCount == 0);
 
    _glthread_DESTROY_MUTEX(fb->Mutex);

See if that does anything.

Also, I don't see a stack trace from the MatLab crash.  Can you get one?

I looked at downloading a MatLab demo but it looks like a PITA.
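
To make the intent of that assertion concrete, here is a minimal reference-counting sketch. It is not Mesa's framebuffer code: `struct sketch_fb`, the helper names and the `sketch_fb_deleted` counter are invented for illustration, and the mutex is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical framebuffer with a reference count. */
struct sketch_fb {
    int ref_count;
};

int sketch_fb_deleted = 0;  /* counts framebuffers actually freed */

struct sketch_fb *sketch_fb_new(void)
{
    /* ref_count starts at 0; each holder takes a reference below. */
    return calloc(1, sizeof(struct sketch_fb));
}

/* Store fb in an empty slot and account for the new holder. */
void sketch_fb_reference(struct sketch_fb **slot, struct sketch_fb *fb)
{
    assert(slot != NULL && fb != NULL);
    assert(*slot == NULL);      /* caller must unreference the slot first */
    fb->ref_count++;
    *slot = fb;
}

static void sketch_fb_free(struct sketch_fb *fb)
{
    assert(fb != NULL);
    assert(fb->ref_count == 0); /* freeing while still referenced is a bug */
    sketch_fb_deleted++;
    free(fb);
}

/* Drop one holder; the last one frees the object. The slot is always
 * cleared so it cannot dangle. */
void sketch_fb_unreference(struct sketch_fb **slot)
{
    assert(slot != NULL);
    if (*slot != NULL) {
        struct sketch_fb *fb = *slot;
        assert(fb->ref_count > 0);
        fb->ref_count--;
        if (fb->ref_count == 0)
            sketch_fb_free(fb);
        *slot = NULL;
    }
}
```

If some path frees the object directly instead of going through the unreference helper, the assertion in the free routine is what would fire.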
Comment 26 Matthias Hopf 2007-02-27 10:32:55 UTC
(In reply to comment #25)
> OK, If we're deleting a framebuffer object while someone is still pointing to
> it, this assertion might help catch it:

Will try that.
 
> Also, I don't see a stack trace from the MatLab crash.  Can you get one?

You can find some in https://bugzilla.novell.com/show_bug.cgi?id=211314 , comments 44 and 48. The first is from a fairly current version (but with -O2); the latter is from an older version but w/o optimization. I will create a new one with current Mesa code.

> I looked at downloading a MatLab demo but it looks like a PITA.

You don't want to do that. Believe me.
Comment 27 Matthias Hopf 2007-03-02 06:02:27 UTC
(In reply to comment #26)
> > OK, If we're deleting a framebuffer object while someone is still pointing to
> > it, this assertion might help catch it:

So far it wasn't hit for me.

(In reply to comment #24)
> The attached test case never crashed for me (only for the contributor), but
> exhibits the same stray pointers when I use my patch from attachment #8501.

Tested with current git, both Mesa and complete X environment.

Valgrind shows a few warnings when running the test case, and seems to freeze (100% CPU). It probably has to emulate and verify a lot of memory accesses, or I hit a bug in valgrind:

==15227== 
==15227== Syscall param writev(vector[...]) points to uninitialised byte(s)
==15227==    at 0x4260492: do_writev (in /lib/libc-2.5.so)
==15227==    by 0x81B5A7D: _XSERVTransSocketWritev (Xtranssock.c:2192)
==15227==    by 0x81B477E: _XSERVTransWritev (Xtrans.c:914)
==15227==    by 0x81AF969: FlushClient (io.c:1057)
==15227==    by 0x81B00F3: FlushAllOutput (io.c:809)
==15227==    by 0x808901E: Dispatch (dispatch.c:525)
==15227==    by 0x80707A4: main (main.c:469)
==15227==  Address 0x63FAB3C is 36 bytes inside a block of size 4,096 alloc'd
==15227==    at 0x40233F0: malloc (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==15227==    by 0x81B1C96: Xalloc (utils.c:1351)
==15227==    by 0x81AF596: AllocateOutputBuffer (io.c:1197)
==15227==    by 0x81AFE3F: WriteToClient (io.c:916)
==15227==    by 0x8082626: SendConnSetup (dispatch.c:3940)
==15227==    by 0x808273F: ProcEstablishConnection (dispatch.c:3988)
==15227==    by 0x8088FBE: Dispatch (dispatch.c:503)
==15227==    by 0x80707A4: main (main.c:469)
Cleaning up stray priv pointers to 0x661ef28 in context 0x63fcac0
Cleaning up stray xmesa pointers to buffer 0x6a4e5c8 in context 0x63fcac0
==15227== Warning: set address range perms: large range 134217728 (undefined)
Comment 28 Matthias Hopf 2007-03-13 09:51:37 UTC
With pretty current git (63169ce52d354b4345dcfc46b89f0ea88379718f 2007-03-08 08:20:19), the Xserver (this time the whole suite compiled with -O0) segfaults with the test case, at least on x86_64:


Program received signal SIGSEGV, Segmentation fault.
0x00002b015de121e9 in xmesa_delete_framebuffer (fb=0x1259310) at xm_buffer.c:368

#0  0x00002b015de121e9 in xmesa_delete_framebuffer (fb=0x1259310) at xm_buffer.c:368
#1  0x00002b015dbdeac6 in _mesa_unreference_framebuffer (fb=0xac0838) at framebuffer.c:292
#2  0x00002b015dc03160 in _mesa_free_context_data (ctx=0xac0660) at context.c:1416
#3  0x00002b015de1067c in XMesaDestroyContext (c=0xac0660) at xm_api.c:1586
#4  0x00002b015de00247 in __glXMesaContextDestroy (baseContext=0xab2460) at xf86glx.c:187
#5  0x00002b015b857eac in __glXFreeContext (cx=0xab2460) at glxext.c:242
#6  0x00002b015b857c21 in ContextGone (cx=0xab2460, id=2097156) at glxext.c:121
#7  0x000000000043547e in FreeResourceByType (id=2097156, type=33, skipFree=0) at resource.c:622
#8  0x00002b015b84f04d in __glXDisp_DestroyContext (cl=0xab1790, pc=0xaaf500 "\217\004\002") at glxcmds.c:288
#9  0x00002b015b858590 in __glXDispatch (client=0xaaf100) at glxext.c:551
#10 0x0000000000564941 in XaceCatchExtProc (client=0xaaf100) at xace.c:299
#11 0x000000000044bd41 in Dispatch () at dispatch.c:503
#12 0x0000000000432949 in main (argc=1, argv=0x7fff5145ce08, envp=0x7fff5145ce18) at main.c:467


frame #0:  xm_buffer.c:368

363        XMesaBuffer b = XMESA_BUFFER(fb);
364
365     #ifdef XFree86Server
366        int client = 0;
367        if (b->frontxrb->drawable)
368            client = CLIENT_ID(b->frontxrb->drawable->id);
369     #endif
370
371        if (b->num_alloced > 0) {
372           /* If no other buffer uses this X colormap then free the colors. */


frame #1:   framebuffer.c:292

287           (*fb)->RefCount--;
288           deleteFlag = ((*fb)->RefCount == 0);
289           _glthread_UNLOCK_MUTEX((*fb)->Mutex);
290           
291           if (deleteFlag)
292              (*fb)->Delete(*fb);
293
294           *fb = NULL;
295        }
296     }


(gdb) print b->frontxrb->drawable
$2 = (XMesaDrawable) 0x2b015e136010
(gdb) print b->frontxrb->drawable->id
Cannot access memory at address 0x2b015e136018

(gdb) print b->frontxrb
$10 = (struct xmesa_renderbuffer *) 0xfd1620
(gdb) print *b->frontxrb
$11 = {Base = {Mutex = 0, ClassID = 0, Name = 0, RefCount = 2, Width = 4096, Height = 4096, InternalFormat = 6408, 
    _ActualFormat = 0, _BaseFormat = 6408, DataType = 5121, RedBits = 5 '\005', GreenBits = 6 '\006', 
    BlueBits = 5 '\005', AlphaBits = 8 '\b', IndexBits = 0 '\0', DepthBits = 0 '\0', StencilBits = 0 '\0', Data = 0x0, 
    Wrapped = 0xfd1620, Delete = 0x2b015de11e11 <xmesa_delete_renderbuffer>, 
    AllocStorage = 0x2b015de11e28 <xmesa_alloc_front_storage>, GetPointer = 0x2b015dc397e5 <nop_get_pointer>, 
    GetRow = 0x2b015ddae8ec <get_row_rgba>, GetValues = 0x2b015ddb0346 <get_values_rgba>, 
    PutRow = 0x2b015dd97d13 <put_row_DITHER_5R6G5B_pixmap>, 
    PutRowRGB = 0x2b015dd983c5 <put_row_rgb_DITHER_5R6G5B_pixmap>, 
    PutMonoRow = 0x2b015dda9b29 <put_mono_row_TRUEDITHER_pixmap>, 
    PutValues = 0x2b015dda63b4 <put_values_DITHER_5R6G5B_pixmap>, 
    PutMonoValues = 0x2b015ddabc56 <put_mono_values_TRUEDITHER_pixmap>}, Parent = 0x1259310, 
  drawable = 0x2b015e136010, pixmap = 0x2b015e136010, ximage = 0x0, origin1 = 0x0, width1 = 0, origin2 = 0x0, 
  width2 = 0, origin3 = 0x0, width3 = 0, origin4 = 0x0, width4 = 0, bottom = 4095, 
  clearFunc = 0x2b015de00eeb <clear_pixmap>}
Comment 29 Matthias Hopf 2007-03-13 10:04:08 UTC
Actually, the Xserver crashes now even with DRI loaded (but unused, haven't tested with DRI used yet). So unfortunately, we have a regression now. :-(
Comment 30 Brian Paul 2007-03-14 10:54:10 UTC
I guess the b->frontxrb->drawable field is invalid by the time we get there.

I've checked in two changes to the Mesa xlib driver that should help with this.  Give it a try.
Comment 31 Matthias Hopf 2007-03-15 08:55:10 UTC
Situation improves, but isn't fixed yet.
Next crash with current mesa+xserver git:

Program received signal SIGSEGV, Segmentation fault.
265              dst[i] = val;

#0  0x00002b28d91e0f2a in put_mono_row_ushort (ctx=0xbe53e0, rb=0x10dff90, count=0x1000, x=0x0, y=0x0, value=0x7fffd5eb233a, mask=0x0) at renderbuffer.c:265
#1  0x00002b28d929bcc3 in _swrast_clear_depth_buffer (ctx=0xbe53e0, rb=0x10dff90) at s_depth.c:1415
#2  0x00002b28d92988a3 in _swrast_Clear (ctx=0xbe53e0, buffers=0x101) at s_buffers.c:334
#3  0x00002b28d93abc09 in clear_buffers (ctx=0xbe53e0, buffers=0x101) at xm_dd.c:424
#4  0x00002b28d91e5ef2 in _mesa_Clear (mask=0x4100) at buffers.c:176
#5  0x00002b28d6e06a00 in __glXDisp_Clear (pc=0xab869c "") at indirect_dispatch.c:1337
#6  0x00002b28d6dfd1b6 in DoRender (cl=0xab96a0, pc=0xab8698 "\b", do_swap=0x0) at glxcmds.c:1802
#7  0x00002b28d6dfd20c in __glXDisp_Render (cl=0xab96a0, pc=0xab8690 "\217\001\004") at glxcmds.c:1816
#8  0x00002b28d6e03590 in __glXDispatch (client=0xab82a0) at glxext.c:551
#9  0x0000000000567195 in XaceCatchExtProc (client=0xab82a0) at xace.c:299
#10 0x000000000044c061 in Dispatch () at dispatch.c:503
#11 0x0000000000432c69 in main (argc=0x1, argv=0x7fffd5eb2b48, envp=0x7fffd5eb2b58) at main.c:467


renderbuffer.c:265 :
262        else {
263           GLuint i;
264           for (i = 0; i < count; i++) {
265              dst[i] = val;
266           }
267        }

(gdb) print dst
$1 = (GLushort *) 0x0
(gdb) print *rb
$3 = {Mutex = 0x0, ClassID = 0x0, Name = 0x0, RefCount = 0x3, Width = 0x0, Height = 0x0, InternalFormat = 0x81a5, _ActualFormat = 0x81a5, _BaseFormat = 0x1902, DataType = 0x1403, RedBits = 0x0, GreenBits = 0x0, BlueBits = 0x0, AlphaBits = 0x0, IndexBits = 0x0, DepthBits = 0x10, StencilBits = 0x0, Data = 0x0, Wrapped = 0x10dff90, Delete = 0x2b28d91e3c1c <_mesa_delete_renderbuffer>, AllocStorage = 0x2b28d91e286f <_mesa_soft_renderbuffer_storage>, GetPointer = 0x2b28d91e0c7d <get_pointer_ushort>, GetRow = 0x2b28d91e0cdc <get_row_ushort>, GetValues = 0x2b28d91e0d2c <get_values_ushort>, PutRow = 0x2b28d91e0dc4 <put_row_ushort>, PutRowRGB = 0, PutMonoRow = 0x2b28d91e0e7a <put_mono_row_ushort>, PutValues = 0x2b28d91e0f3b <put_values_ushort>, PutMonoValues = 0x2b28d91e0fe5 <put_mono_values_ushort>}
Comment 32 Brian Paul 2007-03-15 09:12:48 UTC
OK, I've checked in some new null ptr checks.  Though, I might add some higher-level checks to see if we're being asked to clear a buffer with width=height=0 and no-op it...
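
A higher-level check of the sort mentioned here might look roughly like the sketch below. It assumes a purely in-memory buffer of one byte per pixel; `struct sketch_rb` and `sketch_clear()` are hypothetical and not what was checked in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory renderbuffer, one byte per pixel. */
struct sketch_rb {
    unsigned width, height;
    unsigned char *data;
};

/* Clear the whole buffer to 'value'. Returns the number of bytes written. */
size_t sketch_clear(struct sketch_rb *rb, unsigned char value)
{
    size_t n;

    /* A zero-sized buffer has nothing to clear: no-op before touching data. */
    if (rb == NULL || rb->width == 0 || rb->height == 0)
        return 0;

    /* A sized in-memory buffer without storage would be a real bug. */
    assert(rb->data != NULL);

    n = (size_t)rb->width * rb->height;
    memset(rb->data, value, n);
    return n;
}
```

Checking the size first keeps the width=height=0 case harmless while still letting a genuinely missing allocation show up loudly.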
Comment 33 Matthias Hopf 2007-03-16 05:34:16 UTC
This seems to be an endless endeavour ;-)

Program received signal SIGSEGV, Segmentation fault.
#0  0x00002b5559c362fb in memset () from /lib64/libc.so.6
#1  0x00002b556c11b960 in _mesa_memset (dst=0x0, val=0x0, n=0x1000) at imports.c:248
#2  0x00002b556c18f856 in put_mono_row_alpha8 (ctx=0xb17520, arb=0x1028b30, count=0x1000, x=0x0, y=0x0, value=0x7fff522a1710, mask=0x0) at renderbuffer.c:1387
#3  0x00002b556c244591 in clear_rgba_buffer (ctx=0xb17520, rb=0x1028b30) at s_buffers.c:189
#4  0x00002b556c244806 in clear_color_buffers (ctx=0xb17520) at s_buffers.c:281
#5  0x00002b556c2448ac in _swrast_Clear (ctx=0xb17520, buffers=0x101) at s_buffers.c:331
#6  0x00002b556c357c7d in clear_buffers (ctx=0xb17520, buffers=0x101) at xm_dd.c:424
#7  0x00002b556c191ef5 in _mesa_Clear (mask=0x4100) at buffers.c:179
#8  0x00002b555aa14a00 in __glXDisp_Clear (pc=0xb066cc "") at indirect_dispatch.c:1337
#9  0x00002b555aa0b1b6 in DoRender (cl=0xb089a0, pc=0xb066c8 "\b", do_swap=0x0) at glxcmds.c:1802
#10 0x00002b555aa0b20c in __glXDisp_Render (cl=0xb089a0, pc=0xb066c0 "\217\001\004") at glxcmds.c:1816
#11 0x00002b555aa11590 in __glXDispatch (client=0xb062a0) at glxext.c:551
#12 0x00000000005671d1 in XaceCatchExtProc (client=0xb062a0) at xace.c:299
#13 0x000000000044c061 in Dispatch () at dispatch.c:503
#14 0x0000000000432c69 in main (argc=0x1, argv=0x7fff522a1f38, envp=0x7fff522a1f48) at main.c:467

(gdb) frame 2
#2  0x00002b556c18f856 in put_mono_row_alpha8 (ctx=0xb17520, arb=0x1028b30, count=0x1000, x=0x0, y=0x0, value=0x7fff522a1710, mask=0x0) at renderbuffer.c:1387
1387          _mesa_memset(dst, val, count);
(gdb) print dst
$1 = (GLubyte *) 0x0


Now with the following patch the test program doesn't crash any longer:

diff --git a/src/mesa/swrast/s_buffers.c b/src/mesa/swrast/s_buffers.c
index 35f2dd6..d3bf6bf 100644
--- a/src/mesa/swrast/s_buffers.c
+++ b/src/mesa/swrast/s_buffers.c
@@ -273,6 +273,8 @@ clear_color_buffers(GLcontext *ctx)
 
    for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers[0]; i++) {
       struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0][i];
+      if (!rb || !rb->Data)
+         continue;
       if (ctx->Visual.rgbMode) {
          if (masking) {
             clear_rgba_buffer_with_masking(ctx, rb);

So please consider applying.

Now testing matlab.

Comment 34 Matthias Hopf 2007-03-16 06:05:33 UTC
matlab still crashes:

xm_api.c:2502       xmesa_check_and_update_buffer_size(xmctx, b);

(gdb) bt
#0  0x00002af3d4ae0d2a in xmesa_check_and_update_buffer_size (xmctx=0x0, drawBuffer=0x11218c0) at xm_api.c:1850
#1  0x00002af3d4ae1cfd in XMesaResizeBuffers (b=0x11218c0) at xm_api.c:2502
#2  0x00002af3d4ad04aa in __glXMesaDrawableResize (base=0x1114480) at xf86glx.c:113
#3  0x00002af3c318c5a9 in PositionWindow (pWin=0xbe5ae0, x=0xff, y=0x4d) at glxscreens.c:230
#4  0x000000000043a917 in ResizeChildrenWinSize (pWin=0xbe3420, dx=0x0, dy=0x0, dw=0x0, dh=0xfffffff4) at window.c:1859
#5  0x00000000005423e0 in miSlideAndSizeWindow (pWin=0xbe3420, x=0xff, y=0x17, w=0x200, h=0x1aa, pSib=0x10dc1e0) at miwindow.c:728
#6  0x000000000043bf83 in ConfigureWindow (pWin=0xbe3420, mask=0x8, vlist=0x11574f0, client=0xbafd80) at window.c:2485
#7  0x000000000044cb4e in ProcConfigureWindow (client=0xbafd80) at dispatch.c:808
#8  0x0000000000567115 in XaceCatchDispatchProc (client=0xbafd80) at xace.c:281
#9  0x000000000044c061 in Dispatch () at dispatch.c:503
#10 0x0000000000432c69 in main (argc=0x1, argv=0x7fffe9b2a7b8, envp=0x7fffe9b2a7c8) at main.c:467


The following patch fixes this (apparently last) issue:

diff --git a/src/mesa/drivers/x11/xm_api.c b/src/mesa/drivers/x11/xm_api.c
index cbbbd56..ba020fc 100644
--- a/src/mesa/drivers/x11/xm_api.c
+++ b/src/mesa/drivers/x11/xm_api.c
@@ -2499,6 +2499,8 @@ XMesaResizeBuffers( XMesaBuffer b )
 {
    GET_CURRENT_CONTEXT(ctx);
    XMesaContext xmctx = XMESA_CONTEXT(ctx);
+   if (!xmctx)
+      return;
    xmesa_check_and_update_buffer_size(xmctx, b);
 }
 

Will have to test against older products, but as far as I can see, with these two additional patches the issue is fixed for me :^)

Thanks, Brian!
Comment 35 Brian Paul 2007-03-16 06:34:52 UTC
The fix in comment #34 is OK, but the change in comment #33 breaks things.

I just tried a test case and confirmed that the fix in #33 is incorrect.  I'll have to double-check my !rb->Data changes from Mar 15 as well.

Checking rb->Data != NULL isn't the correct way to tell if a renderbuffer is valid since some renderbuffer types may never use the rb->Data pointer.  An example is the front renderbuffer for the normal xlib driver: the rb->Data pointer isn't used since we use an X Drawable handle instead to point to the window we're drawing into.

In this code:

    for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers[0]; i++) {
       struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0][i];
+      if (!rb || !rb->Data)
+         continue;

can you tell me if rb or rb->Data is null?

If rb is non-null, can you print *rb in gdb?  Also, what is the value of the array index 'i'?

Thanks.
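
To illustrate why a NULL data pointer alone says little about validity, here is a sketch in which each renderbuffer records where its pixels live. The `kind` tag is purely an assumption for the example (nothing in this thread says Mesa's gl_renderbuffer has such a field), as are all the names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag saying where a renderbuffer's pixels live. */
enum sketch_storage {
    SKETCH_STORAGE_MEMORY,    /* malloc'ed pixels, reached through 'data' */
    SKETCH_STORAGE_DRAWABLE   /* an X window/pixmap, reached through a handle */
};

struct sketch_rb {
    enum sketch_storage kind;
    void *data;               /* only meaningful for SKETCH_STORAGE_MEMORY */
    unsigned long drawable;   /* only meaningful for SKETCH_STORAGE_DRAWABLE */
};

/* Returns 1 if the renderbuffer has something to draw into, else 0. */
int sketch_rb_has_target(const struct sketch_rb *rb)
{
    if (rb == NULL)
        return 0;

    switch (rb->kind) {
    case SKETCH_STORAGE_MEMORY:
        return rb->data != NULL;
    case SKETCH_STORAGE_DRAWABLE:
        return rb->drawable != 0;   /* data is legitimately NULL here */
    }
    return 0;
}
```

A front buffer backed by a drawable passes this check with data == NULL, which is exactly the case a blanket `!rb->Data` test would wrongly skip.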
Comment 36 Brian Paul 2007-03-16 06:43:36 UTC
Additionally, could you also (at the same point) print *ctx->DrawBuffer and *((struct xmesa_renderbuffer *)rb) ?  Thanks.
Comment 37 Matthias Hopf 2007-03-16 07:02:23 UTC
(In reply to comment #35)
> I just tried a test case and confirmed that the fix in #33 is incorrect.  I'll
> have to double-check my !rb->Data changes from Mar 15 as well.
> Checking rb->Data != NULL isn't the correct way to tell if a renderbuffer is
> valid since some renderbuffer types may never use the rb->Data pointer.  An

Ok. Yes, I see, I missed one indirect function call, and moved up too far in the stack.

Basically,

diff --git a/src/mesa/main/renderbuffer.c b/src/mesa/main/renderbuffer.c
index 1cc95a7..736c8e8 100644
--- a/src/mesa/main/renderbuffer.c
+++ b/src/mesa/main/renderbuffer.c
@@ -1374,6 +1374,8 @@ put_mono_row_alpha8(GLcontext *ctx, stru
    ASSERT(arb->DataType == GL_UNSIGNED_BYTE);
    /* first, pass the call to the wrapped RGB buffer */
    arb->Wrapped->PutMonoRow(ctx, arb->Wrapped, count, x, y, value, mask);
+   if (!arb->Data)
+      return;
    /* second, store alpha in our buffer */
    if (mask) {
       GLuint i;

will have the right effect, but it still might obscure a bug in a layer above.

> In this code:
> 
>     for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers[0]; i++) {
>        struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0][i];
> +      if (!rb || !rb->Data)
> +         continue;
> 
> can you tell me if rb or rb->Data is null?

The backtrace already indicated:

#3  0x00002b556c244591 in clear_rgba_buffer (ctx=0xb17520, rb=0x1028b30) at s_buffers.c:189

rb is not null, but rb->Data is.

> If rb is non-null, can you print *rb in gdb?  Also, what is the value of the
> array index 'i'?

Will do. i, AFAIR, was 0.
Comment 38 Matthias Hopf 2007-03-16 08:09:36 UTC
I reverted the change of comment #33, and interestingly matlab didn't crash - just the test case. 

Anyway, the information for you:

(gdb) bt
#0  0x00002b2f8fb632fb in memset () from /lib64/libc.so.6
#1  0x00002b2f92caa960 in _mesa_memset (dst=0x0, val=0x0, n=0x1000) at imports.c:248
#2  0x00002b2f92d1e856 in put_mono_row_alpha8 (ctx=0x10df7a0, arb=0x1153dc0, count=0x1000, x=0x0, y=0x0, value=0x7fff1c3747e0, mask=0x0) at renderbuffer.c:1387
#3  0x00002b2f92dd3591 in clear_rgba_buffer (ctx=0x10df7a0, rb=0x1153dc0) at s_buffers.c:189
#4  0x00002b2f92dd3806 in clear_color_buffers (ctx=0x10df7a0) at s_buffers.c:281
#5  0x00002b2f92dd38ac in _swrast_Clear (ctx=0x10df7a0, buffers=0x101) at s_buffers.c:331
#6  0x00002b2f92ee6c7d in clear_buffers (ctx=0x10df7a0, buffers=0x101) at xm_dd.c:424
[...]

(gdb) frame 4
#4  0x00002b2f92dd3806 in clear_color_buffers (ctx=0x10df7a0) at s_buffers.c:281
281                 clear_rgba_buffer(ctx, rb);
(gdb) print rb
$2 = (struct gl_renderbuffer *) 0x1153dc0
(gdb) print i
$3 = 0x0

(gdb) print *rb
$4 = {Mutex = 0x0, ClassID = 0x0, Name = 0x0, RefCount = 0x2, Width = 0x0, Height = 0x0, InternalFormat = 0x1908, _ActualFormat = 0x803c, _BaseFormat = 0x1908, DataType = 0x1401, RedBits = 0x0, GreenBits = 0x0, BlueBits = 0x0, AlphaBits = 0x0, IndexBits = 0x0, DepthBits = 0x0, StencilBits = 0x0, Data = 0x0, Wrapped = 0x12c34a0, Delete = 0x2b2f92d1e3b6 <delete_renderbuffer_alpha8>, AllocStorage = 0x2b2f92d1e2cc <alloc_storage_alpha8>, GetPointer = 0x2b2f92d1e409 <get_pointer_alpha8>, GetRow = 0x2b2f92d1e422 <get_row_alpha8>, GetValues = 0x2b2f92d1e4de <get_values_alpha8>, PutRow = 0x2b2f92d1e5b0 <put_row_alpha8>, PutRowRGB = 0x2b2f92d1e691 <put_row_rgb_alpha8>, PutMonoRow = 0x2b2f92d1e772 <put_mono_row_alpha8>, PutValues = 0x2b2f92d1e85d <put_values_alpha8>, PutMonoValues = 0x2b2f92d1e955 <put_mono_values_alpha8>}

(gdb) print *ctx->DrawBuffer
$5 = {Mutex = 0x0, Name = 0x0, RefCount = 0x5, DeletePending = 0x0, Visual = {next = 0x0, rgbMode = 0x1, floatMode = 0x0, colorIndexMode = 0x0, doubleBufferMode = 0x0, stereoMode = 0x0, haveAccumBuffer = 0x1, haveDepthBuffer = 0x1, haveStencilBuffer = 0x1, redBits = 0x5, greenBits = 0x6, blueBits = 0x5, alphaBits = 0x8, redMask = 0xf800, greenMask = 0x7e0, blueMask = 0x1f, alphaMask = 0x0, rgbBits = 0x10, indexBits = 0x0, accumRedBits = 0x10, accumGreenBits = 0x10, accumBlueBits = 0x10, accumAlphaBits = 0x10, depthBits = 0x10, stencilBits = 0x8, numAuxBuffers = 0x0, level = 0x0, pixmapMode = 0x0, visualID = 0x25, visualType = 0x8002, visualRating = 0x8000, transparentPixel = 0x0, transparentRed = 0x0, transparentGreen = 0x0, transparentBlue = 0x0, transparentAlpha = 0x0, transparentIndex = 0x0, sampleBuffers = 0x0, samples = 0x0, drawableType = 0x0, renderType = 0x0, xRenderable = 0x0, fbconfigID = 0x0, maxPbufferWidth = 0x0, maxPbufferHeight = 0x0, maxPbufferPixels = 0x0, optimalPbufferWidth = 0x0, optimalPbufferHeight = 0x0, visualSelectGroup = 0x0, swapMethod = 0x0, screen = 0x0, bindToTextureRgb = 0x0, bindToTextureRgba = 0x0, bindToMipmapTexture = 0x0, bindToTextureTargets = 0x0, yInverted = 0x0}, Initialized = 0x1, Width = 0x1000, Height = 0x1000, _Xmin = 0x0, _Xmax = 0x1000, _Ymin = 0x0, _Ymax = 0x1000, _DepthMax = 0xffff, _DepthMaxF = 65535, _MRD = 1, _Status = 0x8cd5, Attachment = {{Type = 0x8d41, Complete = 0x1, Renderbuffer = 0x1153dc0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 
0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x8d41, Complete = 0x1, Renderbuffer = 0x12c35a0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x8d41, Complete = 0x1, Renderbuffer = 0x12c3640, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x8d41, Complete = 0x1, Renderbuffer = 0x12c36e0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}, {Type = 0x0, Complete = 0x0, Renderbuffer = 0x0, Texture = 0x0, TextureLevel = 0x0, CubeMapFace = 0x0, Zoffset = 0x0}}, ColorDrawBuffer = {0x404}, ColorReadBuffer = 0x404, _ColorDrawBufferMask = {0x1}, _ColorReadBufferIndex = 0x0, _NumColorDrawBuffers = {0x1}, _ColorDrawBuffers = {{0x1153dc0, 0x0, 0x0, 0x0}}, _ColorReadBuffer = 0x1153dc0, _DepthBuffer = 0x12c35a0, _StencilBuffer = 0x12c3640, Delete = 
0x2b2f92ef75b4 <xmesa_delete_framebuffer>}

(gdb) print *((struct xmesa_renderbuffer *)rb)
$6 = {Base = {Mutex = 0x0, ClassID = 0x0, Name = 0x0, RefCount = 0x2, Width = 0x0, Height = 0x0, InternalFormat = 0x1908, _ActualFormat = 0x803c, _BaseFormat = 0x1908, DataType = 0x1401, RedBits = 0x0, GreenBits = 0x0, BlueBits = 0x0, AlphaBits = 0x0, IndexBits = 0x0, DepthBits = 0x0, StencilBits = 0x0, Data = 0x0, Wrapped = 0x12c34a0, Delete = 0x2b2f92d1e3b6 <delete_renderbuffer_alpha8>, AllocStorage = 0x2b2f92d1e2cc <alloc_storage_alpha8>, GetPointer = 0x2b2f92d1e409 <get_pointer_alpha8>, GetRow = 0x2b2f92d1e422 <get_row_alpha8>, GetValues = 0x2b2f92d1e4de <get_values_alpha8>, PutRow = 0x2b2f92d1e5b0 <put_row_alpha8>, PutRowRGB = 0x2b2f92d1e691 <put_row_rgb_alpha8>, PutMonoRow = 0x2b2f92d1e772 <put_mono_row_alpha8>, PutValues = 0x2b2f92d1e85d <put_values_alpha8>, PutMonoValues = 0x2b2f92d1e955 <put_mono_values_alpha8>}, Parent = 0x0, drawable = 0x31, pixmap = 0x12c3770, ximage = 0x2b2f8fe2ba10, origin1 = 0xf00000002 <Address 0xf00000002 out of bounds>, width1 = 0x0, origin2 = 0x30, width2 = 0x100, origin3 = 0x0, width3 = 0x0, origin4 = 0x100000001000, width4 = 0x1908, bottom = 0x0, clearFunc = 0x140100001908}

Some more structures:
(gdb) print *((struct xmesa_renderbuffer *)rb)->pixmap
$8 = {drawable = {type = 0x0, class = 0x0, depth = 0x0, bitsPerPixel = 0x0, pad0 = 0x0, id = 0x31, pad1 = 0x0, x = 0x3410, y = 0xbf, width = 0x0, height = 0x0, pScreen = 0x1153e50, serialNumber = 0xf00000002}, refcnt = 0x0, devKind = 0x0, devPrivate = {ptr = 0x30, val = 0x30, uval = 0x30, fptr = 0x30}, devPrivates = 0xa0, screen_x = 0x0, screen_y = 0x0}
(gdb) print *((struct xmesa_renderbuffer *)rb)->ximage
$9 = {width = 0x8fe2ba00, height = 0x2b2f, data = 0x2b2f8fe2ba00 "�\217/+", bytes_per_line = 0x1153e50, bits_per_pixel = 0x0}
(gdb) print *rb->Wrapped 
$12 = {Mutex = 0x0, ClassID = 0x0, Name = 0x0, RefCount = 0x2, Width = 0x1000, Height = 0x1000, InternalFormat = 0x1908, _ActualFormat = 0x0, _BaseFormat = 0x1908, DataType = 0x1401, RedBits = 0x5, GreenBits = 0x6, BlueBits = 0x5, AlphaBits = 0x8, IndexBits = 0x0, DepthBits = 0x0, StencilBits = 0x0, Data = 0x0, Wrapped = 0x12c34a0, Delete = 0x2b2f92ef7225 <xmesa_delete_renderbuffer>, AllocStorage = 0x2b2f92ef723c <xmesa_alloc_front_storage>, GetPointer = 0x2b2f92d1ea45 <nop_get_pointer>, GetRow = 0x2b2f92e93cdc <get_row_rgba>, GetValues = 0x2b2f92e95736 <get_values_rgba>, PutRow = 0x2b2f92e7d103 <put_row_DITHER_5R6G5B_pixmap>, PutRowRGB = 0x2b2f92e7d7b5 <put_row_rgb_DITHER_5R6G5B_pixmap>, PutMonoRow = 0x2b2f92e8ef19 <put_mono_row_TRUEDITHER_pixmap>, PutValues = 0x2b2f92e8b7a4 <put_values_DITHER_5R6G5B_pixmap>, PutMonoValues = 0x2b2f92e91046 <put_mono_values_TRUEDITHER_pixmap>}

Comment 39 Brian Paul 2007-03-16 09:03:41 UTC
Thanks for the detailed debug info.  I've checked in a number of changes to git that fix issues related to frame/renderbuffer resizing and dealing with renderbuffers whose size is 0 by 0.

Not 100% sure they'll solve the crash cases but they should be another step in the right direction.  In my own testing this solved a few issues.
Comment 40 Papadakos Panagiotis 2007-03-16 15:44:43 UTC
(In reply to comment #39)
> Thanks for the detailed debug info.  I've checked in a number of changes to git
> that fix issues related to frame/renderbuffer resizing and dealing with
> renderbuffers whose size is 0 by 0.
> 
> Not 100% sure they'll solve the crash cases but they should be another step in
> the right direction.  In my own testing this solved a few issues.
> 

Well, with commit http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=commit;h=4d2eb637a20e4fdf5d5f6c0ea4d4627894594661
when I rotate my camera, I see many instances of my world from different angles, probably because things are not being cleared.
Comment 41 Brian Paul 2007-03-16 15:51:43 UTC
Are you using the very latest git code with my assorted fixes for zero width/height buffers?

If you remove the change from
http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=commit;h=4d2eb637a20e4fdf5d5f6c0ea4d4627894594661
do things work properly?

Finally, can I try your app?
Comment 42 Papadakos Panagiotis 2007-03-18 08:48:40 UTC
Upgraded to latest git code and it works. Sorry for the false alarm.
(In reply to comment #41)
> Are you using the very latest git code with my assorted fixes for zero
> width/height buffers?
> 
> If you remove the change from
> http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=commit;h=4d2eb637a20e4fdf5d5f6c0ea4d4627894594661
> do things work properly?
> 
> Finally, can I try your app?
> 
Comment 43 Matthias Hopf 2007-03-19 07:25:57 UTC
With your latest fixes both matlab and the test case seem to run stably. So far so good, though we have a major memory leak now:

After running the test case the Xserver has allocated some 100MB of memory, not showing up in xrestop. This is with current git. With my patched xserver-1.2 and Mesa 6.5.2 it behaves similarly, though not identically.

During the test case, once the Xserver has used up all resources (400MB resident) after approx. 9 runs of the inner loop, it returns instantaneously from glClear() (instead of taking about 5 secs per run), without any error message. Further invocations of the test case return instantaneously as well. Only a complete server restart helps in that case.

I added some calls to glGetError(), and it is set to GL_OUT_OF_MEMORY (how come?) on glXMakeCurrent() - though that call itself doesn't actually fail.


When interrupting the test case several times, it will also fail on the next invocation with

  Error of failed request:  BadAlloc (insufficient resources for operation)
  Major opcode of failed request:  53 (X_CreatePixmap)
  Serial number of failed request:  31
  Current serial number in output stream:  33

on glXMakeCurrent().
Comment 44 Brian Paul 2007-03-20 13:38:25 UTC
Unfortunately, my DRI devel system is sidelined by a hardware problem.  In the meantime, here's a patch that adds some debug output.  Could you try that and post or send me the output?  It might give me a better idea of what's going on.  Actually, you might need to change the _mesa_printf() calls to ErrorF().  I can't remember if the former works inside the X server.

diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c
index cd4f594..2980555 100644
--- a/src/mesa/main/framebuffer.c
+++ b/src/mesa/main/framebuffer.c
@@ -264,6 +264,8 @@ _mesa_reference_framebuffer(struct gl_framebuffer **ptr,
    assert(fb);
    _glthread_LOCK_MUTEX(fb->Mutex);
    fb->RefCount++;
+   _mesa_printf("_mesa_reference_framebuffer %p, count = %d\n",
+                (void*) fb, fb->RefCount);
    _glthread_UNLOCK_MUTEX(fb->Mutex);
    *ptr = fb;
 }
@@ -285,6 +287,8 @@ _mesa_unreference_framebuffer(struct gl_framebuffer **fb)
       _glthread_LOCK_MUTEX((*fb)->Mutex);
       ASSERT((*fb)->RefCount > 0);
       (*fb)->RefCount--;
+      _mesa_printf("_mesa_unreference_framebuffer %p, count = %d\n",
+                   (void*) *fb, (*fb)->RefCount);
       deleteFlag = ((*fb)->RefCount == 0);
       _glthread_UNLOCK_MUTEX((*fb)->Mutex);
       
Comment 45 Matthias Hopf 2007-03-27 07:26:46 UTC
I just did this with breakpoints in the debugger:

unref() is called one more time than ref(), and ref() starts with a structure whose count is already set to 1. After the last unref the count is 0, which looks good. Also, the ref'ed/unref'ed framebuffers stay the same, except for the first one.

_ref   000000000111ec50 count= 2
_ref   000000000111ec50 count= 3
_ref   000000000111ec50 count= 4
_ref   000000000111ec50 count= 5
_unref 000000000111ec50 count= 4
_unref 000000000111ec50 count= 3
_unref 000000000111ec50 count= 2
_unref 000000000111ec50 count= 1
_unref 000000000111ec50 count= 0
_ref   00000000013ced10 count= 2
_ref   00000000013ced10 count= 3
_ref   00000000013ced10 count= 4
_ref   00000000013ced10 count= 5
_unref 00000000013ced10 count= 4
_unref 00000000013ced10 count= 3
_unref 00000000013ced10 count= 2
_unref 00000000013ced10 count= 1
_unref 00000000013ced10 count= 0
_ref   00000000013ced10 count= 2
_ref   00000000013ced10 count= 3
_ref   00000000013ced10 count= 4
_ref   00000000013ced10 count= 5
_unref 00000000013ced10 count= 4
_unref 00000000013ced10 count= 3
_unref 00000000013ced10 count= 2
_unref 00000000013ced10 count= 1
_unref 00000000013ced10 count= 0
_ref   00000000013ced10 count= 2
_ref   00000000013ced10 count= 3
_ref   00000000013ced10 count= 4
_ref   00000000013ced10 count= 5
_unref 00000000013ced10 count= 4
_unref 00000000013ced10 count= 3
_unref 00000000013ced10 count= 2
_unref 00000000013ced10 count= 1
_unref 00000000013ced10 count= 0
[...]

This repeats. After loop #6 (of the test case) the test gets GL_OUT_OF_MEMORY and drawing routines return fast. After approx. loop #40 the fb changes to 00000000013cf920, which holds for 6 runs and changes one last time at #46. When these changes occur seems to depend on other circumstances and is probably irrelevant.
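
For reference, the pattern in the trace (four refs, five unrefs, count ending at 0) is what one would expect if the creator holds an implicit first reference. A minimal sketch of that lifecycle follows; all names are hypothetical and none of this is the Mesa implementation, it only replays the counts seen above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted framebuffer. */
struct sketch_fb {
   int ref_count;
};

static int sketch_fb_deleted = 0;   /* number of objects freed so far */

/* A new object starts with count 1: the creator's own reference. */
static struct sketch_fb *sketch_fb_create(void)
{
   struct sketch_fb *fb = malloc(sizeof *fb);
   if (fb)
      fb->ref_count = 1;
   return fb;
}

/* Store a new reference to fb in *ptr. */
static void sketch_fb_ref(struct sketch_fb **ptr, struct sketch_fb *fb)
{
   fb->ref_count++;
   *ptr = fb;
}

/* Drop the reference held in *ptr; free the object at count 0. */
static void sketch_fb_unref(struct sketch_fb **ptr)
{
   struct sketch_fb *fb = *ptr;
   *ptr = NULL;
   if (fb && --fb->ref_count == 0) {
      free(fb);
      sketch_fb_deleted++;
   }
}

/* Replay one cycle of the trace; returns how many objects were freed. */
static int sketch_replay_cycle(void)
{
   struct sketch_fb *owner = sketch_fb_create();
   struct sketch_fb *users[4] = { NULL, NULL, NULL, NULL };
   int before = sketch_fb_deleted;
   int i;

   if (!owner)
      return -1;
   for (i = 0; i < 4; i++)
      sketch_fb_ref(&users[i], owner);   /* count goes 2, 3, 4, 5 */
   for (i = 0; i < 4; i++)
      sketch_fb_unref(&users[i]);        /* count goes 4, 3, 2, 1 */
   sketch_fb_unref(&owner);              /* count 0, object freed */
   return sketch_fb_deleted - before;
}
```

So one extra unref per cycle is balanced at this level, which suggests looking for the leak elsewhere than in the framebuffer counts themselves.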


I traced the fb deletion code a bit (xmesa_delete_framebuffer):
b->frontxrb (0x1150750)->RefCount is 2 - so it isn't deleted. This is the same on subsequent xmesa_delete_framebuffer() calls, with a different frontxrb each time.
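
A rough sketch of why a count of 2 at deletion time would leak one pixmap per loop is given below. It assumes, without proof from the traces, that some second holder takes a reference to the front renderbuffer and never releases it; the struct and function names are hypothetical and no real allocation is done.

```c
#include <assert.h>

/* Hypothetical stand-in for a front renderbuffer that owns a pixmap. */
struct sketch_rb {
   int ref_count;
   int has_pixmap;
};

static int sketch_live_pixmaps = 0;   /* pixmaps not yet released */

static void sketch_rb_init(struct sketch_rb *rb)
{
   rb->ref_count = 1;                 /* the framebuffer's own reference */
   rb->has_pixmap = 1;
   sketch_live_pixmaps++;
}

static void sketch_rb_ref(struct sketch_rb *rb)
{
   rb->ref_count++;
}

/* The pixmap is only released when the last reference goes away. */
static void sketch_rb_unref(struct sketch_rb *rb)
{
   if (--rb->ref_count == 0) {
      rb->has_pixmap = 0;
      sketch_live_pixmaps--;
   }
}

/* Run the create/delete cycle `loops` times and return the number of
 * pixmaps still alive. With forget_unref set, the second holder
 * (an assumption) never drops its reference. */
static int sketch_run(int loops, int forget_unref)
{
   int i;

   sketch_live_pixmaps = 0;
   for (i = 0; i < loops; i++) {
      struct sketch_rb front;

      sketch_rb_init(&front);
      sketch_rb_ref(&front);          /* second holder: count is now 2 */
      if (!forget_unref)
         sketch_rb_unref(&front);     /* second holder lets go */
      sketch_rb_unref(&front);        /* framebuffer deletion */
   }
   return sketch_live_pixmaps;
}
```

With balanced references nothing survives the loop; with the forgotten unref, nine loops leave nine pixmaps behind, which would fit the BadAlloc on X_CreatePixmap seen earlier.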
Comment 46 Brian Paul 2007-03-27 08:06:43 UTC
Matthias, I found and fixed a reference counting bug over the weekend. But, it was at the renderbuffer level, not framebuffer level.  So I'm not sure if that change will solve this problem.

Are you using the latest Mesa code (or at least post-Sunday)?
Comment 47 Matthias Hopf 2007-03-27 08:17:39 UTC
I'll update and check that tomorrow. I was using the same code base as for the tests before.
Comment 48 Matthias Hopf 2007-04-04 09:16:36 UTC
Tomorrow is a relative statement. :-]

Current git seems to work fine, due to 42aaa548a1020be5d40b3dce9448d8004b1ef947.
Also, speed has substantially increased.

Thanks Brian, I consider this bug closed.
Comment 49 Brian Paul 2007-04-04 11:48:46 UTC
Great!  Thanks for helping with debugging, Matthias.
Comment 50 Michel Dänzer 2007-07-06 08:10:00 UTC
*** Bug 11485 has been marked as a duplicate of this bug. ***

