20704 – memory leak: Keep resizing glxgears window with compiz will make X hang

Status:	VERIFIED FIXED

Product:	xorg
Classification:	Unclassified
Component:	Server/General
Version:	git
Hardware:	Other Linux (All)

Importance:	high major
Assignee:	Shuang He

Reported:	2009-03-16 22:04 UTC by Shuang He
Modified:	2009-06-21 19:29 UTC
CC List:	11 users

Attachments
xorg log (23.04 KB, text/plain) 2009-03-16 22:07 UTC, Shuang He
dmesg after X hang (122.50 KB, text/plain) 2009-03-16 22:10 UTC, Shuang He
don't clear pDraw until after unref (348 bytes, patch) 2009-03-24 15:44 UTC, Jesse Barnes
fixup GLX drawable management (2.13 KB, patch) 2009-03-26 15:39 UTC, Jesse Barnes
leak fix (696 bytes, patch) 2009-03-27 17:43 UTC, Jesse Barnes
full fix (10.72 KB, patch) 2009-04-09 10:38 UTC, Kristian Høgsberg
Buffers created for fb should be released when destroy drawable (1.59 KB, patch) 2009-06-01 00:20 UTC, Shuang He
Buffers created for fb should be released when destroy drawable (5.67 KB, patch) 2009-06-01 00:21 UTC, Shuang He
glXReleaseTexImageEXT should release reference to storage for the pixmap (5.67 KB, patch) 2009-06-01 00:25 UTC, Shuang He

Description Shuang He 2009-03-16 22:04:11 UTC

System Environment:
--------------------------
Libdrm:		(master)2e2e8575b1ed4703653a72ac2b60b75316c388d7
Mesa:	      (mesa_7_4_branch)119360cccdd49475eed67dde6344bf9f9904bc1b		
Xserver:	(server-1.6-branch)60c161545af80eb78eb790a05bde79409dfdf16e
Xf86_video_intel:(2.7)490cb578aef761e3fdd0a559bec36cdab96e6b2a
Kernel:         (for-airlied)dc529a4fe1ae4667c819437a94185e8581e1e680


Bug detailed description:
-------------------------
With compiz enabled, repeatedly resizing the glxgears window makes memory usage (as reported by 'top') grow steadily. Eventually the system starts to swap, and shortly afterwards X stops responding and complains:

Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664
Execbuffer fails to pin. Estimate: 8470528. Actual: 10534912. Available: 240881664

Steps to reproduce:
------------------------
1. startx
2. enable compiz
3. run glxgears
4. keep resizing the glxgears window

Comment 1 Shuang He 2009-03-16 22:07:19 UTC

Created attachment 23950
xorg log

Comment 2 Shuang He 2009-03-16 22:10:14 UTC

Created attachment 23951
dmesg after X hang

Comment 3 Gordon Jin 2009-03-16 23:15:09 UTC

This is for moblin bug 1218.

Comment 4 Shuang He 2009-03-18 18:02:09 UTC

From what I hear from Moblin, they are hitting a more serious memory leak (they reproduce it after resizing glxgears for just a few minutes, whereas it took me about half an hour), and it is affecting them a lot.
So please treat this one as highest priority.

Comment 5 Jesse Barnes 2009-03-24 13:53:15 UTC

Looks like the DRI2 buffers aren't getting freed.  At resize time we get several calls:
indirect create drawable                                                        
DRI2CreateDrawable: new drawable, size 328x81                                   
DRI2GetBuffers, buffers = (nil), size 328x81, count 0                           
indirect drawable destroy 308x86                                                
indirect drawable destroy 300x300                                               
indirect create drawable                                                        
DRI2CreateDrawable: new drawable, size 622x498                                  
DRI2GetBuffers, buffers = (nil), size 622x498, count 0                          
indirect create drawable                                                        
DRI2CreateDrawable: new drawable, size 650x81                                   
indirect drawable destroy 328x81                                                
DRI2GetBuffers, buffers = (nil), size 650x81, count 0                           

But the __glXDRIdrawableDestroy doesn't end up calling the DRI2 destroy function because pDraw is NULL (seems like it shouldn't be).  I'm tracing it more now to see if I can find the root cause.

Comment 6 Jesse Barnes 2009-03-24 15:44:06 UTC

Created attachment 24216
don't clear pDraw until after unref

Can you give this server patch a try?  I'm not too familiar with the GLX internals, but it looks like we're clearing pDraw of the GLX drawable too soon, which prevents the DRI2 destroy drawable routine from actually freeing the associated DRI2 buffers...  Seems to do the right thing in my light testing, but I didn't do a 30 min test like you did. :)
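
A minimal sketch of the suspected failure mode, using hypothetical toy types rather than the real GLX or DRI2 structures (every name starting with toy_ is an assumption made for illustration). It only shows why a destroy hook that is guarded by a NULL check can never release buffers once pDraw has been cleared ahead of the unref:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real X server structures. */
struct toy_drawable {
    int buffers_allocated;          /* buffers still attached to the drawable */
};

struct toy_glx_drawable {
    struct toy_drawable *pDraw;     /* backing drawable, NULL once detached */
    int refCount;
};

/* Destroy hook: it can only release buffers if it still knows the drawable. */
static void toy_destroy(struct toy_glx_drawable *glx)
{
    if (glx->pDraw != NULL)
        glx->pDraw->buffers_allocated = 0;
}

static void toy_unref(struct toy_glx_drawable *glx)
{
    assert(glx->refCount > 0);
    if (--glx->refCount == 0)
        toy_destroy(glx);
}

/* Detach first, then unref: the destroy hook sees NULL and skips the free.
 * Returns the number of buffers left behind. */
int toy_gone_clear_first(struct toy_drawable *draw)
{
    struct toy_glx_drawable glx = { draw, 1 };

    glx.pDraw = NULL;
    toy_unref(&glx);
    return draw->buffers_allocated;
}

/* Unref first, then detach: the destroy hook still sees the drawable. */
int toy_gone_unref_first(struct toy_drawable *draw)
{
    struct toy_glx_drawable glx = { draw, 1 };

    toy_unref(&glx);
    glx.pDraw = NULL;
    return draw->buffers_allocated;
}
```

With a drawable that starts with three buffers, the first ordering leaves all three allocated and the second leaves none. The toy ignores the case where other references keep the GLX drawable alive.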

Comment 7 Shuang He 2009-03-24 20:00:16 UTC

(In reply to comment #6)

This patch works for me. Thanks Jesse

Comment 8 Shuang He 2009-03-24 23:17:42 UTC

(In reply to comment #7)

Oh, this patch introduces a new issue: resizing the window just a little may crash X.
Here's the backtrace:
(gdb) bt
#0  0xb7fd5424 in __kernel_vsyscall ()
#1  0x03155660 in raise () from /lib/libc.so.6
#2  0x03157028 in abort () from /lib/libc.so.6
#3  0x031925bd in __libc_message () from /lib/libc.so.6
#4  0x031987e4 in malloc_printerr () from /lib/libc.so.6
#5  0x0319c441 in _int_realloc () from /lib/libc.so.6
#6  0x0319d176 in realloc () from /lib/libc.so.6
#7  0x08131002 in Xrealloc (ptr=0x6, amount=0) at utils.c:1133
#8  0x0806d10b in dixAllocatePrivate (privates=0x91487c8, key=0xb7e90a3c)
    at privates.c:129
#9  0x0806d1cc in dixSetPrivate (privates=0x91487c8, key=0xb7e90a3c, val=0x0)
    at privates.c:193
#10 0xb7e8eca1 in DRI2DestroyDrawable (pDraw=0x91487b0) at dri2.c:218
#11 0xb7eee668 in __glXDRIdrawableDestroy (drawable=0x9205ff0) at glxdri2.c:108
#12 0xb7ee49bb in __glXUnrefDrawable (glxPriv=0x9205ff0) at glxutil.c:58
#13 0xb7ee3183 in DrawableGone (glxPriv=0x9205ff0, xid=12583326)
    at glxext.c:131
#14 0x0806efdc in FreeResource (id=12583326, skipDeleteFuncType=0)
    at resource.c:561
#15 0xb7edffa6 in DoDestroyDrawable (cl=<value optimized out>,
    glxdrawable=12583326, type=1) at glxcmds.c:1225
#16 0xb7ee340a in __glXDispatch (client=0x8d79db8) at glxext.c:523
#17 0x080874cf in Dispatch () at dispatch.c:437
#18 0x0806c69d in main (argc=2, argv=0xbf9d2754, envp=Cannot access memory at address 0xbde
) at main.c:397

Comment 9 Jesse Barnes 2009-03-26 15:39:06 UTC

Created attachment 24296
fixup GLX drawable management

I think this is a more complete fix; I'm still waiting on review from some X people.

Comment 10 Jesse Barnes 2009-03-27 17:43:30 UTC

Created attachment 24328
leak fix

Ok hope this is the last one.  Please test.

Comment 11 Sergejs Pisarenko 2009-03-28 08:41:02 UTC

The last leak fix does not work for me...

Backtrace:
0: /usr/bin/X(xorg_backtrace+0x3c) [0x81347dc]
1: /usr/bin/X(xf86SigHandler+0x52) [0x80d4fe2]
2: [0xb8071400]
3: /usr/bin/X(dixSetPrivate+0x5c) [0x8070a8c]
4: /usr/lib/xorg/modules/extensions//libdri2.so(DRI2DestroyDrawable+0x9f) [0xb7$
5: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a65fd8]
6: /usr/lib/xorg/modules/extensions//libglx.so(__glXUnrefDrawable+0x47) [0xb7a5$
7: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a5a630]
8: /usr/bin/X(FreeResource+0x114) [0x8072a94]
9: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a57404]
10: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a5a8c2]
11: /usr/bin/X(Dispatch+0x34f) [0x808b53f]
12: /usr/bin/X(main+0x3bd) [0x806ff8d]
13: /lib/libc.so.6(__libc_start_main+0xe1) [0xb7b9b621]
14: /usr/bin/X [0x806f411]

Fatal server error:
Caught signal 11.  Server aborting

Comment 12 Shuang He 2009-03-29 18:09:42 UTC

(In reply to comment #10)
> Created an attachment (id=24328) [details]
> leak fix
> 
> Ok hope this is the last one.  Please test.
> 

I get the same backtrace as in comment #8.

Comment 13 Shuang He 2009-03-29 20:24:14 UTC

(In reply to comment #12)

I debugged this a bit. Here is the series of calls in DRI2DestroyDrawable when X crashed:
in (*ds->DestroyBuffers)(pDraw, pPriv->buffers, pPriv->bufferCount);
  Xfree: free(0x9eef330)
  Xfree: free(0x9eeef20)
  Xfree: free(0x9efdde0)
  Xfree: free(0x9efce08)
  Xfree: free(0x9eee8b0)
  Xrealloc: ptr = 0x9efaf20
  Xrealloc: amount = 384
  Xfree: free(0x9efcd18)
  Xfree: free(0x9ef8468)
  Xrealloc: ptr = 0x9efa278
  Xrealloc: amount = 384
  Xfree: free(0x9efcd18)
  Xfree: free(0x9ef9808)
  Xfree: free(0x9eeef38)
  Xfree: free(0x9efa648)
  Xfree: free(0x9efd788)
in dixSetPrivate(&pPixmap->devPrivates, dri2PixmapPrivateKey, NULL);
  Xrealloc: ptr = 0x9efce08
  Xrealloc: amount = 512

So dixSetPrivate is trying to realloc memory at 0x9efce08, which was already freed in DestroyBuffers. So maybe we should also do this:
diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 0f2e24b..dddcfdc 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -204,9 +204,6 @@ DRI2DestroyDrawable(DrawablePtr pDraw)
     if (pPriv->refCount > 0)
        return;

-    (*ds->DestroyBuffers)(pDraw, pPriv->buffers, pPriv->bufferCount);
-    xfree(pPriv);
-
     if (pDraw->type == DRAWABLE_WINDOW)
     {
        pWin = (WindowPtr) pDraw;
@@ -217,6 +214,9 @@ DRI2DestroyDrawable(DrawablePtr pDraw)
        pPixmap = (PixmapPtr) pDraw;
        dixSetPrivate(&pPixmap->devPrivates, dri2PixmapPrivateKey, NULL);
     }
+
+    (*ds->DestroyBuffers)(pDraw, pPriv->buffers, pPriv->bufferCount);
+    xfree(pPriv);
 }

 Bool
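
The reordering in the diff above can be sketched with a toy model. This is an illustration under assumptions, not the server code: toy_pixmap, its freed flag and the helper names are hypothetical, and the flag stands in for storage that the trace shows being freed before dixSetPrivate touches it. A real use-after-free is undefined behaviour, so the sketch only records that one would have happened:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pixmap: a flag marks when its storage has been released. */
struct toy_pixmap {
    bool freed;             /* set once the pixmap storage is gone */
    void *dri2_private;     /* per-pixmap private slot */
};

/* Returns false if the caller touched a pixmap that was already released. */
static bool toy_set_private(struct toy_pixmap *p, void *val)
{
    if (p->freed)
        return false;       /* in the real server: use-after-free */
    p->dri2_private = val;
    return true;
}

static void toy_destroy_buffers(struct toy_pixmap *p)
{
    assert(!p->freed);      /* guard against a double release */
    p->freed = true;
}

/* Original order: release the buffers, then clear the private. */
bool toy_destroy_buffers_first(struct toy_pixmap *p)
{
    toy_destroy_buffers(p);
    return toy_set_private(p, NULL);
}

/* Proposed order: clear the private while the pixmap is still valid. */
bool toy_clear_private_first(struct toy_pixmap *p)
{
    bool ok = toy_set_private(p, NULL);

    toy_destroy_buffers(p);
    return ok;
}
```

Only the second ordering touches the private slot while the pixmap is still alive, which matches the intent of moving DestroyBuffers and xfree(pPriv) to the end of the function.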

Comment 14 Mingming 2009-03-30 08:54:36 UTC

I applied the patches by Jesse Barnes, and X crashed after some time.
Then I tried the patch by Shuang He; it did not crash, but VT switching no longer works. (I'm not sure whether that is caused by the patch.)

The number of "drm mm object" still increases all the time.

And after 6h30m, sudo lsof | grep "drm mm object" | wc -l shows 14994.
And I got my /proc/dri/0/gem_objects:

18534 objects
1451909120 object bytes
4 pinned
12681216 pin bytes
247177216 gtt bytes
260313088 gtt total


Intel GM965
xf86-video-intel: 2.6.99
libdrm: 2.4.5
kernel: 2.6.29 with KMS enabled
mesa: 7.4rc1

Comment 15 Shuang He 2009-03-30 18:03:03 UTC

(In reply to comment #14)

Jesse's patch in comment #10 and mine in comment #13 should be applied at the same time.

Comment 16 Jesse Barnes 2009-03-30 18:30:08 UTC

This is an X server bug.

Comment 17 Mingming 2009-03-31 05:32:44 UTC

(In reply to comment #15)

ok, I applied both patches. Now after 2h50m, I got:

sudo lsof | grep "drm mm object" | wc -l
5248

and /proc/dri/0/gem_objects:
13116 objects
1109676032 object bytes
4 pinned
12681216 pin bytes
186478592 gtt bytes
260313088 gtt total
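
For comparing dumps like the ones above, a small parser can help. The following is a sketch that assumes the "<number> <label>" layout seen in the gem_objects output quoted in this thread; the struct and function names are hypothetical, and it reads from a string so that it does not depend on the proc file being present:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The four fields tracked here; "pinned" and "pin bytes" are ignored. */
struct gem_stats {
    unsigned long long objects;
    unsigned long long object_bytes;
    unsigned long long gtt_bytes;
    unsigned long long gtt_total;
};

/* Parse "<number> <label>" lines from text. Returns how many of the four
 * tracked fields were found; unknown labels are skipped. */
int parse_gem_objects(const char *text, struct gem_stats *out)
{
    int found = 0;

    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    while (*text != '\0') {
        size_t len = strcspn(text, "\n");   /* length of the current line */
        char line[128];
        char label[32];
        unsigned long long value;

        if (len < sizeof line) {
            memcpy(line, text, len);
            line[len] = '\0';
            if (sscanf(line, "%llu %31[a-z ]", &value, label) == 2) {
                if (strcmp(label, "objects") == 0) {
                    out->objects = value; found++;
                } else if (strcmp(label, "object bytes") == 0) {
                    out->object_bytes = value; found++;
                } else if (strcmp(label, "gtt bytes") == 0) {
                    out->gtt_bytes = value; found++;
                } else if (strcmp(label, "gtt total") == 0) {
                    out->gtt_total = value; found++;
                }
            }
        }
        text += len;
        if (*text == '\n')
            text++;
    }
    return found;
}

/* Integer percentage of the GTT aperture in use; 0 if the total is unknown. */
unsigned gtt_percent_used(const struct gem_stats *s)
{
    if (s->gtt_total == 0)
        return 0;
    return (unsigned)(s->gtt_bytes * 100 / s->gtt_total);
}
```

Fed the dump from this comment, the parser finds all four fields and reports the GTT as 71 percent used. Comparing the objects count between two dumps taken some time apart gives a rough growth figure.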

Comment 18 Sergejs Pisarenko 2009-04-02 11:19:07 UTC

(In reply to comment #17)
With these patches it is better indeed but, as mentioned, the memory usage is still abnormally high for X.

Comment 19 Jesse Barnes 2009-04-06 14:58:24 UTC

FYI, latest fix is from Michel (see below), hopefully it will be pushed soon.


diff --git a/glx/glxext.c b/glx/glxext.c
index c882372..e74d00e 100644
--- a/glx/glxext.c
+++ b/glx/glxext.c
@@ -119,17 +119,25 @@ static int ContextGone(__GLXcontext* cx, XID id)
 static Bool DrawableGone(__GLXdrawable *glxPriv, XID xid)
 {
     ScreenPtr pScreen = glxPriv->pDraw->pScreen;
+    PixmapPtr pPixmap = NULL;
+    int refcount;
 
     switch (glxPriv->type) {
 	case GLX_DRAWABLE_PIXMAP:
 	case GLX_DRAWABLE_PBUFFER:
-	    (*pScreen->DestroyPixmap)((PixmapPtr) glxPriv->pDraw);
+	    pPixmap = (PixmapPtr) glxPriv->pDraw;
 	    break;
     }
 
-    glxPriv->pDraw = NULL;
-    glxPriv->drawId = 0;
+    refcount = glxPriv->refCount;
     __glXUnrefDrawable(glxPriv);
+    if (refcount > 1) {
+	glxPriv->pDraw = NULL;
+	glxPriv->drawId = 0;
+    }
+    
+    if (pPixmap)
+	(*pScreen->DestroyPixmap)(pPixmap);
 
     return True;
 }
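
The patch above snapshots the reference count before calling __glXUnrefDrawable and only touches glxPriv afterwards if another reference is still holding it. A toy version of that pattern, with hypothetical names and a plain malloc/free object standing in for the GLX drawable, might look like this:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct toy_obj {
    int refCount;
    void *pDraw;
};

/* Allocate an object holding refs references; returns NULL on failure. */
struct toy_obj *toy_new(int refs, void *draw)
{
    struct toy_obj *o = malloc(sizeof *o);

    if (o != NULL) {
        o->refCount = refs;
        o->pDraw = draw;
    }
    return o;
}

/* Drop one reference; the object is freed when the count reaches zero. */
static void toy_unref(struct toy_obj *o)
{
    assert(o->refCount > 0);
    if (--o->refCount == 0)
        free(o);
}

/* Returns 1 if the object survived and was detached, 0 if it was freed.
 * The count is read BEFORE the unref, because o may not exist after it. */
int toy_gone(struct toy_obj *o)
{
    int refcount = o->refCount;

    toy_unref(o);
    if (refcount > 1) {
        o->pDraw = NULL;    /* safe: another reference keeps o alive */
        return 1;
    }
    return 0;               /* o was freed; it must not be touched */
}
```

An object with a single reference is freed and left alone, while an object with two references survives with pDraw cleared and one reference remaining.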

Comment 20 Sergejs Pisarenko 2009-04-07 03:01:41 UTC

(In reply to comment #19)

Should this patch be applied together with other patches? Patching fails on Xorg 1.6.

Comment 21 Mingming 2009-04-08 05:04:36 UTC

(In reply to comment #19)

This is what I get with this patch. After running 2h55m:

sudo lsof | grep "drm mm object" | wc -l
3695

cat /proc/dri/0/gem_objects 
13819 objects
1520975872 object bytes
4 pinned
13828096 pin bytes
244764672 gtt bytes
260313088 gtt total

Comment 22 Kristian Høgsberg 2009-04-09 10:38:14 UTC

Created attachment 24684
full fix

This patch should fix the leak in all the different drawable destroy combinations.  I'll push it as soon as I get a bit of positive feedback.

Comment 23 Colin Guthrie 2009-04-09 11:38:08 UTC

Thanks Kristian!

I don't suppose you (or anyone else) has a version of this fix that would apply on the 1.6 branch?

Comment 24 Shuang He 2009-04-09 18:42:45 UTC

(In reply to comment #22)
> Created an attachment (id=24684) [details]
> full fix
> 
> This patch should fix the leak in all the different drawable destroy
> combinations.  I'll push it as soon as I get a bit of positive feedback.
> 

I just played with it for half an hour; it works well, and I didn't see the memory leak.
Thanks, Kristian.

Comment 25 Li Peng 2009-04-09 19:06:00 UTC

Kristian, can you also push this fix to the 1.6 branch?

Comment 26 Mingming 2009-04-10 05:54:43 UTC

(In reply to comment #22)
> Created an attachment (id=24684) [details]
> full fix
> 
> This patch should fix the leak in all the different drawable destroy
> combinations.  I'll push it as soon as I get a bit of positive feedback.
> 

No luck for me. Memory usage still increases continuously.
After 1h30m:

$ cat /proc/dri/0/gem_objects 
11499 objects
627249152 object bytes
4 pinned
13828096 pin bytes
120881152 gtt bytes
260313088 gtt total

$ sudo lsof | grep "drm mm object" | wc -l
4057

Is that normal?

Comment 27 Yang Zhe 2009-04-11 07:14:50 UTC

I applied the patch to the xorg 1.6.0 source (with some manual modifications to make it apply) and rebuilt it, but I see only some improvement. When I keep maximizing and restoring a window, memory usage keeps rising, and eventually all memory and swap are used up.

Comment 28 Yang Zhe 2009-04-11 07:19:11 UTC

Oh, I'm sorry: after a few minutes, memory usage comes back to normal...

Is this how it is supposed to work?

Comment 29 Shuang He 2009-04-11 07:24:30 UTC

(In reply to comment #28)
> Oh, I'm sorry, after some minutes, the memory usage come backs to normal...
> 
> Will this work exactly this?
> 

It shouldn't use up all system memory if you're only maximizing and restoring a window. Kristian's patch is against the xserver master branch; could you give that a try?

Comment 30 Jesse Barnes 2009-04-13 10:39:29 UTC

Kristian pushed the fix.

commit 7b6400a1b8d2f228fcbedf17c30a7e3924e4dd2a
Author: Kristian Høgsberg <krh@redhat.com>
Date:   Thu Apr 9 13:16:37 2009 -0400

    glx: Fix drawable private leak on destroy

Comment 31 Shuang He 2009-04-16 01:41:31 UTC

(In reply to comment #28)
> Oh, I'm sorry, after some minutes, the memory usage come backs to normal...
> 
> Will this work exactly this?
> 


I think I now see exactly what you ran into.
Overall, it seems system memory is not freed at the right time. So if we keep resizing the window for a few minutes, it uses up all system memory, and finally the GPU hangs. So there might be some issue with the BO cache policy.

And here's what I saw:
With KMS/UXA/DRI2/compiz, a few resizes make memory usage grow rapidly.
It reaches 1092860k of system memory in use; cat /proc/dri/0/gem_objects shows:
2732 objects
552259584 object bytes
4 pinned
16977920 pin bytes
189448192 gtt bytes
234885120 gtt total

Then, if I leave it idle for a few seconds, it seems to start freeing some memory,
so it uses 859872k of system memory; cat /proc/dri/0/gem_objects shows:
3011 objects
276971520 object bytes
4 pinned
16977920 pin bytes
181391360 gtt bytes
234885120 gtt total


And if I start resizing the window again, it reaches 1440524k of system memory in use, and cat /proc/dri/0/gem_objects shows:
3394 objects
832200704 object bytes
4 pinned
16977920 pin bytes
179752960 gtt bytes
234885120 gtt total

And then, if I keep resizing the window for a few minutes, it uses all system memory, and then the GPU hangs.

Comment 32 Shuang He 2009-04-16 04:07:29 UTC

OK, I have caught the following leak on an Aspire One with KMS/UXA/DRI2/compiz. It means 326 BOs are lost in the composite extension (the '0 bytes' comes from a trick I used with valgrind to catch this kind of leak):

==14839== 0 bytes in 326 blocks are definitely lost in loss record 1 of 124
==14839==    at 0x55AA62F: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:428)
==14839==    by 0x55A61C3: drm_intel_bo_alloc_for_render (intel_bufmgr.c:58)
==14839==    by 0x55608B6: i830_uxa_create_pixmap (i830_exa.c:976)
==14839==    by 0x8138E3B: compNewPixmap (compalloc.c:478)
==14839==    by 0x81392D4: compAllocPixmap (compalloc.c:556)
==14839==    by 0x8138939: compCheckRedirect (compwindow.c:161)
==14839==    by 0x8138A2F: compRealizeWindow (compwindow.c:242)
==14839==    by 0x806F9EB: RealizeTree (window.c:2605)
==14839==    by 0x807179A: MapWindow (window.c:2698)
==14839==    by 0x8139A9D: compRedirectWindow (compalloc.c:179)
==14839==    by 0x8139D6C: compRedirectSubwindows (compalloc.c:320)
==14839==    by 0x81368C1: ProcCompositeRedirectSubwindows (compext.c:172)

Comment 33 Shuang He 2009-04-16 20:01:53 UTC

To summarize:
There seem to be two symptoms:
1. System memory used by the graphics driver keeps growing over a few resize operations, then drops dramatically, then grows again, then drops again, and so on. If the window is resized many times in a very short time, it consumes all system memory and X becomes unresponsive.
2. A serious memory leak with composite, which makes the graphics driver consume all system memory.
On my Q35, I see neither of them.
On G45 and GM45, I see <1>; disabling buffer reuse doesn't help here. It's described in comment #31.
On the Aspire One, I see <2>; disabling buffer reuse doesn't help here. It's described in comment #32.

Comment 34 Shuang He 2009-04-23 07:45:55 UTC

It seems pixmaps get their refcnt incremented during I830DRI2CreateBuffers, but it is not decremented correspondingly. I haven't tried this out, but I hope it helps:

diff --git a/src/i830_dri.c b/src/i830_dri.c
index 6a32492..633895b 100644
--- a/src/i830_dri.c
+++ b/src/i830_dri.c
@@ -1618,11 +1618,20 @@ I830DRI2DestroyBuffers(DrawablePtr pDraw, DRI2BufferPtr buffers, int count)
 {
     ScreenPtr pScreen = pDraw->pScreen;
     I830DRI2BufferPrivatePtr private;
+    PixmapPtr pDepthPixmap = NULL;
     int i;

     for (i = 0; i < count; i++)
     {
        private = buffers[i].driverPrivate;
+       if (buffers[i].attachment == DRI2BufferFrontLeft ||
+           buffers[i].attachment == DRI2BufferStencil && pDepthPixmap) {
+           private->pPixmap->refcnt--;
+       }
+
+       if (buffers[i].attachment == DRI2BufferDepth)
+           pDepthPixmap = private->pPixmap;
+
        (*pScreen->DestroyPixmap)(private->pPixmap);
     }
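
One detail of the condition in the patch above: C gives && higher precedence than ||, so it groups as FrontLeft || (Stencil && pDepthPixmap). The thread does not say which grouping was intended. The sketch below spells out both readings with hypothetical enum names (not the real DRI2 constants) to show where they differ:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DRI2 attachment constants. */
enum toy_attachment { TOY_FRONT_LEFT, TOY_DEPTH, TOY_STENCIL };

/* Grouping the compiler applies to `a || b && c`: && binds tighter. */
bool toy_as_parsed(enum toy_attachment a, bool have_depth)
{
    return a == TOY_FRONT_LEFT || (a == TOY_STENCIL && have_depth);
}

/* The other possible reading, which needs a depth pixmap in both cases. */
bool toy_other_reading(enum toy_attachment a, bool have_depth)
{
    return (a == TOY_FRONT_LEFT || a == TOY_STENCIL) && have_depth;
}

/* The two readings disagree for a front-left buffer seen before any
 * depth pixmap has been recorded. */
void toy_demo_difference(void)
{
    assert(toy_as_parsed(TOY_FRONT_LEFT, false) !=
           toy_other_reading(TOY_FRONT_LEFT, false));
}
```

Adding explicit parentheses to the condition would remove the ambiguity for readers and silence the compiler warning that some toolchains emit for mixed || and &&.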

Comment 35 Mingming 2009-04-23 09:26:36 UTC

This patch does not work for me. Memory usage still increases.
And compiz leads to an X crash.

Comment 36 Shuang He 2009-04-23 20:56:40 UTC

I've tried the following code configuration:
Kernel_version:		2.6.29.1
Libdrm:		(master)412d370b9ae4b2882691863a1c5e13a507574e92
Mesa:		(mesa_7_4_branch)e8807a14a61a0b9389aa2f2a113da24ab22a364d
Xserver:	(server-1.6-branch)11db545a86c8933c638a0bc1fcd4f2c65279f617
Xf86_video_intel:		(2.7)296a986e5258e2fd13ec494071b7063bd639cd68
Kernel:         (qa-branch)ba1d2a9be507cda299c15740ff7e2bb3705a4792

On the Aspire One, 110+ MB of memory is used before starting X, 380+ MB after starting the desktop with compiz, and 390+ MB after resizing windows for 10 minutes. If X is then killed with SIGTERM, 210+ MB is used.

On GM45, with the same code, I still see issue 1 from comment #33.

Comment 37 Mingming 2009-04-24 02:46:25 UTC

Shuang He,

With your patch, when I start compiz, X crashes:

Backtrace:
0: /usr/X11R6/bin/X(xorg_backtrace+0x28) [0x4a3a48]
1: /usr/X11R6/bin/X [0x431e3d]
2: /lib/libpthread.so.0 [0x7f6aa986e080]
3: /usr/lib/libdrm_intel.so.1(drm_intel_bo_flink+0) [0x7f6aa5be7840]
4: /usr/lib/xorg/modules/drivers/intel_drv.so [0x7f6aa5e4cd48]
5: /usr/lib/xorg/modules/extensions/libdri2.so(DRI2GetBuffers+0x10e) [0x7f6aa609827e]
6: /usr/lib/xorg/modules/extensions/libdri2.so [0x7f6aa60987bd]
7: /usr/X11R6/bin/X [0x48e8e4]
8: /usr/X11R6/bin/X [0x42951d]
9: /lib/libc.so.6(__libc_start_main+0xe6) [0x7f6aa7bc05a6]
10: /usr/X11R6/bin/X [0x428969]
Segmentation fault at address 0x20

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting

My configuration:
kernel:  2.6.30-rc3
drm:  (master)412d370b9ae4b2882691863a1c5e13a507574e92
mesa:  (master)ff71587b27beaf288d535e14c75e58425d7efc7a
xserver: (master)0dfb97f15f591f85e079f5829c77d0c328d00464
xf86-video-intel: (master)106e4b44c5af6552cbd079c4ec34def9dcfb168a


Comment 38 Shuang He 2009-04-24 08:33:14 UTC

Oh, sorry for not making this clear.
What I mean is: with the configuration I listed, and without any other patch, I no longer see a serious leak on the Aspire One. Could you help try that?


Comment 39 Mingming 2009-04-24 11:50:55 UTC

Sorry. I'll try to make my point clear. :)

With your patch, I cannot start compiz. And without compiz, I can hardly
tell if this patch fixes the leakage, because the leakage is noticeable only
when compiz is enabled. Maybe I'm in another situation.

BTW, I notice that in the file /proc/dri/0/gem_objects, the number of
objects always increases, even if I close all the windows. Unless I restart
X, this number never decreases. Is that normal? Is it related to the bug
discussed here?
Thanks for your help.



Comment 40 Shuang He 2009-04-24 19:41:10 UTC

I already know that my patch does not work.
With the configuration I mentioned in comment #36, __without_any_patch__ (don't apply my patch; it doesn't fix the problem), I no longer see the leak.
And yes, it is the bug discussed here.

Thanks
  --Shuang
(In reply to comment #39)
> Sorry. I'll try to make my point clear. :)
> With your patch, I cannot start compiz. And without compiz, I can hardly
> tell if this patch fixes the leakage, because the leakage is noticeable only
> when compiz is enabled. Maybe I'm in another situation.
> BTW, I notice that in the file /proc/dri/0/gem_objects, the number of
> objects always increases, even if I close all the windows. Unless I restart
> X, this number never decreases. Is that normal? Is it related to the bug
> discussed here?
> Thanks for your help.

Comment 41 Zheng Kui 2009-04-29 22:46:20 UTC

In the latest Moblin image (2009-04-30), which integrates xserver-1.6.1 with Kristian Høgsberg's patch, the issue can still be reproduced.
I then tried the following upstream components; it seems system memory is still used up after resizing the gears window for a while.

Platform -- Netbook (eepc, 945GME)
OSD -- moblin-netbook-20090428 image
Kernel -- (qa-branch)ba1d2a9be507cda299c15740ff7e2bb3705a4792
Libdrm -- (master)11b60973bca1bc9bbda44be4c695e22d28d8ca4a
Mesa -- (mesa_7_4_branch)e8807a14a61a0b9389aa2f2a113da24ab22a364d
Xserver -- (server-1.6-branch)11db545a86c8933c638a0bc1fcd4f2c65279f617
Xf86_video_intel -- (2.7)296a986e5258e2fd13ec494071b7063bd639cd68

Comment 42 Guido Iodice 2009-05-21 18:27:27 UTC

I can confirm the bug on my hardware/software configuration:

card: i915 gm
os: Ubuntu 9.04 *
kernel: linux 2.6.30 rc6 vanilla
xorg driver version: 2.7.1 stable
libdrm-intel: 2.4.9
mesa: 7.4.1

* On Ubuntu Jaunty I installed the packages from the xorg-update PPA and Mesa from the karmic repositories; this does not solve the issue.

Comment 43 Shuang He 2009-05-30 17:27:42 UTC

For symptom 1 described in comment #33:
(In reply to comment #33)
> To summarize:
> There seems two symptoms:
> 1. systems memories used by graphics driver will keep growing for a few times
> of resize operation, then drops dramatically, then grow again, and drops again
> ... If resize many times in very short time, it will consume all system memory
> and get X not resposible.
> 2. serious memory leak with composite, which will make graphics driver
> comsuming all system memory.
> on my Q35, I see neither of them
> on G45 and GM45, I see <1>, disable buffer reuse doesn't help here. it's
> desribed in comment #31
> on aspire one, I see <2>, disable buffer reuse doesn't help here. it's desribed
> in comment #32
> 

Symptom <1> appears to be a result of the 965 state cache.
I tracked it with valgrind (using VALGRIND_PRINTF_BACKTRACE). The following is one of the buffer objects I checked: buffer object 474 is allocated when a window is created, and it is not deleted until much later, in brw_clear_cache.

**6022** shuang 443 alloc: handle=474, size=256 KB
==6022==    at 0x5511171: VALGRIND_PRINTF_BACKTRACE (valgrind.h:3695)
==6022==    by 0x5511CF8: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:437)
==6022==    by 0x550D473: drm_intel_bo_alloc_for_render (intel_bufmgr.c:58)
==6022==    by 0x52978AE: intel_region_alloc (intel_regions.c:173)
==6022==    by 0x52968A1: intel_miptree_create (intel_mipmap_tree.c:122)
==6022==    by 0x52B761B: intelTexImage (intel_tex_image.c:132)
==6022==    by 0x52B80CD: intelTexImage2D (intel_tex_image.c:587)
==6022==    by 0x5370B19: _mesa_TexImage2D (teximage.c:2676)
==6022==    by 0x45A021D: (within /usr/lib/tmp/libclutter-glx-0.9.so.0.903.0)
==6022==    by 0x45A04E8: cogl_texture_new_from_data (in /usr/lib/tmp/libclutter-glx-0.9.so.0.903.0)
==6022==    by 0x80A8F2F: (within /usr/bin/metacity)
==6022==    by 0x80A9112: (within /usr/bin/metacity)

**6022** shuang 6130 delete: handle=474, size=256 KB
==6022==    at 0x5511171: VALGRIND_PRINTF_BACKTRACE (valgrind.h:3695)
==6022==    by 0x551131E: drm_intel_gem_bo_unreference_locked (intel_bufmgr_gem.c:573)
==6022==    by 0x55112AD: drm_intel_gem_bo_unreference_locked (intel_bufmgr_gem.c:582)
==6022==    by 0x55112AD: drm_intel_gem_bo_unreference_locked (intel_bufmgr_gem.c:582)
==6022==    by 0x5511791: drm_intel_gem_bo_unreference (intel_bufmgr_gem.c:621)
==6022==    by 0x550D4B5: drm_intel_bo_unreference (intel_bufmgr.c:73)
==6022==    by 0x52D0B2A: brw_clear_cache (brw_state_cache.c:501)
==6022==    by 0x52D8C8C: brw_note_unlock (brw_vtbl.c:184)
==6022==    by 0x528C43B: UNLOCK_HARDWARE (intel_context.c:1078)
==6022==    by 0x52C58EA: brw_draw_prims (brw_draw.c:417)
==6022==    by 0x538E59B: vbo_exec_DrawRangeElements (vbo_exec_array.c:435)
==6022==    by 0x5382EB9: neutral_DrawRangeElements (vtxfmt_tmp.h:343)

If I reduce the limit on cached items, this symptom disappears:

diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c b/src/mesa/drivers/dri/i965/brw_state_cache.c
index e40d7a0..0afb7af 100644
--- a/src/mesa/drivers/dri/i965/brw_state_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
@@ -527,10 +527,10 @@ brw_state_cache_check_size(struct brw_context *brw)
    /* un-tuned guess.  We've got around 20 state objects for a total of around
     * 32k, so 1000 of them is around 1.5MB.
     */
-   if (brw->cache.n_items > 1000)
+   if (brw->cache.n_items > 100)
       brw_clear_cache(brw, &brw->cache);

-   if (brw->surface_cache.n_items > 1000)
+   if (brw->surface_cache.n_items > 100)
       brw_clear_cache(brw, &brw->surface_cache);
 }
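
To illustrate why a size-triggered clear produces the grow-then-drop pattern of symptom <1>, here is a minimal, self-contained sketch. It is not the Mesa code: struct toy_cache and the toy_cache_* functions are hypothetical names, and it assumes that every cached item pins one buffer of a fixed size until the whole cache is cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a state cache; NOT the real struct brw_cache. */
struct toy_cache {
    size_t n_items;     /* entries currently cached */
    size_t bytes_held;  /* buffer memory kept alive by those entries */
    size_t limit;       /* clear once n_items exceeds this (1000 vs. 100 in the diff) */
    size_t peak_bytes;  /* highest bytes_held observed so far */
};

/* Add one entry that keeps `bytes` of buffer memory referenced. */
static void toy_cache_insert(struct toy_cache *c, size_t bytes)
{
    assert(c != NULL);
    c->n_items++;
    c->bytes_held += bytes;
    if (c->bytes_held > c->peak_bytes)
        c->peak_bytes = c->bytes_held;
}

/* Same shape as the size check in the diff: drop everything once over the limit. */
static void toy_cache_check_size(struct toy_cache *c)
{
    assert(c != NULL);
    if (c->n_items > c->limit) {
        c->n_items = 0;
        c->bytes_held = 0;
    }
}

/* Insert `count` entries of `bytes` each, checking the size after every
 * insert, and return the peak amount of memory that was held at once. */
static size_t toy_cache_peak(size_t limit, size_t count, size_t bytes)
{
    struct toy_cache c = { 0, 0, limit, 0 };
    for (size_t i = 0; i < count; i++) {
        toy_cache_insert(&c, bytes);
        toy_cache_check_size(&c);
    }
    return c.peak_bytes;
}
```

Under these assumptions, with 256 KB per item, lowering the limit from 1000 to 100 lowers the peak from 1001 to 101 buffers' worth of memory. The real cache holds items of varying size, so this only shows the direction of the effect.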

Comment 44 Shuang He 2009-06-01 00:20:34 UTC

Created attachment 26326 [details] [review]
Buffers created for fb should be released when destroy drawable

Comment 45 Shuang He 2009-06-01 00:21:22 UTC

Created attachment 26327 [details] [review]
Buffers created for fb should be released when destroy drawable

Comment 46 Shuang He 2009-06-01 00:22:24 UTC

(In reply to comment #45)
> Created an attachment (id=26327) [details]
> Buffers created for fb should be released when destroy drawable
> 

Copy-and-paste failure; the description should be:
glXReleaseTexImageEXT should release reference to storage for the pixmap

Comment 47 Shuang He 2009-06-01 00:25:57 UTC

Created attachment 26328 [details] [review]
glXReleaseTexImageEXT should release reference to storage for the pixmap

reupload

Comment 48 Shuang He 2009-06-01 00:33:18 UTC

The patches in comment #44 and comment #47 need to be applied to Mesa together. There is also an issue in compiz. According to the GLX 1.4 spec, "however, GLXPixmaps created by call other than glXCreateGLXPixmap should not be passed to glXDestroyGLXPixmap", so compiz should use glXDestroyPixmap instead of glXDestroyGLXPixmap, since its pixmaps are created by calling glXCreatePixmap.
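
The pairing rule can be written down as a small lookup. This is only a sketch that does not call GLX itself: the enum names and matching_destroy() are hypothetical, while the four GLX entry points named in the comments are the real ones.

```c
#include <assert.h>

/* Which GLX entry point created the GLXPixmap (hypothetical tag type). */
enum glx_pixmap_creator {
    CREATED_BY_GLX_CREATE_PIXMAP,     /* glXCreatePixmap: GLX 1.3+, takes a GLXFBConfig */
    CREATED_BY_GLX_CREATE_GLX_PIXMAP  /* glXCreateGLXPixmap: legacy, takes an XVisualInfo */
};

/* Which GLX entry point must be used to destroy it. */
enum glx_pixmap_destroyer {
    USE_GLX_DESTROY_PIXMAP,           /* glXDestroyPixmap */
    USE_GLX_DESTROY_GLX_PIXMAP        /* glXDestroyGLXPixmap */
};

/* Return the destroy call that matches the call used for creation. */
static enum glx_pixmap_destroyer matching_destroy(enum glx_pixmap_creator creator)
{
    switch (creator) {
    case CREATED_BY_GLX_CREATE_PIXMAP:
        return USE_GLX_DESTROY_PIXMAP;
    case CREATED_BY_GLX_CREATE_GLX_PIXMAP:
        return USE_GLX_DESTROY_GLX_PIXMAP;
    }
    assert(!"unknown creator");
    return USE_GLX_DESTROY_PIXMAP;
}
```

A compositor that records the creator tag alongside each GLXPixmap could consult this lookup at teardown, so the two API generations are never mixed.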

Comment 49 Dark Shadow 2009-06-12 06:42:43 UTC

I applied patches 2 and 3, since 1 is already in git (correct me if I'm wrong).
Compiz became unstable and crashed when using the ring switcher plugin, and also when opening many windows, so I removed both patches again.

> so compiz should use glXDestroyPixmap instead of glXDestroyGLXPixmap,
> since their pixmaps are created by calling
> glXCreatePixmap.

I corrected the occurrences in textures.c. It seems to work OK, but I can't tell yet whether it makes any difference.


cat /proc/dri/0/gem_objects
1294 objects
153399296 object bytes
4 pinned
13770752 pin bytes
125480960 gtt bytes
260308992 gtt total

Objects increase quite a lot, but memory is not consumed that fast. Should the objects decrease after closing a window?

Using kernel-2.6.30.
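
To compare the object count before and after closing a window, two snapshots of the file can be parsed and subtracted. This is a minimal sketch that assumes the file keeps the layout pasted above (object count on the first line, object bytes on the second). struct gem_snapshot, parse_gem_objects() and gem_object_delta() are hypothetical names, and reading the file into a string is left out.

```c
#include <assert.h>
#include <stdio.h>

/* The first two values reported by /proc/dri/0/gem_objects. */
struct gem_snapshot {
    unsigned long long objects;
    unsigned long long object_bytes;
};

/* Parse the leading "<n> objects" and "<n> object bytes" lines from `text`.
 * Returns 0 on success, -1 if both numbers could not be read. */
static int parse_gem_objects(const char *text, struct gem_snapshot *out)
{
    assert(text != NULL && out != NULL);
    if (sscanf(text, "%llu objects %llu object bytes",
               &out->objects, &out->object_bytes) != 2)
        return -1;
    return 0;
}

/* Change in object count between two snapshots; positive means growth. */
static long long gem_object_delta(const struct gem_snapshot *before,
                                  const struct gem_snapshot *after)
{
    return (long long)after->objects - (long long)before->objects;
}
```

A delta that stays positive across repeated open/close cycles would suggest objects are not being released, although the thread does not establish what a normal value is.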

Comment 50 Shuang He 2009-06-12 07:38:21 UTC

(In reply to comment #49)
> Applied patches 2 and 3, as 1 is already in git (correct me if I'm wrong).
> Compiz is unstable and crashes when using the ring switcher plugin, and also
> when opening many many windows, so removed both patches again.
> 
> > so compiz should use glXDestroyPixmap instead of glXDestroyGLXPixmap,
> > since their pixmaps are created by calling
> > glXCreatePixmap.
> 
> Corrected occurences in textures.c, seems to work ok but can't tell about any
> differences yet.
> 
> 
> cat /proc/dri/0/gem_objects
> 1294 objects
> 153399296 object bytes
> 4 pinned
> 13770752 pin bytes
> 125480960 gtt bytes
> 260308992 gtt total
> 
> Objects increase quite a lot, but memory is not consumed that fast. Should the
> objects decrease after closing a window?
> 
> Using kernel-2.6.30.
> 
Thanks for testing. Could you attach the backtrace of the crash?
With those two patches and the corrected compiz, memory usage should not increase much when you keep resizing one window on 945GM.

Comment 51 Shuang He 2009-06-18 00:58:13 UTC

An updated version of the patch in comment #44 has been committed.
The patch in comment #47 has problems with some compiz plug-ins, and it is not required to fix this memory leak, but the compiz fix described in comment #48 is needed.

With the compiz fix, I can no longer reproduce this issue on 945GM. My configuration is:
Libdrm  (master)2fa2db138ba989bfa1a8cd9ab66d83fb7369249e
Mesa    (master)77506dac8e81e9548a7e9680ce367175fe5747af
Xserver         (master)14581afb474552716c02ca15220ca7050123c375
Xf86_video_intel        (master)b5cd2130f97591f4a387db1b98c940c30bc6404c
Kernel  (for-linus)0e7ddf7eeeef5aea85412120539ab5369577faeb

Comment 52 Dark Shadow 2009-06-18 02:20:45 UTC

I also updated to latest git recently (a week ago?) and do not experience the bug anymore, with the compiz fix described in comment #48 applied. I still have problems when memory usage is high, leading to pixmap corruption and finally freezing the system. I will open another bug report when I am able to reproduce this. So far, thanks for your help, everything works much better now.

Comment 53 Dark Shadow 2009-06-18 02:33:35 UTC

Perhaps I should also mention that I pulled Eric Anholt's drm branch from git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel.git
which contains quite a few changes, and I am using latest git now.

Comment 54 Colin Guthrie 2009-06-18 11:15:12 UTC

(In reply to comment #51)
> an update version of patch in comment #44 has been commited.

For packagers wanting to cherry-pick, the commit is: d027e8feff7d38cccadc6aaccc0454b21ce4dca0

Thanks for your work Shuang

Comment 55 Shuang He 2009-06-21 19:29:22 UTC

Verified
