Bug 3670

Summary: Memory leaks in python bindings
Product: dbus Reporter: Daniel Stone <daniel>
Component: coreAssignee: John (J5) Palmieri <johnp>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: high CC: sean.meiners
Version: unspecified   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Python binding memory leak test case.
A (dubious?) patch to fix the leak.

Description FreeDesktop Bugzilla Database Corruption Fix User 2005-06-30 17:17:53 UTC
There are a few memory leaks in the python bindings (pulled from CVS on June 17  
2005).  I've managed to plug a couple of them by adding destructors with *_unref calls  
to the Connection, PendingCall and Message classes (see attached patch).  However,  
there is still a leak I haven't been able track down that I can only see on the client side.  
  
Specifically: I have two apps using the python bindings.  The first is a daemon/server  
that creates a Service and waits.  The other is a client that repeatedly makes calls to  
the exposed API of the daemon which returns a block of data.  In this situation, the 
memory size of the daemon grows initially, but quickly stabilizes.  The client, however, 
continues to grow indefinitely.
Comment 1 FreeDesktop Bugzilla Database Corruption Fix User 2005-06-30 17:20:05 UTC
Created attachment 2999 [details] [review]
xf86-video-i810-1.4.0-Makefile.am.patch
Comment 2 John (J5) Palmieri 2005-07-02 14:20:31 UTC
Why aren't python bugs just assigned to me :-)
Comment 3 John (J5) Palmieri 2005-07-12 06:52:10 UTC
In CVS shortly.
Comment 4 FreeDesktop Bugzilla Database Corruption Fix User 2005-07-28 11:21:45 UTC
Created attachment 3168 [details]
pdf that includes a wrongly displayed graphic

This patch doesn't seem entirely correct to me based on my knowledge of the
dbus API, but it does in fact plug the last leak that's been plaguing my code. 
I would highly recommend testing of the other send_with_* functions as they
could very well have similar problems.	If you like I can also attach my test
cases to this bug.

PS: I could open another bug for this patch, but it seems to fall nicely under
this one (which I opened anyway).
Comment 5 FreeDesktop Bugzilla Database Corruption Fix User 2005-07-28 11:22:43 UTC
Silly bugzilla won't let me reopen this without a comment, despite the fact I just entered one 
(with the patch). 
Comment 6 John (J5) Palmieri 2005-07-29 01:37:53 UTC
My guess here is that the patch is wrong but the return message is only getting
reclaimed when we reenter the mainloop or at some point after it is being used.
 I think this will trash the stack in some situartions.  What most likely is the
problem is that the returned Message wrapper isn't being reclamed by the
refcounter or garbage collector.  If it was it would do the unref of the message
at the correct time.  Will look more into it.
Comment 7 FreeDesktop Bugzilla Database Corruption Fix User 2005-07-29 03:18:52 UTC
Created attachment 3182 [details]
X11 config file
Comment 8 FreeDesktop Bugzilla Database Corruption Fix User 2005-07-29 03:28:17 UTC
If you run the client & server tests I've attached and watch them in top you should see 
leak-client.py quickly use massive amounts of RAM until it eventually freezes and dies with a 
dbus timeout exception.  The server remains stable and can be reused with no side effects. 
 
Just for grins, I backed out the dangerous looking message_unref and re-tried the test using 
gtk.main() and a gobject.timer instead of time.sleep() and the end result was the same as 
describe above.  So I think we can safely rule out it being an issue with not returning to the 
mainloop. 
Comment 9 John (J5) Palmieri 2005-08-17 10:27:59 UTC
do you have that simple testcase?  The hard drive meltdown lost the attachment
Comment 10 Sean Meiners 2005-08-25 16:22:34 UTC
Created attachment 3036 [details]
Python binding memory leak test case.

Wow, it's lucky I happened to drop by and check the status of this bug. Since
I'm no longer marked as the reporter I don't get updates (so I just added
myself to the CC list).  Anyway, here's the test case.
Comment 11 Sean Meiners 2005-08-25 16:26:07 UTC
Created attachment 3037 [details] [review]
A (dubious?) patch to fix the leak.

Here's the patch that fixes the leak shown by the previous attachment.	As
discussed before the melt-down, it seems a bit risky, but it's been quite
stable for me over these last few weeks.
Comment 12 John (J5) Palmieri 2005-08-25 16:34:10 UTC
Wow it is only in the client but it is pretty dramatic.  I have to figure out
how to add this to the regression test.  I'm going to look at the codepath to
see if I can spot the problem.  It could be a stupid Pyrex thing as I have seen
issues on the mailing list.  Unreffing the message is almost certantly wrong
since it will be used later in the code and there is nothing that should ref it
a second time that I can see.
Comment 13 John (J5) Palmieri 2005-08-25 19:49:29 UTC
Ok, I fixed it and am going to check it in soon.  You won't belive the fix. 
Basicly __del__ wasn't being called on the object so the c return messages
wasn't getting unreffed which is why your patch seemed to work though for the
life of me I don't understand why it wasn't crashing.  

Anyway this wasn't happening in any of the other messages we created as they
were all being destroyed properly.  So I looked at all them and tryed to see
what was different.  After four hours I finally noticed that when creating all
the other objects I was using a subclass of Message so I subclassed Message to
EmptyMessage and lo and behold the things started calling __del__.  

I really need to rewrite the backend in C.  Ok, enough of my bitching.  Should
be in CVS soon.

 
Comment 14 John (J5) Palmieri 2005-08-25 19:50:25 UTC
you were taken off the CC list again

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.