Xserver 1.5.99.902
Intel driver git head (38a7683561cee7fffab174c2a166bfd51b51ba27)
drm git head (a6dd0afa87558a670f970e61b023f45a396539eb)
Ubuntu Jaunty (unstable), 64-bit distribution
GM965 hardware
Kernel 2.6.29-rc6 git head (f7e603ad8f78cd3b59e33fa72707da0cbabdf699)
Not using KMS.

lsof shows many file descriptors used by the Xorg process with the Intel driver. Counting them with (lsof | grep drm | wc -l):

shortly after startx: 1848
right after a VT switch: 2322
after another VT switch: 2668
after a third VT switch: 3263

(this all within approx 2 minutes)
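For anyone reproducing the count, here is a small sketch of the same measurement done per-process via /proc instead of a global lsof (the function name is made up; pass the Xorg pid, e.g. from pidof Xorg):

```shell
#!/bin/sh
# Count "drm mm object" mappings held by one process, as lsof would report them.
count_drm_maps() {
    # $1: pid of the process to inspect
    if [ -r "/proc/$1/maps" ]; then
        grep -c 'drm mm object' "/proc/$1/maps" || true
    else
        echo 0
    fi
}

# Example: count for the current shell (normally 0, since it maps no GEM objects)
count_drm_maps $$
```

Running this before and after each VT switch gives the same growth curve as the lsof pipeline, without counting other processes' references.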
You're counting X's reference to "drm mm object" right?
(In reply to comment #1)
> You're counting X's reference to "drm mm object" right?

Yes, those are references to "drm mm object", but I believe that each reference corresponds to a single file descriptor. Moreover, the driver also leaks cache objects. Using the filecache patch, I can see such objects in the file cache:

# filecache 1.0
# ino size cached cached% refcnt state dev file
1044256 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044255 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044254 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044253 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044252 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044251 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044250 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044249 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)
1044248 8 8 100 1 -- 00:08(tmpfs) /drm\040mm\040object\040(deleted)

and the number of these objects is growing. Right now it results in 500MB of undroppable cache.
Those files are where your pixmaps and other graphics objects are stored. They do not consume fds, but they are open files. We have longer lifetimes on these objects than we should, but there shouldn't be any actual leaks -- you'll reach a steady state at some point.
(In reply to comment #3)
> Those files are where your pixmaps and other graphics objects are stored. They
> do not consume fds, but they are open files.
>
> We have longer lifetimes on these objects than we should, but there shouldn't
> be any actual leaks -- you'll reach a steady state at some point.

So, should I believe that 900MB of cached drm mm objects and about 10k file descriptors is quite OK?
Not OK (unless your apps are allocating that much), but I'm trying to get at a question: is it a leak (memory use increases continually, a problem I don't see), or is the steady state simply too big (a problem I do see, to a much more limited extent, and for which there are some potential fixes)? Of course, I don't know what your desktop environment is like, so it's hard to speculate on what you might be seeing.
Hi, I'm seeing the same problem. I'm using a similar laptop to Lukas's, a T61. When I start plain Xorg with only an xterm window and nothing else, here is the output of cat /proc/dri/0/gem_objects after each switch between console and Xorg:

604 objects
103354368 object bytes
6 pinned
50532352 pin bytes
50622464 gtt bytes
218152960 gtt total

1190 objects
105766912 object bytes
6 pinned
50532352 pin bytes
50622464 gtt bytes
218152960 gtt total

1772 objects
57716736 object bytes
0 pinned
0 pin bytes
0 gtt bytes
218152960 gtt total

2359 objects
60260352 object bytes
6 pinned
50532352 pin bytes
50622464 gtt bytes
218152960 gtt total

2945 objects
62672896 object bytes
6 pinned
50532352 pin bytes
50622464 gtt bytes
218152960 gtt total

I'm also seeing growing smaps for Xorg, full of these entries:

7fc6cee8c000-7fc6cee8d000 rw-s 00000000 00:08 34863 /drm mm object (deleted)
Size:              4 kB
Rss:               4 kB
Pss:               4 kB
Shared_Clean:      0 kB
Shared_Dirty:      0 kB
Private_Clean:     0 kB
Private_Dirty:     4 kB
Referenced:        4 kB
Swap:              0 kB
KernelPageSize:    4 kB
MMUPageSize:       4 kB

I've tried to track the free routines in drm_gem.c, but they are not easy to follow. I assume there is probably a path that leads to partial deallocation of the object, so it is removed from the list of GEM objects but still keeps referenced pages. I hope this helps.

BTW, I've opened this RH bugzilla for the same issue:
https://bugzilla.redhat.com/show_bug.cgi?id=487552
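To watch the growth without eyeballing the whole dump, a tiny sketch that extracts just the object count from a gem_objects dump (the /proc/dri/0/gem_objects path is the one used above; the function name is made up):

```shell
#!/bin/sh
# Print just the "N objects" count from a gem_objects dump read on stdin.
gem_object_count() {
    # The first line of the dump looks like "604 objects"; print its first field.
    awk '/ objects$/ { print $1; exit }'
}

# Usage on a live system: gem_object_count < /proc/dri/0/gem_objects
# Demo with the first sample quoted above:
printf '604 objects\n103354368 object bytes\n' | gem_object_count   # prints 604
```

Logging that number after every VT switch makes the roughly-600-per-switch growth reported here easy to graph.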
I've added some debug prints to the kernel driver. This is called during the switch from console to Xorg, and I assume it is what creates the new GEM objects:

Call Trace:
 [<ffffffffa04fb6f5>] drm_gem_handle_create+0xd5/0xe0 [drm]
 [<ffffffffa052fb75>] i915_gem_create_ioctl+0x65/0xc0 [i915]
 [<ffffffffa04f9c38>] ? drm_ioctl+0x248/0x360 [drm]
 [<ffffffffa04f9afe>] drm_ioctl+0x10e/0x360 [drm]
 [<ffffffffa052fb10>] ? i915_gem_create_ioctl+0x0/0xc0 [i915]
 [<ffffffff802f015c>] vfs_ioctl+0x7c/0xa0
 [<ffffffff8054d178>] ? _spin_unlock_irqrestore+0x48/0x80
 [<ffffffff802f04bb>] do_vfs_ioctl+0x33b/0x5d0
 [<ffffffff8054dd83>] ? error_sti+0x5/0x6
 [<ffffffff8020c8bc>] ? sysret_check+0x27/0x62
 [<ffffffff8020c8bc>] ? sysret_check+0x27/0x62
 [<ffffffff802f07d1>] sys_ioctl+0x81/0xa0
 [<ffffffff8020c88b>] system_call_fastpath+0x16/0x1b

Only one object free is called when switching from Xorg back to the console. So my assumption is that i915_gem_create_ioctl() allocates a lot of new objects, but only one of them is released on the switch back. Also, all objects are released when Xorg is killed, so there appears to be no leak inside the kernel driver itself. I hope these hints make the problem easier to find.
OK, I have several findings. I noticed that the bo cache in libdrm-intel is currently unlimited, which is questionable, but OK. So I put a limit on the bo cache like this:

diff --git a/libdrm/intel/intel_bufmgr_gem.c b/libdrm/intel/intel_bufmgr_gem.c
index 9e49d7c..74ba8fd 100644
--- a/libdrm/intel/intel_bufmgr_gem.c
+++ b/libdrm/intel/intel_bufmgr_gem.c
@@ -1160,7 +1160,7 @@ drm_intel_bufmgr_gem_enable_reuse(drm_intel_bufmgr *bufmgr
     int i;

     for (i = 0; i < DRM_INTEL_GEM_BO_BUCKETS; i++) {
-        bufmgr_gem->cache_bucket[i].max_entries = -1;
+        bufmgr_gem->cache_bucket[i].max_entries = 256/(i+1);
     }
 }

I hoped this could help, and it did: if I close firefox, sunbird, and psi, the number of drm mm objects falls to something like 1900. Another question is what on earth the X server uses so many objects for when running just gnome-panel + gnome-terminal, but whatever. What worries me more is why the X server allocates about 4-6MB more of RSS memory at every VT switch, and why it allocates about 500 NEW drm mm objects at every VT switch.

One last observation: I put some fprintf calls into libdrm-intel around the bo allocs/frees, which resulted in bad rendering artefacts (text where random letters had a different color and such). If I remove the fprintfs (and I'm pretty sure it is ONLY the fprintfs), everything seems to be fine. So could there be some races?
More findings. Setting a limit on the cache size causes lockups: the cursor keeps moving but nothing else happens. So it seems that bo reuse is somewhat broken once any bo is actually freed.

The cause of the drm mm object leak is the function gen4_render_state_init(). It is called in EnterVT, and it *always* creates 600 *new* BOs; it appears to never reuse BOs. And if the cache size is set to unlimited, no BO free ever actually happens.
BTW, would it be sane to kick out BOs older than, e.g., 5 minutes, instead of using counter-based limits?
Lukas, could this be closed after your fix?

commit d4c64f01b9429a8fb314e43f40d1f02bb8aab30f
Author: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Date:   Wed Mar 4 17:33:27 2009 -0500

    Fix serious memory leak at Enter/LeaveVT
Yes, I'm closing it.