Summary: | Memory leak in somewhere in __glXDisp_DrawArrays | ||
---|---|---|---|
Product: | Mesa | Reporter: | Ben Gamari <bgamari> |
Component: | Drivers/DRI/i965 | Assignee: | Xorg Project Team <xorg-team> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | major | ||
Priority: | medium | CC: | bugs-freedesktop, lists_ravi, mikko.cal |
Version: | git | Keywords: | NEEDINFO |
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Valgrind log of leaking Xorg after graceful shutdown
Valgrind log
Another valgrind log
Very preliminary fix (maybe) for memory leak |
Description
Ben Gamari
2008-06-11 18:51:37 UTC
Created attachment 17069 [details]
Valgrind log of leaking Xorg after graceful shutdown
Considering the rate of the leak, it's pretty serious.

On Wed, Jun 11, 2008 at 18:51:39 -0700, bugzilla-daemon@freedesktop.org wrote:
> Attached is a valgrind log from running gnome-terminal spewing text under a
> compositing manager (compiz). As can be seen, a substantial amount of memory
> (8MB) is lost in exaGlyphs despite a very short sample duration (5 minutes or
> so). I believe this leak brought my laptop with 4GB of RAM to its knees
> thrashing earlier today after only a few hours of use.

Is this using pixman 0.11.x? If so, it might be the same as
https://bugs.freedesktop.org/show_bug.cgi?id=16312.

Strictly speaking, I'm running pixman from git but yes, that looks like it might be the issue. I'll try the patch and we'll find out soon enough.

(In reply to comment #3)
> On Wed, Jun 11, 2008 at 18:51:39 -0700, bugzilla-daemon@freedesktop.org wrote:
>
> > Attached is a valgrind log from running gnome-terminal spewing text under a
> > compositing manager (compiz). As can be seen, a substantial amount of memory
> > (8MB) is lost in exaGlyphs despite a very short sample duration (5 minutes or
> > so). I believe this leak brought my laptop with 4GB of RAM to its knees
> > thrashing earlier today after only a few hours of use.
>
> Is this using pixman 0.11.x? If so, it might be the same as
> https://bugs.freedesktop.org/show_bug.cgi?id=16312.
>

Well, initial signs don't look too promising. After applying the patch, xorg still grows by a hell of a lot (a few tenths of a percent of my 4GB) every time I unmap and map a firefox window (which triggers the leak quite nicely, apparently).

(Renaming this bug since the exaGlyphs leak seems to be taken care of)

It seems that the pixman patch did help; however, there is also another completely unrelated leak in mesa which is causing issues (as I mentioned earlier). Unfortunately, I can't isolate the exact source because, for some reason, valgrind refuses to give line number information for mesa symbols.
I've had this issue dozens of times before but still haven't found a solution. Is there some trick to getting mesa debug symbols to work properly with the Xorg module loader? It would be great if so. Otherwise, all I know is that the memory allocation is a calloc somewhere within a callee of __glXDisp_DrawArrays. Any ideas?

- Ben

(In reply to comment #5)
> Well, initial signs don't look too promising. After applying the patch, xorg
> still grows by a hell of a lot (a few tenths of a percent of my 4GB) every time
> I unmap and map a firefox window (which triggers the leak quite nicely,
> apparently).
>

(In reply to comment #6)
> Otherwise, all I know is that the memory allocation is a calloc somewhere within
> a callee of __glXDisp_DrawArrays. Any ideas?

Maybe try with gdb or another leak debugging tool like memprof.

> (In reply to comment #5)
> > After applying the patch, xorg still grows by a hell of a lot (a few tenths of
> > a percent of my 4GB) every time I unmap and map a firefox window (which
> > triggers the leak quite nicely, apparently).

So does the amount of memory apparently leaked in __glXDisp_DrawArrays correlate to the number of times you (un)map a Firefox window?

I'm probably hit by this leak too. Unfortunately I have no idea how to use tools like valgrind and such. I'm running mesa, xserver from git together with xf86-video-ati also from git. Patching/downgrading pixman doesn't solve my leak. Bringing up and down a single window a couple times causes X to eat around 10 mb. If there's any more info I should provide, please let me know.

(In reply to comment #7)
> (In reply to comment #6)
> > Otherwise, all I know is that the memory allocation is a calloc somewhere within
> > a callee of __glXDisp_DrawArrays. Any ideas?
>
> Maybe try with gdb or another leak debugging tool like memprof.
>
>
> > (In reply to comment #5)
> > > After applying the patch, xorg still grows by a hell of a lot (a few tenths of
> > > a percent of my 4GB) every time I unmap and map a firefox window (which
> > > triggers the leak quite nicely, apparently).
>
> So does the amount of memory apparently leaked in __glXDisp_DrawArrays
> correlate to the number of times you (un)map a Firefox window?
>

Yes, memory usage grows every time the window is mapped/unmapped.

Note that this only occurs under compiz. I'm not running compiz at the moment and memory usage is quite normal (steady at 4%). Moreover, I generally run compiz with the genie effect for minimize/restore, hence the DrawArrays (being used to draw the distorted window during the animation).

Has anyone else experienced issues with mesa and debugging symbols? For some reason this has been a persistent issue in my debugging attempts and has greatly frustrated my efforts. All symbols other than those in mesa are recognized just fine. Does the default mesa build strip symbols?

(In reply to comment #9)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > > > After applying the patch, xorg still grows by a hell of a lot (a few tenths of
> > > > a percent of my 4GB) every time I unmap and map a firefox window (which
> > > > triggers the leak quite nicely, apparently).
> >
> > So does the amount of memory apparently leaked in __glXDisp_DrawArrays
> > correlate to the number of times you (un)map a Firefox window?
>
> Yes, memory usage grows every time the window is mapped/unmapped.

Note that that's not exactly what I asked. :)

> Note that this only occurs under compiz. I'm not running compiz at the moment
> and memory usage is quite normal (steady at 4%). Moreover, I generally run
> compiz with the genie effect for minimize/restore, hence the DrawArrays
> (being used to draw the distorted window during the animation).
I can't seem to reproduce this - my X server's memory usage remains constant while minimizing and unminimizing a window a couple of times with the genie effect. This could indicate that the leak is caused by the Mesa driver (r300 here) or compiz(-fusion) (Debian 0.7.6 packages).

> Does the default mesa build strip symbols?

I don't think so, but maybe CFLAGS doesn't contain -g with your build configuration?

It's not a Compiz bug, because maximizing+minimizing a window with Kwin, in KDE 4, takes 4mb ram every time for me, but only with "Desktop Effects" enabled...

(In reply to comment #11)
> It's not a Compiz bug, because maximizing+minimizing a window with Kwin, in KDE
> 4, takes 4mb ram every time for me, but only with "Desktop Effects" enabled...

I don't think we can be sure at this point that you're seeing the same problem as Ben.

I just recompiled mesa with the newly discovered --enable-debug configure flag. I'll do another valgrind run as soon as I'm around a remote machine.

(In reply to comment #10)
> Note that that's not exactly what I asked. :)
>
> I can't seem to reproduce this - my X server's memory usage remains constant
> while minimizing and unminimizing a window a couple of times with the genie
> effect. This could indicate that the leak is caused by the Mesa driver (r300
> here) or compiz(-fusion) (Debian 0.7.6 packages).
>
> > Does the default mesa build strip symbols?
>
> I don't think so, but maybe CFLAGS doesn't contain -g with your build
> configuration?
>

(In reply to comment #12)
> (In reply to comment #11)
> > It's not a Compiz bug, because maximizing+minimizing a window with Kwin, in KDE
> > 4, takes 4mb ram every time for me, but only with "Desktop Effects" enabled...
>
> I don't think we can be sure at this point that you're seeing the same problem
> as Ben.
>

(In reply to comment #12)
>
> I don't think we can be sure at this point that you're seeing the same problem
> as Ben.
>

If only I could run X in Valgrind..
But I haven't figured out how to do it. Any ideas?
This is a 3mb video that shows you what I'm talking about.. To me it looks the same leak Ben is experiencing.
http://rapidshare.com/files/123836519/out-1.ogv.html

(In reply to comment #14)
> (In reply to comment #12)
> >
> > I don't think we can be sure at this point that you're seeing the same problem
> > as Ben.
> >
>
> If only I could run X in Valgrind.. But I haven't figured out how to do it. Any
> ideas?
> This is a 3mb video that shows you what I'm talking about.. To me it looks the
> same leak Ben is experiencing.
> http://rapidshare.com/files/123836519/out-1.ogv.html
>

It really helps to have another computer. I SSH in to the machine I'm testing and run "valgrind --leak-check=full --show-reachable=yes X > x-valgrind.txt" and in another screen terminal run the following script,

    #!/bin/bash
    export DISPLAY=:0
    export LIBGL_ALWAYS_INDIRECT=1
    compiz --replace ccp &
    gtk-window-decorator &
    gnome-terminal &
    firefox &

I can then minimize and maximize firefox to my heart's delight (at least until I run out of memory ;-) ). When I'm done abusing mesa, I just Ctrl+Alt+Bksp, which I think should clean everything up.

(In reply to comment #15)
>
> It really helps to have another computer. I SSH in to the machine I'm testing
> and run "valgrind --leak-check=full --show-reachable=yes X > x-valgrind.txt"

Yes, I can ssh into the machine... I get:

    Warning: Can't execute setuid/setgid executable: /usr/bin/X
    valgrind: /usr/bin/X: Permission denied

I tried with root also, but still same error.

> and in another screen terminal run the following script,
>

You mean another terminal in which machine? The one you ssh from?
Thanks for helping me out!

(In reply to comment #16)
> (In reply to comment #15)
> >
> > It really helps to have another computer. I SSH in to the machine I'm testing
> > and run "valgrind --leak-check=full --show-reachable=yes X > x-valgrind.txt"
>
> Yes, I can ssh into the machine... I get:
>
> Warning: Can't execute setuid/setgid executable: /usr/bin/X
> valgrind: /usr/bin/X: Permission denied
>
> I tried with root also, but still same error.
>
>
> > and in another screen terminal run the following script,
> >
>
> You mean another terminal in which machine? The one you ssh from?
> Thanks for helping me out!
>

When I SSH in, the first thing I do is start a screen session (take a look at man screen). It looks like valgrind just doesn't like running setuid executables. At the risk of screwing up your Xorg configuration, you might want to try clearing the setuid/setgid bit (chmod ugo-s /usr/bin/X /usr/bin/Xorg). That might help.

Created attachment 17260 [details]
Valgrind log
See if this is any useful please?
(In reply to comment #18)
> Created an attachment (id=17260) [details]
> Valgrind log
>
> See if this is any useful please?
>

Well, judging by the following, looks like you have the same issue with mesa debugging symbols as I have,

    ==2154== 17,440 bytes in 4 blocks are definitely lost in loss record 124 of 134
    ==2154==    at 0x4C20454: calloc (vg_replace_malloc.c:397)
    ==2154==    by 0x15B5ADB1: ???
    ==2154==    by 0x15B4F84B: ???
    ==2154==    by 0x15B52ECF: ???
    ==2154==    by 0x15BEF665: ???
    ==2154==    by 0x15BEFC26: ???
    ==2154==    by 0x15BE84D6: ???
    ==2154==    by 0x15BE3937: ???
    ==2154==    by 0x15C6F9B6: ???
    ==2154==    by 0x83C8D05: __glXDisp_Render (in /usr/lib64/opengl/xorg-x11/extensions/libglx.so)
    ==2154==    by 0x83CCF61: __glXDispatch (in /usr/lib64/opengl/xorg-x11/extensions/libglx.so)
    ==2154==    by 0x44EC23: Dispatch (in /usr/bin/Xorg)

Regardless, I don't see any huge outstanding allocations in mesa. What was Xorg's memory usage by the time you terminated it?

Created attachment 17300 [details]
Another valgrind log
I compiled mesa with --enable-debug, is this any better?
Ben, X memory usage depends on how many times I minimize/maximize a window.
As you can see from the video, it takes around 4mb each time..
(In reply to comment #20)
> Created an attachment (id=17300) [details]
> Another valgrind log
>
> I compiled mesa with --enable-debug, is this any better?

Thanks for doing that. I've been meaning to try --enable-debug for some time. Anyways, strangely it doesn't appear that it helped the unidentified mesa symbols. Moreover, I'm not seeing any leaks that might be from compiz. The largest allocation I can find is 60k in __glXDRIscreenCreateDrawable.

I also checked to see if killing compiz would cause Xorg's memory usage to drop again. As can be expected, a large portion of Xorg's increased memory consumption remains after compiz is killed.

Regardless, on examining the log a bit more closely, it seems quite strange that 54MB are lost in NewModuleDesc. Looking back on my own results, I haven't seen similar leaks. The same goes for the 113MB in _XSERVTransMakeAllCOTSServerListeners. How are you killing the xserver? It seems possible it's not getting a chance to cleanup.

> Ben, X memory usage depends on how many times I minimize/maximize a window.
> As you can see from the video, it takes around 4mb each time..
>

Hmm, interesting, in my case it seems to be more like 10MB. In that case, the leak seems like probably a function of the window size (I'm using a maximized 1920x1200 Firefox window).

(In reply to comment #21)
> As can be expected, a large portion of Xorg's increased memory consumption
> remains after compiz is killed.

And does it start growing again immediately if you start compiz again and trigger the leak?

(In reply to comment #21)
>
> Regardless, on examining the log a bit more closely, it seems quite strange
> that 54MB are lost in NewModuleDesc. Looking back on my own results, I haven't
> seen similar leaks. The same goes for the 113MB in
> _XSERVTransMakeAllCOTSServerListeners. How are you killing the xserver? It
> seems possible it's not getting a chance to cleanup.
>

I kill it with CTRL+ALT+Backspace

> > Ben, X memory usage depends on how many times I minimize/maximize a window.
> > As you can see from the video, it takes around 4mb each time..
> >
> Hmm, interesting, in my case it seems to be more like 10MB. In that case, the
> leak seems like probably a function of the window size (I'm using a maximized
> 1920x1200 Firefox window).
>

I don't use compiz, but Kwin in KDE 4. And the memory isn't coming back when I kill the whole KDE, unless I kill X, of course...
And my resolution is 1280x800, so maybe that's why it's less than yours??
Well, tell me if there's something more I can do :)

(In reply to comment #23)
> (In reply to comment #21)
> >
> > Regardless, on examining the log a bit more closely, it seems quite strange
> > that 54MB are lost in NewModuleDesc. Looking back on my own results, I haven't
> > seen similar leaks. The same goes for the 113MB in
> > _XSERVTransMakeAllCOTSServerListeners. How are you killing the xserver? It
> > seems possible it's not getting a chance to cleanup.
> >
>
> I kill it with CTRL+ALT+Backspace

Hmm, anyone have any thoughts on the above allocations? I can't reproduce this and it seems a bit fishy.

>
> > > Ben, X memory usage depends on how many times I minimize/maximize a window.
> > > As you can see from the video, it takes around 4mb each time..
> > >
> > Hmm, interesting, in my case it seems to be more like 10MB. In that case, the
> > leak seems like probably a function of the window size (I'm using a maximized
> > 1920x1200 Firefox window).
> >
>
> I don't use compiz, but Kwin in KDE 4. And the memory isn't coming back when I
> kill the whole KDE, unless I kill X, of course...
> And my resolution is 1280x800, so maybe that's why it's less than yours??
> Well, tell me if there's something more I can do :)
>

Yep, the resolution seems like a reasonable explanation. Weird stuff. Tonight I'll do another set of valgrind tests to see if I can get some better backtraces.
Someone somewhere must have a good explanation concerning the missing symbols. It's just really frustrating trying to find that individual.

Alright, there is definitely some inconsistency in the behavior. I just noticed while looking at top that there was a period where Xorg's memory usage remained constant despite repeated minimizes/maximizes.

Haha! With a combination of mtrace and gdb, it looks like I managed to find the leak. It appears to be in dri_bufmgr_fake.c:985 in dri_fake_emit_reloc().

    if (reloc_fake->relocs == NULL) {
        reloc_fake->relocs = malloc(sizeof(struct fake_buffer_reloc) * MAX_RELOCS);
    }

Any ideas where/why this isn't getting freed?

If this helps, here is a gdb log of the function (although I don't know if this particular call leaked, I would suspect it did). Note how relocs is 0x0. If only I knew what a reloc was.

    Breakpoint 1, 0x00007f90901fc03a in dri_fake_emit_reloc (reloc_buf=0x2d669f0, flags=33554433, delta=0, offset=68, target_buf=0xe8f740) at ../common/dri_bufmgr_fake.c:985
    985 reloc_fake->relocs = malloc(sizeof(struct fake_buffer_reloc) *
    (gdb) print (dri_bo_fake)*reloc_buf
    $3 = {bo = {size = 16384, offset = 18446744073709551615, virtual = 0x1203880, bufmgr = 0xb10f90},
      id = 2821, name = 0x7f90903c628c "batchbuffer", dirty = 1, size_accounted = 1, card_dirty = 0,
      refcount = 1, flags = 0, alignment = 4096, is_static = 0 '\0', validated = 0 '\0', map_count = 1,
      validate_flags = 0, relocs = 0x0, nr_relocs = 0, block = 0x0, backing_store = 0x1203880,
      invalidate_cb = 0, invalidate_ptr = 0x0}
    (gdb)

Backtrace,

    (gdb) bt
    #0 0x00007f90901fc03a in dri_fake_emit_reloc (reloc_buf=0x2e6e7a0, flags=33554435, delta=0, offset=0, target_buf=0xf78030) at ../common/dri_bufmgr_fake.c:985
    #1 0x00007f909023d478 in prepare_wm_surfaces (brw=0xadc460) at brw_wm_surface_state.c:390
    #2 0x00007f909022477e in brw_validate_state (brw=0xadc460) at brw_state_upload.c:223
    #3 0x00007f90902197fd in brw_draw_prims (ctx=0xadc460, arrays=0xb34e58, prim=0x7fffaa473ff0, nr_prims=1, ib=0x0, min_index=0, max_index=11) at brw_draw.c:315
    #4 0x00007f90902cf1b8 in vbo_exec_DrawArrays (mode=7, start=0, count=12) at vbo/vbo_exec_array.c:263
    #5 0x00007f90a19dfc2c in __glXDisp_DrawArrays (pc=0x129d9dc "") at render2.c:248
    #6 0x00007f90a19d9f46 in __glXDisp_Render (cl=<value optimized out>, pc=0x129d9ac "\030\001") at glxcmds.c:1791
    #7 0x00007f90a19de1c2 in __glXDispatch (client=0x9b3f80) at glxext.c:492
    #8 0x000000000044e744 in Dispatch () at dispatch.c:448
    #9 0x0000000000433ecd in main (argc=1, argv=0x7fffaa4742c8, envp=<value optimized out>) at main.c:415

Am I the only one who thinks that bugs #16316 and #16190 sharing dri_fake_emit_reloc is a little more than coincidence?

Anyone have any input here? It seems like now that we have a backtrace, someone probably has some theories about the leak. Care to share?

I spoke with anholt and jbarnes last night and it seems that dri_fake_bo_unreference() is missing a free. I have the change ready for testing but I haven't had a chance to test yet. Very preliminary untested patch attached.

Created attachment 17479 [details] [review]
Very preliminary fix (maybe) for memory leak

Ben, the patch seems superfluous/incorrect if you look at line 684 where bo_fake->relocs is free'd. You want to do it before the debug statement. I hope I have missed something since this is a bug I badly want fixed.

(In reply to comment #33)
> Ben, the patch seems superfluous/incorrect if you look at line 684 where
> bo_fake->relocs is free'd. You want to do it before the debug statement. I hope
> I have missed something since this is a bug I badly want fixed.
>

Yep, you're absolutely right. That patch is useless (and will crash the server).

Ben, could you check whether the patch in comment 6 (from krh) from the following mitigates your issue?
https://bugzilla.redhat.com/show_bug.cgi?id=454117

That commit fixed it for me :)

(In reply to comment #35)
> Ben, could you check whether the patch in comment 6 (from krh) from the
> following mitigates your issue?
> https://bugzilla.redhat.com/show_bug.cgi?id=454117
>

Ben, have you tried the fix in this bug?

Time out. Mark fixed per comment from mikko.

Mass version move, cvs -> git |
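For readers following the dri_fake_emit_reloc() discussion above, the ownership pattern being debated can be sketched in isolation: an array allocated lazily on first use has to be released exactly once when the owning buffer's reference count reaches zero. The code below is a simplified illustration under stated assumptions, not Mesa's implementation. Every name in it (toy_bo, toy_emit_reloc, toy_bo_unreference, TOY_MAX_RELOCS, the toy_live_allocs counter) is hypothetical, and it does not represent the fix that eventually resolved this bug.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_RELOCS 4096 /* hypothetical capacity, not Mesa's MAX_RELOCS */

struct toy_reloc {
    unsigned offset;
    unsigned delta;
};

struct toy_bo {
    int refcount;
    struct toy_reloc *relocs; /* allocated lazily on the first emit */
    unsigned nr_relocs;
};

/* Counts outstanding allocations so a leak shows up as a nonzero value. */
int toy_live_allocs = 0;

static void *toy_alloc(size_t size)
{
    void *p = calloc(1, size); /* zeroed, so relocs starts out NULL */
    if (p != NULL)
        toy_live_allocs++;
    return p;
}

static void toy_free(void *p)
{
    if (p != NULL) {
        toy_live_allocs--;
        free(p);
    }
}

struct toy_bo *toy_bo_create(void)
{
    struct toy_bo *bo = toy_alloc(sizeof(*bo));
    if (bo != NULL)
        bo->refcount = 1;
    return bo;
}

/* Mirrors the lazy malloc quoted in the thread: the array only exists
 * once a relocation has been emitted against this buffer. */
int toy_emit_reloc(struct toy_bo *bo, unsigned offset, unsigned delta)
{
    if (bo->relocs == NULL) {
        bo->relocs = toy_alloc(sizeof(struct toy_reloc) * TOY_MAX_RELOCS);
        if (bo->relocs == NULL)
            return -1;
    }
    if (bo->nr_relocs >= TOY_MAX_RELOCS)
        return -1;
    bo->relocs[bo->nr_relocs].offset = offset;
    bo->relocs[bo->nr_relocs].delta = delta;
    bo->nr_relocs++;
    return 0;
}

/* Drops one reference. The relocs array is freed once, on the last
 * unreference; adding a second free elsewhere would be a double free. */
void toy_bo_unreference(struct toy_bo *bo)
{
    if (bo == NULL)
        return;
    assert(bo->refcount > 0);
    if (--bo->refcount > 0)
        return;
    toy_free(bo->relocs);
    bo->relocs = NULL;
    toy_free(bo);
}
```

If the final unreference never happens, for instance because some other object still holds a reference, toy_live_allocs stays above zero even though the free path itself is correct. That is one plausible reading of why the extra free in the preliminary patch did not help, though the thread does not confirm the actual root cause.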