Summary: | [r600g] Memory leak when playing WoW with RV790 | ||
---|---|---|---|
Product: | Mesa | Reporter: | Chris Rankin <rankincj> |
Component: | Drivers/Gallium/r600 | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
dmesg log showing memory allocation failure.
Valgrind output from 32 bit WoW
2nd valgrind output from 32 bit WoW
apitrace output for 32 bit WoW (1)
apitrace output for 32 bit WoW (2)
apitrace output for 32 bit WoW (3)
apitrace output for 32 bit WoW (4)
apitrace output for 32 bit WoW (5)
apitrace output for 32 bit WoW (6)
apitrace output for 32 bit WoW (7)
apitrace output for 32 bit WoW (8)
dmesg output from 3.13.9
Xorg.0.log showing errors when exiting WoW
Xorg backtrace when WoW fails to exit
dmesg output with 3.14.2 |
It looks like the first bad commit is this one:

commit ed42e95404a51298ea878a0d1cdcbc473612706a
Author: Marek Olšák <marek.olsak@amd.com>
Date: Wed Jan 22 02:49:53 2014 +0100

    r600g,radeonsi: consolidate remaining obviously duplicated pipe_screen code

    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

I have definitely reproduced the problem with this commit, but failed to reproduce it with only the one before:

commit 65dc588bfd3b8145131340ffe77f216be58378ac
Author: Marek Olšák <marek.olsak@amd.com>
Date: Wed Jan 22 02:42:20 2014 +0100

    r600g,radeonsi: consolidate get_compute_param

(In reply to comment #1)
> It looks like the first bad commit is this one:
>
> commit ed42e95404a51298ea878a0d1cdcbc473612706a
> Author: Marek Olšák <marek.olsak@amd.com>
> Date: Wed Jan 22 02:49:53 2014 +0100
>
> r600g,radeonsi: consolidate remaining obviously duplicated pipe_screen
> code

I don't think this is it. These functions are called once per process.

(In reply to comment #2)
> I don't think this is it. These functions are called once per process.

Regardless, ed42e95404a51298ea878a0d1cdcbc473612706a is definitely "bad" whereas "65dc588bfd3b8145131340ffe77f216be58378ac" failed to crash after over 2 hours of play. I really cannot say anything else at this time.

I have finally managed to generate an "out-of-memory" condition while playing WoW with git HEAD at:

commit f5bd5568abcc234c1c2b6a4bb67b880706f3caed
Author: Mark Mueller <MarkKMueller@gmail.com>
Date: Tue Jan 21 22:37:20 2014 -0800

    mesa: Fix Type A _INT formats to MESA_FORMAT naming standard

So the bottom line is that I cannot bisect this, because not only do I have no reliable means of identifying a "GOOD" commit, but I also have no idea where the first "GOOD" commit might be.

Please try getting more information about the leak(s) with valgrind --leak-check=full.
(In reply to comment #5)
> Please try getting more information about the leak(s) with valgrind
> --leak-check=full.

Do I need to do anything "special" to valgrind WoW.exe, seeing as it must be invoked using wine?

I found this helpful for setting up wine and valgrind together:

http://wiki.winehq.org/Wine_and_Valgrind

You may need to recompile wine after installing valgrind, as mentioned in the wiki. For a similar issue in Diablo III, I could not get the game to run with valgrind, so I used apitrace to record a session and ran valgrind on the trace (after much help from Michel and Ilia). Hope this helps.

Oh, make sure to compile Mesa with debug symbols or you'll need to repeat the whole process. I forgot that the first time 'round.

(In reply to comment #7)
> You may need to recompile wine after installing valgrind, as mentioned in
> the wiki.

There is no "re"-compile of wine - it either works with Fedora's debuginfo package or it doesn't.

Has anyone ever "valground" a 32 bit executable on a box which is natively 64 bit, please? This bug is currently making it impossible to run Wow-64.exe:

http://bugs.winehq.org/show_bug.cgi?id=35582

That in itself isn't an issue - the memory leak occurs with both 32 bit and 64 bit WoW. However, the following command is failing:

$ valgrind --trace-children=yes --leak-check=full /usr/bin/wine /opt/wine/World\ of\ Warcraft/Wow.exe -opengl -noautoload64bit

with this error:

valgrind: failed to start tool 'memcheck' for platform 'amd64-linux': No such file or directory

Which I assume means that Valgrind is trying to use 64 bit tools on a 32 bit executable.

(In reply to comment #9)
> valgrind: failed to start tool 'memcheck' for platform 'amd64-linux': No
> such file or directory

More specifically: Valgrind is falling back to the x86_64 platform when executing the "--trace-children=yes" option!

Created attachment 95119 [details]
Valgrind output from 32 bit WoW
I have an extremely underpowered dual P4 box with a HD4670 AGP card that is capable of running 32 bit WoW, so I've tried to run valgrind on that. Here is the output.
Unfortunately, I was only able to get as far as the login screen as Blizzard rejected my login. (Too slow, perhaps? Or maybe they detected valgrind and disallowed me?) However, there are some interesting entries.
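For anyone wading through the attached memcheck logs, one way to surface the larger entries is to rank the loss-record headers by size. This is only a minimal sketch and not part of the original report: it assumes the usual memcheck header wording ("N bytes in M blocks are possibly lost in loss record ..."), and the name `summarize_losses` is made up for illustration.

```python
import re

# Header line of a memcheck loss record, for example:
#   ==13334== 302,736 bytes in 84 blocks are possibly lost in loss record 6,231 of 6,282
# The optional group skips a "(N direct, M indirect)" breakdown when present.
LOSS_RE = re.compile(
    r"^==\d+==\s+([\d,]+)(?: \([^)]*\))? bytes in ([\d,]+) blocks are "
    r"(definitely|indirectly|possibly) lost in loss record"
)


def summarize_losses(log_text):
    """Return (bytes, blocks, kind) tuples for every loss record, largest first."""
    records = []
    for line in log_text.splitlines():
        match = LOSS_RE.match(line)
        if match:
            size = int(match.group(1).replace(",", ""))
            blocks = int(match.group(2).replace(",", ""))
            records.append((size, blocks, match.group(3)))
    return sorted(records, reverse=True)
```

Reading the log file into a string and passing it to `summarize_losses` gives a list whose first few entries are the records most worth matching against the r600 call stacks.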
Created attachment 95121 [details]
2nd valgrind output from 32 bit WoW

Again with the HD4670 AGP and the latest git:

commit 079bff5a99fa19029fc0caba92fe57046ee29b23
Author: Anuj Phogat <anuj.phogat@gmail.com>
Date: Mon Mar 3 14:40:14 2014 -0800

    mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexIma

try:

$ apitrace trace /usr/bin/wine /opt/wine/World\ of\ Warcraft/Wow.exe -opengl -noautoload64bit

to record an apitrace of a little bit of play. It will generate a .trace file that you can run through valgrind (I believe same options as before, I can't recall). This may help you get a little further.

(In reply to comment #13)
> This may help you get a little further.

Actually, I'm hoping that the valgrind output from WoW on my native 32 bit box will be sufficient. It does seem to show a suspiciously large number of allocations in the r600 code.

(In reply to comment #14)
> Actually, I'm hoping that the valgrind output from WoW on my native 32 bit
> box will be sufficient.

If you could produce an apitrace reproducing the leaks as reported by valgrind, that might make it easier for us to reproduce and investigate the problem.
This one looks interesting, but I'm not sure yet how the memory ends up being leaked:

==13334== 302,736 bytes in 84 blocks are possibly lost in loss record 6,231 of 6,282
==13334==    at 0x400870E: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==13334==    by 0x8444B00: r600_texture_create_object (r600_texture.c:571)
==13334==    by 0x844587C: r600_texture_create (r600_texture.c:759)
==13334==    by 0x843FF69: r600_resource_create_common (r600_pipe_common.c:589)
==13334==    by 0x83DC5EF: r600_resource_create (r600_pipe.c:558)
==13334==    by 0x81D7A98: st_texture_create (st_texture.c:96)
==13334==    by 0x81AA32E: guess_and_alloc_texture (st_cb_texture.c:405)
==13334==    by 0x81AA476: st_AllocTextureImageBuffer (st_cb_texture.c:459)
==13334==    by 0x814C953: _mesa_store_compressed_teximage (texstore.c:4195)
==13334==    by 0x81A99D4: st_CompressedTexImage (st_cb_texture.c:823)
==13334==    by 0x8138540: teximage (teximage.c:3244)
==13334==    by 0x813A8D3: _mesa_CompressedTexImage2D (teximage.c:3913)

The apitrace is 155,721 KB and so cannot be uploaded. Rather than break it into > 50 fragments, does anyone have another location to upload it to please?

Created attachment 95249 [details]
apitrace output for 32 bit WoW (1)
Created attachment 95250 [details]
apitrace output for 32 bit WoW (2)
Created attachment 95252 [details]
apitrace output for 32 bit WoW (3)
Created attachment 95253 [details]
apitrace output for 32 bit WoW (4)
Created attachment 95254 [details]
apitrace output for 32 bit WoW (5)
Created attachment 95255 [details]
apitrace output for 32 bit WoW (6)
Created attachment 95256 [details]
apitrace output for 32 bit WoW (7)
Created attachment 95257 [details]
apitrace output for 32 bit WoW (8)
This is a less ambitious apitrace from 32 bit WoW. You can reconstruct the original file by:
$ cat wine-preloader-2.trace.xz.* > wine-preloader-2.trace.xz
The SHA1SUM should be: c969eb3169db84e26e50f29a4e4674058e5ec897
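If the shell is not handy, the same reassembly and checksum comparison can be sketched in Python. This is an illustrative sketch rather than anything from the original comment: the helper names are invented here, and it assumes the downloaded fragments sort lexicographically into the right order, as they would for the `cat` glob above.

```python
import glob
import hashlib

# Checksum quoted in the comment above.
EXPECTED_SHA1 = "c969eb3169db84e26e50f29a4e4674058e5ec897"


def sha1_of_chunks(chunks):
    """Return the hex SHA-1 of the concatenation of an iterable of byte strings."""
    digest = hashlib.sha1()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_fragments(pattern):
    """Yield the contents of each file matching pattern, in sorted name order."""
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as fragment:
            yield fragment.read()


def join_fragments(pattern, out_path):
    """Concatenate the fragments into out_path and return the SHA-1 of the result."""
    with open(out_path, "wb") as out:
        for data in read_fragments(pattern):
            out.write(data)
    with open(out_path, "rb") as joined:
        # Hash the joined file in 1 MiB reads so it need not fit in memory.
        return sha1_of_chunks(iter(lambda: joined.read(1 << 20), b""))
```

Calling `join_fragments("wine-preloader-2.trace.xz.*", "wine-preloader-2.trace.xz")` and comparing the result with `EXPECTED_SHA1` should confirm whether all eight attachments were downloaded intact.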
Created attachment 97051 [details]
dmesg output from 3.13.9
This memory leak is still present with 3.13.9 and HEAD 4ccff1499c956b51f18710c7308cbce883f64cd9.
Does the patch from bug 74868 help?

Thanks, I'll give it a try. Was Mesa leaking memory for *all* failed shaders, or just failed geometry shaders?
> I tried booting up Diablo III a few days back to see the new geometry shaders
> in action on my 4890...
AFAIK, the 4890 needs a 3.14+ kernel to get geometry shader support.
(In reply to comment #26)
> Does the patch from bug 74868 help?

Hmm, it hasn't OOM-ed yet. But one of the symptoms that I'd come to associate with the memory problem was an increasing jerkiness in the game play over time. That symptom at least is still present.

With kernel 3.15, you can watch GPU memory usage by setting:

GALLIUM_HUD=VRAM-usage,GTT-usage

You should be able to see if we leak GPU memory or not.

(In reply to comment #29)
> With kernel 3.15, you can watch GPU memory usage by setting:
> GALLIUM_HUD=VRAM-usage,GTT-usage

Is this support sufficiently non-invasive to be backported to 3.14-stable?

It's not a bug fix, so I doubt it would be accepted.

(In reply to comment #31)
> It's not a bug fix, so I doubt it would be accepted.

Perhaps not, but possibly worth asking Mr Greg KH if he'd be prepared to consider it anyway?

Created attachment 97325 [details]
Xorg.0.log showing errors when exiting WoW
One of the other errors that I've come to associate (rightly or wrongly) with the OOM problem is that it can take a long time to get keyboard/mouse control back after exiting WoW.
This is the Xorg.0.log file from an instance where I didn't get keyboard/mouse control back at all, and Xorg just chewed up 100% of one CPU instead.
(In reply to comment #33)
> This is the Xorg.0.log file from an instance where I didn't get
> keyboard/mouse control back at all, and Xorg just chewed up 100% of one CPU
> instead.

DRICloseScreen is a DRI1 function; I suspect the backtraces in the Xorg log file aren't reliable. It would be interesting to see where the Xorg process is spinning, e.g. by attaching gdb to it and getting a couple of backtraces from it.

(In reply to comment #32)
> Perhaps not, but possibly worth asking Mr Greg KH if he'd be prepared to
> consider it anyway?

Sure, feel free to ask him. :)

Anyway, any GPU resource leaks should be accompanied by 'normal' memory leaks, so valgrind should be the proper tool for the job.

Created attachment 97584 [details]
Xorg backtrace when WoW fails to exit
The problem with Xorg happened again, so I logged in via another machine and extracted a backtrace. And it appears to be spinning uselessly here:
0x00000000005732b4 in DRI2DrawableGone (p=0x1977780, id=1117838198) at dri2.c:382
382 xorg_list_for_each_entry_safe(ref, next, &pPriv->reference_list, link) {
(In reply to comment #35)
> [...] it appears to be spinning uselessly here:
>
> 0x00000000005732b4 in DRI2DrawableGone (p=0x1977780, id=1117838198) at
> dri2.c:382
> 382 xorg_list_for_each_entry_safe(ref, next, &pPriv->reference_list,
> link) {

Weird. Anyway, that doesn't seem directly related to memory leaks in r600g and should be tracked separately.

Created attachment 98185 [details]
dmesg output with 3.14.2
Drat, I had hoped that this issue had been fixed.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/491. |
Created attachment 93417 [details]
dmesg log showing memory allocation failure.

I am seeing a possible memory leak while playing WoW-64.exe with my RV790. The problem seems to happen after ~ 1 hour of play:

[ 117.896993] fuse init (API version 7.22)
[ 3326.401752] WoW-64.exe: page allocation failure: order:4, mode:0x10c0d0
[ 3326.407099] CPU: 7 PID: 31106 Comm: WoW-64.exe Not tainted 3.12.9 #1
[ 3326.412185] Hardware name: Gigabyte Technology Co., Ltd. EX58-UD3R/EX58-UD3R, BIOS FB 05/04/2009
[ 3326.419812] ffff8801afdcdbd0 ffffffff812d0071 0000000000000001 ffffffff810a0c50
[ 3326.426142] 0000000000000001 0000000000000000 ffffffff8164ff80 ffffffff8164f400
[ 3326.432429] 000000000010c0d0 ffffffff812cebb7 ffffffff8164f400 ffff880100000000

I *think* I can bisect this, although it might take some time:

9bace99d77642f8fbd46b1f0be025ad758f83f5e BAD
f5bd5568abcc234c1c2b6a4bb67b880706f3caed GOOD
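For readers unfamiliar with the "page allocation failure" line in the dmesg excerpt: the kernel reports the request size as an order, meaning 2^order physically contiguous pages. The sketch below, which is illustrative and not part of the original report, pulls the fields out of such a line and converts the order to bytes. It assumes 4 KiB pages (the usual x86-64 default), and the function and field names are invented for the example.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86-64 default

# Matches e.g. "[ 3326.401752] WoW-64.exe: page allocation failure: order:4, mode:0x10c0d0"
FAILURE_RE = re.compile(
    r"\[\s*([\d.]+)\]\s+(\S+): page allocation failure: "
    r"order:(\d+), mode:(0x[0-9a-fA-F]+)"
)


def parse_alloc_failure(line):
    """Extract timestamp, process name, order and GFP mode, or return None."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    order = int(match.group(3))
    return {
        "uptime_s": float(match.group(1)),
        "comm": match.group(2),
        "order": order,
        # An order-N request asks for 2**N contiguous pages.
        "contiguous_bytes": PAGE_SIZE << order,
        "gfp_mode": int(match.group(4), 16),
    }
```

Applied to the failure line above, this reports an order-4 request, i.e. 16 contiguous pages or 64 KiB under the 4 KiB assumption, at roughly 3326 seconds of uptime, which fits the "~ 1 hour of play" description.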