Created attachment 91298: Message log showing page allocation failure

I finally managed to build some libdrm-2.4.50 RPMs for Fedora 19 so that I could compile a recent -git snapshot. The HEAD commit was:

author Alex Deucher <alexander.deucher@amd.com> 2013-12-24 20:22:31 (GMT)
committer Alex Deucher <alexander.deucher@amd.com> 2013-12-24 20:22:31 (GMT)
commit e2d53fac1c5b18f5c9e95d39d4e2be4703b0b363
tree 51caa6afa79db9a733198f6bf5c6224c023fea2f
parent 35a34143026785e015adb906756651807de89bde

r600g: fix SUMO2 pci id

0x9649 is sumo2, not sumo.

Running this build resulted in an eventual crash of WoW and Minecraft, apparently due to memory exhaustion (the messages log is attached). Reverting to my previous build seemed to fix things:

author Marek Olšák <marek.olsak@amd.com> 2013-11-20 00:47:36 (GMT)
committer Marek Olšák <marek.olsak@amd.com> 2013-11-23 00:54:57 (GMT)
commit a3969aa125c8f61b093a5f5f69e8265a131051d0
tree aaa0b9350231b29d9dd51c49c5c86e1fe78a6096
parent 46cf80fb366cb14827724a7fea004e81400cc602

mesa: initialize gl_renderbuffer::Depth in core
Is there a way to use Gallium's HUD to determine when memory is definitely *not* being leaked, please? Otherwise, I fear any attempt at bisecting this is doomed to fail...
Can you bisect?
(In reply to comment #2)
> Can you bisect?

See comment 1. I also now suspect that Minecraft, not Warcraft, is the root cause here, because I have since played WoW for a while without reproducing this. So maybe Minecraft leaked enough memory that WoW couldn't run.
(In reply to comment #1)
> Is there a way to use Gallium's HUD to determine when memory is definitely
> *not* being leaked, please? Otherwise, I fear any attempt at bisecting this
> is doomed to fail...

Run:

GALLIUM_HUD=help glxgears

You should see 2 queries: requested-VRAM and requested-GTT.
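Once the help output confirms that the driver exposes these two queries, they can be selected by name in the same environment variable. The following is only a minimal sketch of launching a program with both graphs enabled; the helper names `hud_env` and `run_with_hud` are hypothetical, and it assumes that comma-separated query names are accepted, as in the usual GALLIUM_HUD syntax.

```python
import os
import subprocess


def hud_env(queries, base_env=None):
    """Return a copy of the environment with GALLIUM_HUD set to the given queries.

    Query names are joined with commas; the names themselves must match what
    "GALLIUM_HUD=help" prints for the driver in use.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GALLIUM_HUD"] = ",".join(queries)
    return env


def run_with_hud(command, queries):
    """Run a command (a list of arguments) with the HUD queries enabled."""
    return subprocess.run(command, env=hud_env(queries))


# Example (not executed here):
# run_with_hud(["glxgears"], ["requested-VRAM", "requested-GTT"])
```

A graph that keeps climbing during steady gameplay would hint at a userspace leak, while a flat graph would at least suggest that the application's own requests are not growing.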
(In reply to comment #4)
> You should see 2 queries: requested-VRAM and requested-GTT.

When you say "requested-VRAM", does that measure the total VRAM currently allocated by the particular application? Or is it by all applications?

Can you determine anything useful from the OOM message?
There is a failure in kmalloc, meaning that there may be a memory leak in the kernel. The Gallium HUD queries say how many bytes were allocated by the app (OpenGL, etc.). They don't reflect the current state of the kernel memory manager.
(In reply to comment #6)
> There is a failure in kmalloc, meaning that there may be a memory leak in
> the kernel.

I had gathered that much already from the line saying "page allocation failure", but there are a lot of other lines such as these, which I am finding harder to parse:

Dec 26 12:59:32 landingpod kernel: [ 3977.929263] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15888kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929276] Node 0 DMA32: 2346*4kB (UEM) 2231*8kB (UEM) 2324*16kB (UEM) 1206*32kB (UEM) 24*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 104544kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929287] Node 0 Normal: 984*4kB (UEMR) 1938*8kB (UEMR) 914*16kB (UEMR) 8*32kB (MR) 3*64kB (R) 0*128kB 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 36048kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929300] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929301] 481313 total pagecache pages

I presume they are providing *someone* with valuable diagnostic information, or else why are they there?
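For what it is worth, the "Node 0 <zone>:" lines appear to be the kernel's per-zone list of free memory blocks: each "count*sizekB" term is the number of free contiguous blocks of that size, and the figure after "=" is their sum. The letters in parentheses look like migrate-type flags (for example U for unmovable and M for movable), but treat that reading as an assumption. The Python sketch below parses such a line and checks the arithmetic; the function names are hypothetical and the regular expressions are fitted only to the lines quoted above.

```python
import re

# One "count*sizekB" term, e.g. "3*4096kB".
TERM = re.compile(r"(\d+)\*(\d+)kB")
# Zone header and trailing total, e.g. "Node 0 DMA: ... = 15888kB".
ZONE = re.compile(r"Node (\d+) (\w+): (.*) = (\d+)kB")


def parse_zone_line(line):
    """Parse one free-list line; return None for lines in another format."""
    match = ZONE.search(line)
    if match is None:
        return None
    node, zone, body, total = match.groups()
    # Map block size in kB to the number of free blocks of that size.
    blocks = {int(size): int(count) for count, size in TERM.findall(body)}
    return {"node": int(node), "zone": zone,
            "blocks": blocks, "reported_kb": int(total)}


def free_kb(zone):
    """Total free memory in the zone, recomputed from the individual terms."""
    return sum(size * count for size, count in zone["blocks"].items())


def largest_free_block_kb(zone):
    """Size of the largest block that has at least one free instance."""
    sizes = [size for size, count in zone["blocks"].items() if count > 0]
    return max(sizes) if sizes else 0


if __name__ == "__main__":
    line = ("Node 0 DMA32: 2346*4kB (UEM) 2231*8kB (UEM) 2324*16kB (UEM) "
            "1206*32kB (UEM) 24*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB "
            "0*2048kB 0*4096kB = 104544kB")
    zone = parse_zone_line(line)
    print(zone["zone"], free_kb(zone), largest_free_block_kb(zone))
```

Read this way, the DMA32 zone still has about 100 MB free, but nothing larger than a 64 kB contiguous block. If the failed allocation in the attached log asked for a larger contiguous block, fragmentation rather than total exhaustion could be part of the picture; the lines quoted here do not confirm that on their own.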
Created attachment 93322: Memory being leaked with RV790

WoW is definitely leaking memory with current Mesa-git. I have attached the dmesg log that results after playing WoW for < 1 hour. Mesa head is:

commit 9bace99d77642f8fbd46b1f0be025ad758f83f5e
Author: Zack Rusin <zackr@vmware.com>
Date: Tue Jan 28 16:34:18 2014 -0500

gallivm: fix opcode and function nesting

but the problem began before this.
The memory leak I'm seeing with my RV790 doesn't seem to occur with this git revision:

commit f5bd5568abcc234c1c2b6a4bb67b880706f3caed
Author: Mark Mueller <MarkKMueller@gmail.com>
Date: Tue Jan 21 22:37:20 2014 -0800

mesa: Fix Type A _INT formats to MESA_FORMAT naming standard

I am therefore assuming that this is a different bug to #73127.
Does valgrind --leak-check=full give any hints where the leak is?
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/479.