Bug 73127

Summary: [r600g] Possible memory leak when playing WoW with CAICOS
Product: Mesa    Reporter: Chris Rankin <rankincj>
Component: Drivers/Gallium/r600    Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments: Message log showing page allocation failure
             Memory being leaked with RV790

Description Chris Rankin 2013-12-29 18:40:40 UTC
Created attachment 91298 [details]
Message log showing page allocation failure

I finally managed to build some libdrm-2.4.50 RPMs for Fedora 19 so that I could compile a recent -git snapshot. The HEAD commit was:

commit e2d53fac1c5b18f5c9e95d39d4e2be4703b0b363
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   2013-12-24 20:22:31 (GMT)

    r600g: fix SUMO2 pci id

    0x9649 is sumo2, not sumo.

This resulted in an eventual crash of WoW and Minecraft, apparently due to memory exhaustion. (The message log is attached.)

Reverting to my previous build seemed to fix things:

commit a3969aa125c8f61b093a5f5f69e8265a131051d0
Author: Marek Olšák <marek.olsak@amd.com>
Date:   2013-11-20 00:47:36 (GMT)

    mesa: initialize gl_renderbuffer::Depth in core

Comment 1 Chris Rankin 2013-12-29 18:47:29 UTC
Is there a way to use Gallium's HUD to determine when memory is definitely *not* being leaked, please? Otherwise, I fear any attempt at bisecting this is doomed to fail...
Comment 2 Alex Deucher 2014-01-02 17:45:12 UTC
Can you bisect?
Comment 3 Chris Rankin 2014-01-02 17:58:00 UTC
(In reply to comment #2)
> Can you bisect?

See comment 1.

I now also suspect that Minecraft, not Warcraft, is the root cause here, because I have since played WoW for a while without reproducing this. So perhaps Minecraft leaked enough memory that WoW couldn't run.
Comment 4 Marek Olšák 2014-01-02 18:18:25 UTC
(In reply to comment #1)
> Is there a way to use Gallium's HUD to determine when memory is definitely
> *not* being leaked, please? Otherwise, I fear any attempt at bisecting this
> is doomed to fail...

Run: GALLIUM_HUD=help glxgears

You should see 2 queries: requested-VRAM and requested-GTT.
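
As a concrete illustration (assuming the usual comma-separated query syntax described by the help output, and using glxgears only as a stand-in for the game's own launch command):

    GALLIUM_HUD=requested-VRAM,requested-GTT glxgears

If either graph keeps climbing while the rendered scene stays the same, the application is requesting GPU memory that is never being released.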
Comment 5 Chris Rankin 2014-01-02 18:28:20 UTC
(In reply to comment #4)
> You should see 2 queries: requested-VRAM and requested-GTT.

When you say "requested-VRAM", does that measure the total VRAM currently allocated by the particular application? Or is it by all applications? Can you determine anything useful from the OOM message?
Comment 6 Marek Olšák 2014-01-02 19:34:05 UTC
There is a failure in kmalloc, meaning that there may be a memory leak in the kernel.

The Gallium HUD queries report how many bytes have been allocated by the app (OpenGL, etc.); they don't reflect the current state of the kernel memory manager.
Comment 7 Chris Rankin 2014-01-02 23:35:03 UTC
(In reply to comment #6)
> There is a failure in kmalloc, meaning that there may be a memory leak in
> the kernel.

I had already gathered that much from the line saying "page allocation failure", but there are a lot of other lines, such as these, which I am finding harder to parse:

Dec 26 12:59:32 landingpod kernel: [ 3977.929263] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15888kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929276] Node 0 DMA32: 2346*4kB (UEM) 2231*8kB (UEM) 2324*16kB (UEM) 1206*32kB (UEM) 24*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 104544kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929287] Node 0 Normal: 984*4kB (UEMR) 1938*8kB (UEMR) 914*16kB (UEMR) 8*32kB (MR) 3*64kB (R) 0*128kB 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 36048kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929300] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 26 12:59:32 landingpod kernel: [ 3977.929301] 481313 total pagecache pages

I presume they are providing *someone* with valuable diagnostic information, or else why are they there?
Comment 8 Chris Rankin 2014-02-03 21:31:22 UTC
Created attachment 93322 [details]
Memory being leaked with RV790

WoW is definitely leaking memory with current Mesa-git. I have attached the dmesg log that results after playing WoW for < 1 hour.

Mesa head is:

commit 9bace99d77642f8fbd46b1f0be025ad758f83f5e
Author: Zack Rusin <zackr@vmware.com>
Date:   Tue Jan 28 16:34:18 2014 -0500

    gallivm: fix opcode and function nesting

but the problem began before this.
Comment 9 Chris Rankin 2014-02-04 21:05:25 UTC
The memory leak I'm seeing with my RV790 doesn't seem to occur with this git revision:

commit f5bd5568abcc234c1c2b6a4bb67b880706f3caed
Author: Mark Mueller <MarkKMueller@gmail.com>
Date:   Tue Jan 21 22:37:20 2014 -0800

    mesa: Fix Type A _INT formats to MESA_FORMAT naming standard

I am therefore assuming that this is a different bug from #73127.
Comment 10 Michel Dänzer 2014-02-12 10:16:38 UTC
Does valgrind --leak-check=full give any hints where the leak is?
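
For reference, a minimal invocation against a simple GL client would look like the line below; running the actual game the same way (e.g. under Wine, if that is how WoW is launched) is the real test, and will be very slow:

    valgrind --leak-check=full glxgears

Note that valgrind only tracks userspace allocations, so a leak on the kernel side of the kmalloc failure mentioned above would not show up in its report.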
Comment 11 GitLab Migration User 2019-09-18 19:12:28 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/479.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.