Summary: | memory leak in `amdgpu_bo_create()` | ||
---|---|---|---|
Product: | DRI | Reporter: | Paul Menzel <pmenzel+bugs.freedesktop.org> |
Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | czbd, pmenzel+bugs.freedesktop.org |
Version: | XOrg git | ||
Hardware: | Other | ||
OS: | All | ||
See Also: | https://bugzilla.kernel.org/show_bug.cgi?id=202537, https://bugs.freedesktop.org/show_bug.cgi?id=107899 | ||
*** Bug 109390 has been marked as a duplicate of this bug. ***
Created attachment 143157 [details]
Linux kernel messages (dmesg)
Does this also happen with 4.20.y? If not, can you bisect?
Created attachment 145255 [details]
Galactic Civilizations III memleak log without DXVK
As far as I understand the logs I have collected, this memory leak still occurs with Linux 5.2.11-arch1-1-ARCH and Mesa 1.9.15.
In my case, it is most prevalent when a Direct3D game is launched through Wine together with DXVK, the translation layer that converts D3D calls to Vulkan. Just reaching a game's main menu can eat up large amounts of memory, which is never freed, not even after the game is closed, until the caches are dropped manually.
However, the leak also seems to occur to a much smaller extent with DXVK turned off. I attach a bcc memleak log that shows the issue with Galactic Civilizations III v3.9: because much less memory is leaked without DXVK, it is easier to trace the exact call that permanently leaked memory. If I'm not mistaken, that is the one that leaked 68550656 bytes in this log.
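To make it easier to pick out the largest outstanding stack in a long memleak log, something like the following could be used. It is only a minimal sketch: it assumes the log uses bcc memleak's usual `N bytes in M allocations from stack` header lines followed by the stack frames, and the helper names `largest_outstanding` and `to_mib` are made up for illustration.

```python
import re

# Assumed header format of one bcc memleak stack entry.
STACK_HEADER = re.compile(r"^\s*(\d+) bytes in (\d+) allocations from stack")


def largest_outstanding(log_text):
    """Return (bytes, allocations, frames) of the largest stack, or None."""
    best = None
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        match = STACK_HEADER.match(line)
        if match:
            # A new stack entry starts; frames collected so far belong to the old one.
            current = (int(match.group(1)), int(match.group(2)), [])
            if best is None or current[0] > best[0]:
                best = current
        elif stripped.startswith("[") or not stripped:
            # Timestamp header ("[12:00:00] Top N stacks ...") or blank line.
            current = None
        elif current is not None:
            current[2].append(stripped)
    return best


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / (1024 * 1024)
```

For the figure quoted above, `to_mib(68550656)` gives 65.375, i.e. roughly 65 MiB.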
Created attachment 145256 [details]
Galactic Civilizations III memleak log with DXVK
For comparison, I also attach a similar log that I made with the DXVK translation layer enabled, which caused the game to leak much larger amounts of memory, to the point of making it unplayable.
While Galactic Civilizations III is the only game that I've confirmed to permanently leak memory through this call without DXVK, virtually all D3D games I've tried to translate to Vulkan so far leak like this; unfortunately, I don't currently have any native Vulkan title at hand to test. In the logs I only launch the game until it reaches the main menu, so the leak is, well, pretty serious in my case... :)
The Vulkan driver reports itself as AMD RADV KAVERI (LLVM 8.0.1) 1.9.15.
Created attachment 145257 [details]
Galactic Civilizations III memleak log with DXVK
Apologies, it looks like I forgot to update the methodology in several places of the DXVK memleak log - this one should be much more accurate.
The updated methodology has, to my understanding, shown something I did not expect: the memory allocated by amdgpu_bo_create() apparently does not accumulate linearly; instead, it seems to be replaced the second time the game is launched. Because of that, there is a chance that more than the 65 megabytes were actually unavailable after the test without DXVK, perhaps the sum of the allocations of all the amdgpu_bo_create() calls.
Created attachment 145288 [details]
DRM/Radeon glxgears memleak log
It took a while to run more tests, and it turns out that running glxgears with amdgpu also leaks memory: launching a hundred glxgears instances leaks about 400 megabytes, which is only freed after the instances are killed and the caches are dropped manually with `echo 3 > /proc/sys/vm/drop_caches`.
Because glxgears does not need Vulkan support, I was also able to confirm that the massive persisting leak is definitely caused by the amdgpu driver - attached is a bcc memleak log of glxgears taken with the radeon driver.
On a side note, launching vkcube seems to leak memory with the described call at a very similar rate as well.
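For anyone wanting to put a rough number on this without bcc, one option is to compare two snapshots of `/proc/meminfo`, for example one taken before launching the instances and one taken after killing them. The sketch below is an assumption about how that could be done, not the method used for the attached logs; the function names and the choice of the `MemAvailable` field are illustrative.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def unreclaimed_kb(before, after, key="MemAvailable"):
    """kB available in the first snapshot but not in the second (negative if memory was gained)."""
    return before[key] - after[key]


def per_instance_kb(total_kb, instances):
    """Average amount attributed to each launched instance."""
    return total_kb / instances
```

The snapshots would come from something like `parse_meminfo(open("/proc/meminfo").read())`. With the figures above, about 400 megabytes over a hundred instances works out to roughly 4 megabytes per glxgears instance.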
Created attachment 145289 [details]
DRM/AMDgpu glxgears memleak log
For comparison, I attach the bcc memleak log of glxgears taken with amdgpu.
As far as I'm able to test after bcc memleak decided to randomly stop working on my machine, the leak is gone with Linux 5.3.6-arch1-1-ARCH and Mesa 19.2.1. Great thanks to whoever managed to resolve this crippling issue!
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/679.
Created attachment 143156 [details]
Output of kmemleak
With Linux 5.0-rc2+ the memory leaks below are reported by kmemleak.
```
unreferenced object 0xffff9f83850c5000 (size 2048):
  comm "gnome-shell", pid 569, jiffies 4294682217 (age 9133.583s)
  hex dump (first 32 bytes):
    02 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00  ................
    02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000001aec1dd8>] amdgpu_bo_create+0x40/0x220 [amdgpu]
    [<000000007da39c30>] amdgpu_gem_object_create+0x9e/0x120 [amdgpu]
    [<00000000099484e9>] amdgpu_gem_create_ioctl+0x1d3/0x290 [amdgpu]
    [<000000009d8251d3>] drm_ioctl_kernel+0xa9/0xf0
    [<0000000050b61811>] drm_ioctl+0x201/0x3a0
    [<000000007c88aae3>] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
    [<0000000083291747>] do_vfs_ioctl+0xa4/0x630
    [<00000000722b6176>] ksys_ioctl+0x60/0x90
    [<000000001bfa30dc>] __x64_sys_ioctl+0x16/0x20
    [<000000007862c966>] do_syscall_64+0x55/0x170
    [<00000000a8eeee88>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<000000001242345f>] 0xffffffffffffffff
unreferenced object 0xffff9f837cdd36c0 (size 64):
  comm "gnome-shell", pid 569, jiffies 4294682217 (age 9133.583s)
  hex dump (first 32 bytes):
    d4 23 b4 d5 ab e2 45 94 d0 53 43 86 83 9f ff ff  .#....E..SC.....
    01 00 00 00 04 00 00 00 60 ab cc 5c 83 9f ff ff  ........`..\....
  backtrace:
    [<000000000768e015>] ttm_bo_mem_space+0x41/0x4a0
    [<00000000f11076b2>] ttm_bo_validate+0xc7/0x130
    [<00000000c820992e>] ttm_bo_init_reserved+0x32f/0x390
    [<00000000fcfd5ce2>] amdgpu_bo_do_create+0x1ed/0x420 [amdgpu]
    [<000000001aec1dd8>] amdgpu_bo_create+0x40/0x220 [amdgpu]
    [<000000007da39c30>] amdgpu_gem_object_create+0x9e/0x120 [amdgpu]
    [<00000000099484e9>] amdgpu_gem_create_ioctl+0x1d3/0x290 [amdgpu]
    [<000000009d8251d3>] drm_ioctl_kernel+0xa9/0xf0
    [<0000000050b61811>] drm_ioctl+0x201/0x3a0
    [<000000007c88aae3>] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
    [<0000000083291747>] do_vfs_ioctl+0xa4/0x630
    [<00000000722b6176>] ksys_ioctl+0x60/0x90
    [<000000001bfa30dc>] __x64_sys_ioctl+0x16/0x20
    [<000000007862c966>] do_syscall_64+0x55/0x170
    [<00000000a8eeee88>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<000000001242345f>] 0xffffffffffffffff
```
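When a kmemleak dump contains many such objects, it can help to total the leaked bytes per function in the backtrace. The following is a rough sketch that assumes the report is in kmemleak's usual one-entry-per-line text layout, with `unreferenced object ... (size N):` headers followed by backtrace lines; `leaked_bytes_by_function` is a hypothetical helper name, not part of any kernel tooling.

```python
import re

# Header line that starts each kmemleak entry, e.g.
# "unreferenced object 0xffff9f83850c5000 (size 2048):"
OBJECT_HEADER = re.compile(r"^unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")


def leaked_bytes_by_function(report, function):
    """Sum the sizes of unreferenced objects whose backtrace mentions `function`."""
    # Match "] name+0x..." so that e.g. amdgpu_bo_create does not
    # also match amdgpu_bo_do_create.
    frame = re.compile(r"\]\s+" + re.escape(function) + r"\+0x")
    total = 0
    size = None
    counted = False
    for line in report.splitlines():
        header = OBJECT_HEADER.match(line.strip())
        if header:
            size = int(header.group(2))
            counted = False
        elif size is not None and not counted and frame.search(line):
            # Count each object at most once per function.
            total += size
            counted = True
    return total
```

Fed the two objects from this report, `leaked_bytes_by_function(report, "amdgpu_bo_create")` would return 2112 (2048 + 64), while `"ttm_bo_mem_space"` would return 64, since only the second object has that frame.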