Bug 100024

Summary: [radeonsi] Failed to find memory space for buffer eviction when calling glTexSubImage2D with 16384 / 2
Product: Mesa
Reporter: Julien Isorce <julien.isorce>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact: Default DRI bug account <dri-devel>
Severity: normal
Priority: medium
CC: alexdeucher, ckoenig.leichtzumerken
Version: unspecified
Hardware: All
OS: Linux (All)
Attachments: new piglit test max-texture-size2 to reproduce the problem
lspci -s -v
dmesg
Xorg.log
max-texture-size2 backtrace until ENOMEM
default output of max-texture-size2 test
output of max-texture-size2 test with MESA_VERBOSE
output of max-texture-size2 test with MESA_VERBOSE and R600_DEBUG
output of max-texture-size2 test with USE_FENCE workaround
patch for mesa to translate the radeonsi printf ENOMEM to a proper GL_OUT_OF_MEMORY
output of max-texture-size2 test with attached patch for mesa
new piglit test max-texture-size2 to reproduce the problem

Description Julien Isorce 2017-03-01 17:45:25 UTC
Created attachment 130005
new piglit test max-texture-size2 to reproduce the problem

[AMD/ATI] Cape Verde PRO [FirePro W600]

The piglit test max-texture-size2 prints:

GL_TEXTURE_RECTANGLE, Internal Format = GL_RGBA8, Largest Texture Size = 16384
radeon: Not enough memory for command submission.
[drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to parse relocation -12! (ENOMEM)
PIGLIT: {"subtest": {"GL_TEXTURE_RECTANGLE-GL_RGBA8" : "pass"}}

The test reports a pass, but internally it fails. It looks like a bug in the TTM GPU memory manager subsystem in the Linux kernel.

To reproduce, just run:
RADEON_THREAD=false DISPLAY=:0 PIGLIT_SOURCE_DIR=/home/julien/dev/piglit/ PIGLIT_PLATFORM=mixed_glx_egl ./bin/max-texture-size -fbo -auto

I set RADEON_THREAD=false to simplify debugging; the problem also appears with RADEON_THREAD=true.

I attached a new, more minimal piglit test, "max-texture-size2", which reproduces the problem.
A workaround is to use a GL fence, as the new test demonstrates when the env var USE_FENCE is set.
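
For reference, the fence workaround boils down to waiting on a sync object after the upload and before deleting the texture, so the driver submits its command stream and the old storage becomes evictable. A minimal sketch, assuming a GL 3.2+ context with a loader such as epoxy (the attached test may implement it differently):

  #include <stdint.h>
  #include <epoxy/gl.h>  /* any GL 3.2+ function loader works */

  /* Submit the pending command stream and wait for the GPU, so
   * already-deleted texture storage can actually be evicted before
   * the next near-VRAM-sized allocation. */
  static void wait_for_upload(void)
  {
      GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
      glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, UINT64_MAX);
      glDeleteSync(fence);
  }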
Comment 1 Julien Isorce 2017-03-01 17:46:18 UTC
Created attachment 130006
lspci -s -v
Comment 2 Julien Isorce 2017-03-01 17:46:37 UTC
Created attachment 130008
dmesg
Comment 3 Julien Isorce 2017-03-01 17:47:04 UTC
Created attachment 130009
Xorg.log
Comment 4 Julien Isorce 2017-03-01 17:47:51 UTC
Created attachment 130011
max-texture-size2 backtrace until ENOMEM
Comment 5 Julien Isorce 2017-03-01 17:48:47 UTC
Created attachment 130012
default output of max-texture-size2 test
Comment 6 Julien Isorce 2017-03-01 17:49:25 UTC
Created attachment 130013
output of max-texture-size2 test with MESA_VERBOSE
Comment 7 Julien Isorce 2017-03-01 17:49:52 UTC
Created attachment 130014
output of max-texture-size2 test with MESA_VERBOSE and R600_DEBUG
Comment 8 Julien Isorce 2017-03-01 17:50:39 UTC
Created attachment 130015
output of max-texture-size2 test with USE_FENCE workaround
Comment 9 Julien Isorce 2017-03-01 17:52:07 UTC
Created attachment 130016
patch for mesa to translate the radeonsi printf ENOMEM to a proper GL_OUT_OF_MEMORY
Comment 10 Julien Isorce 2017-03-01 17:52:53 UTC
Created attachment 130018
output of max-texture-size2 test with attached patch for mesa
Comment 11 Michel Dänzer 2017-03-02 01:28:58 UTC
I don't think there's a kernel-side bug here — it simply runs out of graphics memory, and correctly reports that to userspace.
Comment 12 Julien Isorce 2017-03-02 09:50:38 UTC
Hi Michel. Thanks for your comment. In a way that is good news, and only my attached Mesa patch will be needed to fix the test. I will send it to the mailing list.

But I still do not fully understand why there is not enough graphics memory, since lspci reports 2048M. And there are some contradictions depending on the subSize used in the test sequence, which is:

for i in 0 1:
  glGenTextures
  glTexImage2D    maxSize
  glTexSubImage2D subSize = maxSize / 2
  glDeleteTextures
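
In actual GL calls, the sequence is roughly the following (a sketch, not the attached test verbatim; assumes a current GL context and that pixels points to a staging buffer of subSize * subSize * 4 bytes):

  GLint maxSize;
  glGetIntegerv(GL_MAX_RECTANGLE_TEXTURE_SIZE, &maxSize);
  GLsizei subSize = maxSize / 2;
  void *pixels = calloc((size_t)subSize * subSize, 4); /* RGBA8 data */

  for (int i = 0; i < 2; i++) {
      GLuint tex;
      glGenTextures(1, &tex);
      glBindTexture(GL_TEXTURE_RECTANGLE, tex);
      /* allocate the largest supported rectangle texture (16384 here,
       * i.e. a 1 GiB GL_RGBA8 allocation) */
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA8, maxSize, maxSize, 0,
                   GL_RGBA, GL_UNSIGNED_BYTE, NULL);
      /* upload a subSize x subSize region into it */
      glTexSubImage2D(GL_TEXTURE_RECTANGLE, 0, 0, 0, subSize, subSize,
                      GL_RGBA, GL_UNSIGNED_BYTE, pixels);
      glDeleteTextures(1, &tex);
  }
  free(pixels);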

A: The kernel trace ENOMEM appears with:
  subSize = maxSize - 1 -> "radeon: Failed to allocate a buffer"
  subSize = 3 * maxSize / 4 -> "radeon: Failed to allocate a buffer"
  subSize = maxSize / 2 -> "radeon: Not enough memory for command submission"

B: It works with:
  subSize = maxSize
  subSize = maxSize / 4

Aren't A and B a contradiction? To me A should succeed because glDeleteTextures should . Also note that on other

Also, by calling glFlush() within an additional loop iteration i = 2, it can recover: it fails for iteration i = 1 but succeeds for i = 2.
So what could the explanation be here? Should the driver just flush the CS automatically in that case, like it does for subSize = maxSize? Sorry, I am just trying to understand :). Thanks!
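
For completeness, the glFlush() recovery looks roughly like this (a sketch; the exact placement in the attached test may differ):

  for (int i = 0; i < 3; i++) {
      if (i == 2)
          glFlush(); /* submitting the pending CS lets iteration 2 succeed */
      /* same glTexImage2D / glTexSubImage2D / glDeleteTextures sequence
       * as above; iteration 1 still hits the ENOMEM */
  }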
Comment 13 Julien Isorce 2017-03-02 09:52:17 UTC
Oops, the part "because glDeleteTextures should . Also note that on other" of my comment should have been removed.
Comment 14 Julien Isorce 2017-03-02 09:56:19 UTC
Created attachment 130023
new piglit test max-texture-size2 to reproduce the problem

Typo fix "maxSide -> maxSize" and some cleanup.
Comment 15 GitLab Migration User 2019-09-25 17:57:31 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1257.
