Created attachment 145472
Output of dmesg
I develop a tool that uses a separate thread to upload textures to the GPU, in parallel with the rendering thread. The two threads are synchronized using OpenGL fences, which prevent rendering from happening while a texture is being copied from a PBO.
On recent AMD hardware (tested on a Vega 56 and a Radeon VII) this setup hangs almost instantaneously. From my tests, it seems to be stuck waiting for a glWaitSync call to return. The exact same code runs flawlessly on Intel (Mesa driver) and Nvidia (proprietary driver).
I managed to more or less reproduce the issue in a simpler program, which merely creates two shared OpenGL contexts and does nothing except create fences and wait for the other thread. This example hangs with the AMDGPU driver, but once again runs fine on Intel (Mesa driver) and Nvidia (proprietary driver).
I'll attach the code to this thread, and it can be found here too: https://gitlab.com/sat-metalab/splash/blob/fix/radeon_test/tests/sandbox/radeon_mesa_shared_context_freeze.cpp.
Created attachment 145473
Created attachment 145474
Source code exhibiting the issue
Created attachment 145487
output from gdb
Using the env var "GALLIUM_THREAD=0" makes the issue worse (the example hangs at the first iteration).
One app thread is stuck at: glWaitSync(_textureUploadFence, 0, GL_TIMEOUT_IGNORED);
The other thread is stuck waiting for the first thread to release the mutex. Before waiting for the mutex it made a call to: "_textureUploadFence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);"
All of the Mesa internal threads are idle, waiting for work to do.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1430.