Bug 111784 - Hang when using glWaitSync with multithreaded shared GL contexts
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
Reported: 2019-09-23 14:21 UTC by Emmanuel Durand
Modified: 2019-09-25 18:50 UTC

Output of dmesg (78.48 KB, text/plain)
2019-09-23 14:21 UTC, Emmanuel Durand
Xorg log (3.89 MB, text/x-log)
2019-09-23 14:22 UTC, Emmanuel Durand
Source code exhibiting the issue (5.56 KB, text/x-c++src)
2019-09-23 14:23 UTC, Emmanuel Durand
output from gdb (5.40 KB, text/plain)
2019-09-24 07:40 UTC, Pierre-Eric Pelloux-Prayer

Description Emmanuel Durand 2019-09-23 14:21:18 UTC
Created attachment 145472
Output of dmesg

I develop a tool that uses a separate thread to upload textures to the GPU, in parallel with the rendering thread. The two threads are synchronized using OpenGL fences, which prevent rendering from happening while a texture is being copied from a PBO.

On recent AMD hardware (tested on a Vega 56 and a Radeon VII) this setup hangs almost instantly. From my tests, the hang appears to occur while waiting for a glWaitSync call to return. The exact same code runs flawlessly on Intel (Mesa driver) and Nvidia (proprietary driver).

I managed to more or less reproduce the issue in a simpler program, which merely creates two shared OpenGL contexts and does nothing except create fences and wait for the other thread. This example hangs with the AMDGPU driver, but once again runs fine on Intel (Mesa driver) and Nvidia (proprietary driver).

I'll attach the code to this bug; it is also available here: https://gitlab.com/sat-metalab/splash/blob/fix/radeon_test/tests/sandbox/radeon_mesa_shared_context_freeze.cpp
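
For readers without the attachment, the fence handoff described above can be sketched roughly as follows. This is not the attached reproducer. `FenceHandle`, `SyncApi` and `UploadFence` are hypothetical names, and the GL entry points are passed in as callbacks so the logic compiles and can be exercised without a GL context. The `flush` step is an assumption based on the OpenGL specification's requirement that a fence be flushed in its own context before another context waits on it; nothing in this report establishes that a missing flush is the cause of the hang.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for GLsync, so no GL headers are needed here.
using FenceHandle = void*;

// Hypothetical bundle of callbacks. In a real program they would forward to:
//   fenceSync  -> glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0)
//   flush      -> glFlush()
//   waitSync   -> glWaitSync(fence, 0, GL_TIMEOUT_IGNORED)
//   deleteSync -> glDeleteSync(fence)
struct SyncApi {
    std::function<FenceHandle()> fenceSync;
    std::function<void()> flush;
    std::function<void(FenceHandle)> waitSync;
    std::function<void(FenceHandle)> deleteSync;
};

class UploadFence {
public:
    explicit UploadFence(SyncApi api) : api_(std::move(api)) {
        assert(api_.fenceSync && api_.flush && api_.waitSync && api_.deleteSync);
    }

    // Upload thread: call once the PBO-to-texture copy has been issued.
    void signalUploadIssued() {
        FenceHandle fence = api_.fenceSync();
        api_.flush();  // assumption: push the fence to the GPU before publishing it
        FenceHandle stale = nullptr;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stale = pending_;
            pending_ = fence;
        }
        // Drop a fence the render thread never consumed, outside the lock.
        if (stale) api_.deleteSync(stale);
    }

    // Render thread: call before drawing with the texture.
    // Returns true if a fence was pending and has been waited on.
    bool waitForUpload() {
        FenceHandle fence = nullptr;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            fence = pending_;
            pending_ = nullptr;
        }
        if (!fence) return false;
        // The wait happens after the mutex is released, so the upload
        // thread is never blocked on the mutex by a pending wait.
        api_.waitSync(fence);
        api_.deleteSync(fence);
        return true;
    }

private:
    SyncApi api_;
    std::mutex mutex_;
    FenceHandle pending_ = nullptr;
};
```

The mutex only guards the shared fence handle; every GL call is made outside the critical section, so a wait that blocks inside the driver cannot also hold up the other thread on the lock.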
Comment 1 Emmanuel Durand 2019-09-23 14:22:05 UTC
Created attachment 145473
Xorg log
Comment 2 Emmanuel Durand 2019-09-23 14:23:24 UTC
Created attachment 145474
Source code exhibiting the issue
Comment 3 Pierre-Eric Pelloux-Prayer 2019-09-24 07:40:07 UTC
Created attachment 145487
output from gdb

Setting the environment variable "GALLIUM_THREAD=0" makes the issue worse (the example hangs at the first iteration).

One app thread is stuck at: glWaitSync(_textureUploadFence, 0, GL_TIMEOUT_IGNORED);

The other thread is stuck waiting for the first thread to release the mutex. Before waiting for the mutex, it called: "_textureUploadFence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);"

All the Mesa internal threads are idle, waiting for work.
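
When chasing a hang like the one described above, one possible way to confirm that a given fence never signals is to temporarily replace the indefinite wait with a bounded CPU-side wait. The sketch below is only an illustration and was not used in this report. `WaitResult`, `ClientWait` and `boundedWait` are hypothetical names; the callback stands in for a call to `glClientWaitSync(fence, 0, timeoutNs)` with its return value mapped to the enum, so the polling logic can be compiled without a GL context.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Mirrors the four results glClientWaitSync can return
// (GL_ALREADY_SIGNALED, GL_CONDITION_SATISFIED, GL_TIMEOUT_EXPIRED, GL_WAIT_FAILED).
enum class WaitResult { AlreadySignaled, ConditionSatisfied, TimeoutExpired, WaitFailed };

// Hypothetical callback: a real one would call
// glClientWaitSync(fence, 0, timeoutNs) and translate the GLenum result.
using ClientWait = std::function<WaitResult(std::uint64_t timeoutNs)>;

// Polls the fence in slices of sliceNs nanoseconds, at most maxSlices times.
// Returns the number of slices used if the fence signaled, or -1 if the wait
// failed or the fence was still unsignaled after the last slice.
int boundedWait(const ClientWait& clientWait, std::uint64_t sliceNs, int maxSlices) {
    assert(maxSlices >= 0);
    for (int i = 1; i <= maxSlices; ++i) {
        switch (clientWait(sliceNs)) {
            case WaitResult::AlreadySignaled:
            case WaitResult::ConditionSatisfied:
                return i;
            case WaitResult::WaitFailed:
                return -1;
            case WaitResult::TimeoutExpired:
                break;  // try another slice
        }
    }
    return -1;
}
```

A return value of -1 after a generous budget would suggest that the fence is never signaled, rather than that the wait is merely slow; the caller can then log the fence and the thread involved instead of hanging.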
Comment 4 GitLab Migration User 2019-09-25 18:50:52 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug on our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1430.