On latest Mesa git (17.3-dev), WarThunder freezes with vsync activated. The main problem: with vsync deactivated, the system consumes significantly more power (+90 W in my case). Switching back to Mesa 17.2-rc5 or disabling vsync (vblank_mode=0) are the workarounds that make it work at the moment.

Here are my system specs (glxinfo | grep OpenGL):
OpenGL vendor string: X.Org
OpenGL renderer string: AMD Radeon (TM) R9 380 Series (TONGA / DRM 3.19.0 / 4.13.0-rc5+, LLVM 6.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.3.0-devel (git-46a8c4ef81)
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.3.0-devel (git-46a8c4ef81)
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 17.3.0-devel (git-46a8c4ef81)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
OpenGL ES profile extensions:
Can you bisect which Mesa Git commit introduced the issue?
Yes, I did it via 'git bisect'. Here is the first related commit:

d5ba75f8881f0869dc16f71f7395514c0a35b6e2 is the first bad commit
commit d5ba75f8881f0869dc16f71f7395514c0a35b6e2
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date: Tue Jun 20 19:24:34 2017 +0200

    st/dri2 Plumb the flush_swapbuffer functionality through to dri3

    Implement the state tracker manager drawable interface flush_swapbuffer method by plumbing it through to dri3 if available.

    Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Sinclair Yeh <syeh@vmware.com>

:040000 040000 8df730d2ac95b42435c96043da0eb6fba5f6861c 4179b3bb9a075169627eb00de5780bbbe8abea02 M src

I hope it makes sense and can help you.
Thomas, any ideas? (In reply to haro41 from comment #2) > Yes, i did it via 'git bisect'. Thanks. Any chance you can get a backtrace[0] of the hanging process? [0] Ideally of all threads, something like "thread apply all bt full" in gdb.
Created attachment 133719 [details] gdb all thread backtrace
This looks odd. That commit actually only adds a wait for all swaps to be scheduled at glFinish(), so it shouldn't really be causing any grief unless the server somehow forgets to send the right events or the dri3 wait_for_sbc is broken...
BTW: ... setting environment variable LIBGL_DRI3_DISABLE (to switch back to DRI2) fixes the freeze too ...
Created attachment 133771 [details] [review] Patch to see if there might be a race causing this @haro41: Could you test the attached dri3_mutex.diff and see if there is a change in behaviour?
@Thomas, i got two rejects when trying to apply the patch. Let me sync to your base version first, to avoid additional diffs, where/when did you branch exactly?
Created attachment 133776 [details] [review] Replacement patch to see if there is a race causing this.
(In reply to haro41 from comment #8) > @Thomas, > > i got two rejects when trying to apply the patch. > > Let me sync to your base version first, to avoid additional diffs, > where/when did you branch exactly? My mistake. Added a new patch based on 0cc4c7e3.
I applied your patch successfully; the freezes are still there, maybe a bit later on average now.

The behavior changed a bit:

Before patch:
vblank_mode=2 (default) -> always freezes within 0..2 minutes of runtime, framerate fixed/clamped at 50 (as expected)
vblank_mode=0 -> no freezes at all, dynamic, high framerates
LIBGL_DRI3_DISABLE=1 -> no freezes at all, framerate fixed at 50

After patch:
vblank_mode=2 (default) -> always freezes within 0..2 minutes of runtime, framerate fixed/clamped at 100 (!!)
vblank_mode=0 -> no freezes at all, dynamic, high framerates
LIBGL_DRI3_DISABLE=1 -> no freezes at all, framerate fixed at 50

To be honest, I am not familiar enough with DRM internals to understand what exactly happens here, but it looks like something is broken with respect to DRI 3 usage.

Somehow I think I could be the only one with these freezes, and to make sure I am not wasting your time: can you give me a hint where I should look first to rule out something specific to my system/setup?
(In reply to haro41 from comment #11)
> I applied your patch successfully; the freezes are still there, maybe a bit
> later on average now.
>
> The behavior changed a bit:
>
> Before patch:
>
> vblank_mode=2 (default) -> always freezes within 0..2 minutes of runtime,
> framerate fixed/clamped at 50 (as expected)
> vblank_mode=0 -> no freezes at all, dynamic, high framerates
> LIBGL_DRI3_DISABLE=1 -> no freezes at all, framerate fixed at 50
>
> After patch:
>
> vblank_mode=2 (default) -> always freezes within 0..2 minutes of runtime,
> framerate fixed/clamped at 100 (!!)
> vblank_mode=0 -> no freezes at all, dynamic, high framerates
> LIBGL_DRI3_DISABLE=1 -> no freezes at all, framerate fixed at 50
>
> To be honest, I am not familiar enough with DRM internals to understand what
> exactly happens here, but it looks like something is broken with respect to
> DRI 3 usage.
>
> Somehow I think I could be the only one with these freezes, and to make sure
> I am not wasting your time: can you give me a hint where I should look first
> to rule out something specific to my system/setup?

That's really weird :). Actually I don't think anything's wrong with your setup, but rather that there's a multithreading bug in dri3 or the app. There's no concurrency protection at all in the dri3 client and I'm not sure that's correct. I think you may be the only one seeing this because you're the first to try it with a heavily multithreaded application.

Anyway, I'm OK with commenting out the glFinish() wait for swapbuffers until someone has the possibility to debug this thoroughly. Unfortunately WarThunder doesn't run on vmware's svga driver (yet) due to bugs...

It would also be good to try to rule out server side radeon dri3 problems. Perhaps by running it on nouveau or intel...
Ok, that makes sense for me, thank you :)
(In reply to Thomas Hellström from comment #12) > It would also be good to try to rule out server side radeon dri3 problems. > Perhaps by running it on nouveau or intel... Or simply the modesetting Xorg driver. A server-side issue could be in the xserver Present code used by all drivers though.
... i found this related and interesting blog: https://keithp.com/blogs/DRM-lease-4/ Seems there is something WIP in respect to DRM synchronisation and this very bug.
DRM leases have nothing to do with this issue. Have you got a chance to test if this also happens with the Xorg modesetting driver?
@Michel, I just did, but WarThunder's freeze behavior didn't really change.

xorg.conf:
Section "Device"
    Identifier "AMD"
    Driver "modesetting"
EndSection

DRI 3 is used by default too (X.Org X Server 1.19.3).

BTW: I have also tested with my older Pitcairn (HD7870), trying both the amdgpu and radeon kernel drivers. The behavior is the same as with Tonga in both cases.
FWIW, I got it running under dri3/vsync with the svga driver with no apparent issue. It also runs fine with modesetting/svga although there is no true vsync since the kernel module flips pages instantly.
What happens if you run in windowed mode + vsync?
@Thomas, i get freezes in windowed mode with activated vsync too (tried with latest git).
... it looks like the reason for the freeze is concurrent waiting in xcb_wait_for_special_event(..). While the main thread is waiting for Present-related events, another thread consumes these events (because it was the first one to enter the wait), and the main thread waits forever (freeze). I will attach the debug log for some frames before the freeze. @Thomas, if my frame rate is lower (FPS < monitor sync, because of too much debug output), I don't get any freezes. Could this be the reason why you can't reproduce the freezes with the svga stack?
Created attachment 134297 [details] debug log: concurrent waiting in xcb_wait_for_special_event()

These commands were used for logging (all in 'src/loader/loader_dri3_helper.c'):

printf("%4x =>dri3_handle_present_event: XCB_PRESENT_COMPLETE_NOTIFY: serial:%u \n", (uint16_t)pthread_self(), ce->serial);
printf("%4x =>dri3_handle_present_event: XCB_PRESENT_EVENT_IDLE_NOTIFY: pixmap:%u \n", (uint16_t)pthread_self(), ie->pixmap);
printf("%4x =>xcb_wait_for_special_event in dri3_wait_for_event: send_sbc:%lu recv_sbc:%lu\n", (uint16_t)pthread_self(), draw->send_sbc, draw->recv_sbc);
printf("%4x =>xcb_wait_for_special_event in dri3_find_back: send_sbc:%lu recv_sbc:%lu\n", (uint16_t)pthread_self(), draw->send_sbc, draw->recv_sbc);
printf("%4x =>loader_dri3_swapbuffer_barrier: send_sbc:%lu recv_sbc:%lu\n", (uint16_t)pthread_self(), draw->send_sbc, draw->recv_sbc);

'9240' is obviously the main thread.
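For anyone who wants to reproduce this kind of log, here is a minimal sketch of a formatting helper for the swap-counter lines. It assumes the counters are 64-bit values and therefore uses the PRIu64 macro instead of %lu; the function name format_sbc_line and its parameters are hypothetical and not part of Mesa.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of Mesa): formats one log line in the same
 * layout as the printf calls above. thread_tag is a short per-thread
 * identifier chosen by the caller, where names the call site.
 * Returns the number of characters snprintf would have written.
 */
static int format_sbc_line(char *buf, size_t len, uint16_t thread_tag,
                           const char *where,
                           uint64_t send_sbc, uint64_t recv_sbc)
{
    assert(buf != NULL && len > 0);
    /* PRIu64 keeps the format correct where uint64_t is not unsigned long. */
    return snprintf(buf, len,
                    "%4" PRIx16 " =>%s: send_sbc:%" PRIu64 " recv_sbc:%" PRIu64,
                    thread_tag, where, send_sbc, recv_sbc);
}
```

A call such as format_sbc_line(buf, sizeof buf, 0x9240, "dri3_wait_for_event", 101, 100) fills buf with a line that can then be passed to puts() or written to a log file.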
Created attachment 134344 [details] [review] Patch to protect the loader_dri3_drawable struct So here is a patch that doesn't fully make dri3 drawables thread-safe, but it should at least make sure threads don't steal events from each other. Please try, Thomas
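The idea of keeping threads from stealing events from each other can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the attached patch: the event source below stands in for the blocking X connection, and all names (event_source, fake_drawable, has_event_waiter, run_two_waiters) are hypothetical. At most one thread blocks on the event source at a time; every other thread sleeps on a condition variable and rechecks the shared counter once the active waiter has processed an event.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated blocking event source; stands in for the X connection. */
struct event_source {
    pthread_mutex_t mtx;
    pthread_cond_t cnd;
    unsigned queued;
};

/* Hypothetical drawable state shared by all threads (not Mesa's struct). */
struct fake_drawable {
    pthread_mutex_t mtx;
    pthread_cond_t event_cnd;
    bool has_event_waiter;   /* true while one thread blocks on the source */
    uint64_t recv_sbc;       /* number of completed swaps seen so far */
    struct event_source *src;
};

static void source_post(struct event_source *s)
{
    pthread_mutex_lock(&s->mtx);
    s->queued++;
    pthread_cond_signal(&s->cnd);
    pthread_mutex_unlock(&s->mtx);
}

static void source_wait(struct event_source *s)
{
    pthread_mutex_lock(&s->mtx);
    while (s->queued == 0)
        pthread_cond_wait(&s->cnd, &s->mtx);
    s->queued--;
    pthread_mutex_unlock(&s->mtx);
}

/* Must be called with d->mtx held; the caller retests its condition. */
static void wait_for_event_locked(struct fake_drawable *d)
{
    if (d->has_event_waiter) {
        /* Someone else is already blocking on the source; wait for them. */
        pthread_cond_wait(&d->event_cnd, &d->mtx);
        return;
    }
    d->has_event_waiter = true;
    pthread_mutex_unlock(&d->mtx);   /* let other threads use the drawable */
    source_wait(d->src);
    pthread_mutex_lock(&d->mtx);
    d->recv_sbc++;                   /* "handle" the completion event */
    d->has_event_waiter = false;
    pthread_cond_broadcast(&d->event_cnd);
}

struct waiter_arg { struct fake_drawable *d; uint64_t target; };

static void *waiter_thread(void *p)
{
    struct waiter_arg *a = p;
    pthread_mutex_lock(&a->d->mtx);
    while (a->d->recv_sbc < a->target)
        wait_for_event_locked(a->d);
    pthread_mutex_unlock(&a->d->mtx);
    return NULL;
}

/* Two threads wait for the same number of swaps; returns the final count. */
static uint64_t run_two_waiters(unsigned swaps)
{
    struct event_source src = { .queued = 0 };
    struct fake_drawable d = { .has_event_waiter = false, .recv_sbc = 0, .src = &src };
    struct waiter_arg arg = { &d, swaps };
    pthread_t a, b;
    int rc;

    pthread_mutex_init(&src.mtx, NULL);
    pthread_cond_init(&src.cnd, NULL);
    pthread_mutex_init(&d.mtx, NULL);
    pthread_cond_init(&d.event_cnd, NULL);

    rc = pthread_create(&a, NULL, waiter_thread, &arg);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, waiter_thread, &arg);
    assert(rc == 0);
    for (unsigned i = 0; i < swaps; i++)
        source_post(&src);           /* the "server" completes each swap */
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return d.recv_sbc;
}
```

Calling run_two_waiters(5) from a main() lets both threads return once five events have been processed, no matter which of them actually consumed each event. Without the has_event_waiter flag and the broadcast, the thread that did not receive the last event would stay blocked on the event source, which matches the freeze described above.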
I tested your patch (~20 minutes): no freezes at all, good work! I will continue later; meanwhile I'm trying to understand what all those different xx_swap_buffers() functions/callbacks mean :) Thanks, Jens
Comment on attachment 134344 [details] [review] Patch to protect the loader_dri3_drawable struct OK, thanks, that's good to know. Note the patch isn't complete yet. Just enough to verify what the problem was.
Created attachment 134383 [details] protection in action, longer debug log adapted debug log (longer test), showing current protection at work ... No freezes and no other visible issues currently.
@Thomas, any chance to finally fix this for the soon-to-be-released Mesa 17.3?
Hi! We can probably pave over this specific problem for the release, but making dri3 fully thread-safe is a much larger task, which I will not have time for before the release. BTW are you running with mesa glthread? In that case, could you test with master mesa and export mesa_glthread=false /Thomas
I tried both mesa_glthread=false and mesa_glthread=true; it doesn't make a difference with respect to this issue. I think other applications/games could be affected by this problem too, so maybe temporarily reverting the changes in dri2_flush_swapbuffers() would make sense? (This is currently my approach to avoid the freezes.)
Thanks for testing. But if I understand you correctly the "patch to protect the loader_dri3_drawable struct" fixes the issue on your side, right? If so, I'd rather push a somewhat polished version of that patch...
Yes, your last patch worked flawlessly here. If you can provide a polished version, just let me know; I am ready to test it.
Slightly polished patch available here... https://lists.freedesktop.org/archives/mesa-dev/2017-November/175373.html
No freezes, works great for me.
(In reply to haro41 from comment #33) > No freezes, works great for me. Want to add a Tested-by: tag? /Thomas
(In reply to Thomas Hellström from comment #34) > (In reply to haro41 from comment #33) > > No freezes, works great for me. > > Want to add a Tested-by: tag? > > /Thomas ... if it helps, but where and how to add this tag?
(In reply to haro41 from comment #35) > (In reply to Thomas Hellström from comment #34) > > (In reply to haro41 from comment #33) > > > No freezes, works great for me. > > > > Want to add a Tested-by: tag? > > > > /Thomas > > ... if it helps, but where and how to add this tag? It's added by me to the commit message before pushing, to indicate that you've tested the patch. A tested by tag typically looks like Tested-by: Firstname Lastname <haro41@gmx.de> So if you want me to do that I'll need your first and last name. /Thomas
OK, thanks for the clarification. I prefer not to add such a tag, because this is my anonymous email address, dedicated to things like games. /Jens
Fix has now been pushed to mesa master.
Thank you, problem fully solved for me.