Summary: | Rendering errors when running dolphin-emu with Vulkan backend, radv (Super Smash Bros. Melee) | ||
---|---|---|---|
Product: | Mesa | Reporter: | Ben Clapp <benclapp55> |
Component: | Drivers/Vulkan/radeon | Assignee: | mesa-dev |
Status: | RESOLVED FIXED | QA Contact: | mesa-dev |
Severity: | normal | ||
Priority: | medium | CC: | freebugs, jan.public, sa |
Version: | 17.3 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
text file containing output from glxinfo and vulkaninfo
Dump of optimized shaders in scene with incorrect rendering of vertex color.
Dump of unoptimized shaders in scene with incorrect rendering of vertex color.
vulkaninfo output when using mesa 18.0. |
Description
Ben Clapp
2017-11-22 22:55:41 UTC
I've done some testing with mesa 17.3.0 on my computer with the RX 580 (using the mesa 17.3.0-1 package available in debian unstable). All of the previously mentioned bugs are still present on 17.3.0, so I'm updating the version number for this ticket to 17.3 as well.

It's worth noting that since originally reporting this bug, I've switched from GNOME3 to using Cinnamon for my desktop environment. While the GNOME3 bug I mentioned in the aside clearly went away after this change (the screen stopped freezing for 1-2s every time I opened a right-click menu, which simply should not be happening on a TR 1950X CPU), the issue with unusual frame drops to around 30FPS that are not reflected by a drop in the FPS indicator in dolphin persists nonetheless. It seems like this is an issue related to frame presentation, but where exactly the issue lies is unclear. That the issue occurs on both GNOME3 with Wayland and Cinnamon with X suggests the issue may be unrelated to the display server and/or compositor.

I don't observe this problem when using other (3D/non-3D) applications (in a game engine I wrote myself in OpenGL, I can get a smooth 120FPS, nor do I see this frame presentation issue when using citra-emu or other applications), so it seems this frame presentation issue is unrelated to radv, and thus it might be worth opening a separate ticket to further investigate. The fact that the issue seems to occur suggests an issue with dolphin itself rather than a mesa/graphics driver issue, however the issue isn't present when using NVIDIA closed-source drivers, so it's hard to determine where the problem lies just looking at the symptoms.

> The fact that the issue seems to occur
**only when running dolphin**
Bug(s) still present as of 17.3.6. (The issues related to frame drops/frame presentation no longer seem to be a problem at this point, but the crop setting still results in a black screen, and the incorrect colors, etc. persist.)

Here's a trace of the intro screen made with vktrace, to make this bug easier to reproduce:
https://www.dropbox.com/s/930kl7agbg3jl6o/dolphin.vktrace.xz?dl=0 (2.7M compressed, 163M uncompressed)
I hope this is useful given vktrace caveats about replaying on different setups. It replayed and rendered correctly on my Intel Ivy Bridge.
Trace was made with vktrace from git 78f1a8149a3a6c9e48b9bd5cff6debc5726d819e
For reference, running dolphin as an argument to vktrace didn't work for me, but using it in client/server mode and tracing with VK_INSTANCE_LAYERS=VK_LAYER_LUNARG_vktrace worked nicely.
I will retest later on a more up to date version of Mesa/radv.
HTH,

Bugs still present in 18.0.0.
In addition, there's now a new bug when using the radeonsi driver, where the game will sometimes freeze altogether during normal play (not a complete GPU hang, you can kill the dolphin-emu process and move on). This did not occur in 17.3.7. (Since this is a radeonsi-only issue and not radv, it might be worth opening a separate ticket for.)
There also seems to be a very obvious stuttering issue introduced after the update to 18.0.0 that doesn't seem present in other 3D applications. This stuttering is a bit different from the "frame presentation issue" I described before in that the audio stutters at the same time as the frame drops, and the FPS actually does drop according to dolphin's FPS indicator. Given this stuttering was not present with the exact same system/dolphin build/etc. before updating to mesa 18.0.0, I can only guess there's something weird going on with the driver here that is causing the stuttering issue. The stuttering occurs whether you use dolphin's GL or Vulkan backends.
This gets off the topic of the originally reported bug (and perhaps is worth opening a separate ticket over), but more recently I've found I can consistently get GPU hangs when playing certain games other than Melee in dolphin-emu, so it just seems that dolphin is in general using a range of advanced GL/Vulkan features and mesa is tripping over a number of edge-cases that aren't used by most Linux applications.

Absolutely file new bugs for each issue. Much easier to close duplicate bugs than tracking more than one problem in a report (should it turn out to be the same problem in the end).
The freeze issue sounds like a good candidate for git bisect. I might give it a try if I can reproduce and have the time.

Today I spent a number of hours looking at the background rendering errors in RenderDoc.

The vertex shader outputs some vertices that have two vertex colors, colors_0 and colors_1. (Only colors_0 is relevant here)

The fragment shader does a bunch of fiddling around with colors_0, a lot of unnecessary conversions and re-assignments that effectively do nothing, and ultimately the colors_0 value is passed to rastemp and tevin_d.
Some more fiddling around and, in the case of the areas of the screen where there are rendering errors, the value of colors_0/rastemp/tevin_d ("tevin" means "TEV input", referring to the Gamecube/Wii's Texture EnVironment hardware) becomes the color value written to the framebuffer.

The problem is not in the vertex shader, nor is it in the fragment shader. For some reason, the value of colors_0 coming out of the vertex shader is correct, but the value of colors_0 in the fragment shader is inverted! So blue will appear yellow, black will appear white, etc...

This seems to be a driver bug after all, and so I did try to spend some time looking into radv's code to try and see if I could figure out a fix.
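
To make the reported symptom concrete, here is a minimal sketch of what "inverted" would look like numerically. It assumes that "inverted" means the per-channel complement of an 8-bit RGB value, which matches the two examples given (blue to yellow, black to white) but is not confirmed anywhere in this thread. The helper name `invert_rgb` is hypothetical and is not part of dolphin or radv.

```python
def invert_rgb(color):
    """Return the per-channel complement of an 8-bit (r, g, b) color.

    Assumption: "inverted" in the report means 255 - channel for each
    channel. This only illustrates the symptom, not the driver bug itself.
    """
    return tuple(255 - channel for channel in color)


if __name__ == "__main__":
    # The two examples mentioned in the report.
    samples = {"blue": (0, 0, 255), "black": (0, 0, 0)}
    for name, rgb in samples.items():
        print(name, rgb, "->", invert_rgb(rgb))
```

Comparing a pixel value captured in RenderDoc against its complement in this way is one quick check of whether a wrong color is a plain inversion or some other corruption.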
The issue might lie in radv_pipeline.c, I would think it probably has something to do with the inter-stage varying colors_0 not getting filled or interpreted correctly.

I've done lots of OpenGL and Vulkan programming, but I have little experience with the driver side of things, so while it might be interesting to talk a bit with the radv devs and learn a thing or two, I'm not sure how much further I can go in terms of looking into this on my own without assistance.

Sven: I'll work on making some separate issues on another occasion for the radeonsi freeze for Melee and the system freezes/GPU hangs for other games.

(In reply to Ben Clapp from comment #8)
> Today I spent a number of hours looking at the background rendering errors
> in RenderDoc.
>
> The vertex shader outputs some vertices that have two vertex colors,
> colors_0 and colors_1. (Only colors_0 is relevant here)
>
> The fragment shader does a bunch of fiddling around with colors_0, a lot of
> unnecessary conversions and re-assignments that effectively do nothing, and
> ultimately the colors_0 value is passed to rastemp and tevin_d.
> Some more fiddling around and, in the case of the areas of the screen where
> there are rendering errors, the value of colors_0/rastemp/tevin_d ("tevin"
> means "TEV input", referring to the Gamecube/Wii's Texture EnVironment
> hardware) becomes the color value written to the framebuffer.
>
> The problem is not in the vertex shader, nor is it in the fragment shader.
> For some reason, the value of colors_0 coming out of the vertex shader is
> correct, but the value of colors_0 in the fragment shader is inverted! So
> blue will appear yellow, black will appear white, etc...
>
> This seems to be a driver bug after all, and so I did try to spend some time
> looking into radv's code to try and see if I could figure out a fix.
> The issue might lie in radv_pipeline.c, I would think it probably has
> something to do with the inter-stage varying colors_0 not getting filled or
> interpreted correctly.
>
> I've done lots of OpenGL and Vulkan programming, but I have little
> experience with the driver side of things, so while it might be interesting
> to talk a bit with the radv devs and learn a thing or two, I'm not sure how
> much further I can go in terms of looking into this on my own without
> assistance.
>

Do you think you could get a dump of the NIR and LLVM IR for the shaders in question and attach it here? You can use the following environment var to dump the shaders:
RADV_DEBUG=shaders
You also might be able to catch the attention of some devs if you jump on the freenode #radeon IRC channel.

Regarding the freeze when using the OpenGL backend with Mesa 18.0, it seems a different user has already reported that bug:
https://bugs.dolphin-emu.org/issues/10904
https://bugs.freedesktop.org/show_bug.cgi?id=105339

Apologies for the late response Timothy.

>Do you think you could get a dump of the NIR and LLVM IR for the shaders in question and attach it here? You can use the following environment var to dump the shaders: RADV_DEBUG=shaders
I'm struggling to properly dump the shaders because RADV now has an on-disk shader cache, and RADV_DEBUG=shaders seems to only print out shaders when they are actually compiled for the first time.
How can I clear and/or disable the shader cache?

>You also might be able to catch the attention of some devs if you jump on the freenode #radeon IRC channel.
I'm already lurking in there, but perhaps I'll actually say something over there sometime soon.

(In reply to Ben Clapp from comment #10)
> Regarding the freeze when using the OpenGL backend with Mesa 18.0, it seems
> a different user has already reported that bug:
> https://bugs.dolphin-emu.org/issues/10904
> https://bugs.freedesktop.org/show_bug.cgi?id=105339
>
> Apologies for the late response Timothy.
>
> >Do you think you could get a dump of the NIR and LLVM IR for the shaders in question and attach it here? You can use the following environment var to dump the shaders: RADV_DEBUG=shaders
> I'm struggling to properly dump the shaders because RADV now has an on-disk
> shader cache, and RADV_DEBUG=shaders seems to only print out shaders when
> they are actually compiled for the first time.
> How can I clear and/or disable the shader cache?

RADV_DEBUG=nocache or MESA_GLSL_CACHE_DISABLE=1 should do it. Also you can dump the unoptimised LLVM IR with RADV_DEBUG=preoptir which can be useful sometimes.

Created attachment 138582 [details]
Dump of optimized shaders in scene with incorrect rendering of vertex color.
Created attachment 138583 [details]
Dump of unoptimized shaders in scene with incorrect rendering of vertex color.
OK, here are your shader dumps attached to the ticket, both optimized and unoptimized.
There may be some unrelated shaders included in the dump due to the way dolphin/shader dumping works, but not sure there's much I can do about that.
Let me know if you need anything else.
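
For anyone trying to reproduce dumps like the ones attached here, the settings mentioned earlier in the thread (RADV_DEBUG=shaders, nocache and preoptir) can be combined into one comma-separated RADV_DEBUG value. The following is only a rough sketch: the helper names `radv_dump_env` and `dump_shaders`, the `./dolphin-emu` path and the log file name are assumptions for illustration, and the available flags may differ between Mesa versions.

```python
import os
import subprocess


def radv_dump_env(base_env=None, flags=("nocache", "shaders", "preoptir")):
    """Return a copy of the environment with RADV_DEBUG set.

    The flags are the ones named in this thread; "nocache" is there so the
    on-disk shader cache does not suppress the dump.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["RADV_DEBUG"] = ",".join(flags)
    return env


def dump_shaders(command, log_path):
    """Run `command` with shader dumping enabled, capturing all output."""
    with open(log_path, "w") as log:
        # Capture stdout and stderr together so nothing the driver prints is lost.
        result = subprocess.run(
            command, env=radv_dump_env(), stdout=log, stderr=subprocess.STDOUT
        )
    return result.returncode


if __name__ == "__main__":
    # Hypothetical binary path and log name; adjust to the local setup.
    dump_shaders(["./dolphin-emu"], "radv_shaders.log")
```

Dropping "preoptir" from `flags` would give only the regular shader dump rather than the additional unoptimised LLVM IR.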
Can you attach your vulkaninfo too?

Created attachment 138586 [details]
vulkaninfo output when using mesa 18.0.
I already had attached my vulkaninfo, but that was back when I was using 17.2.x, so here's an updated version.
Thanks, are you still using the same dolphin? If not, can you report the version number, please?

(In reply to Samuel Pitoiset from comment #16)
> Thanks, are you still using the same dolphin? If not, can you report the
> version number, please?
Currently using commit dea30e08b for dolphin (was latest commit in master branch about two days ago).

Bug still present on 18.0.2.
I noticed flickering back and forth between a black screen and the game screen when resizing the window, so I made a short video demonstrating this:
https://www.youtube.com/watch?v=W2yuR0-z-EU

Hello all, I have some insight and fixes for some of the issues described in this ticket:

First, regarding the "black screen when cropping is turned on" issue, this can be worked around with the following pull request:
https://github.com/dolphin-emu/dolphin/pull/6786
In theory, there shouldn't be anything wrong with negative Y in the viewport, and you can still see black screen flickering when adjusting the window size, but with this change to dolphin's code made, the screen will never remain black after a resize (only flicker for a moment). So this issue is probably still worth investigating on the mesa side at some point.

Regarding the strange stuttering issues I was experiencing, this is a CPU-side issue that has nothing to do with mesa. The TR 1950X is essentially two Ryzen chips glued together. The TR 1950X has two memory controllers, and each memory controller is owned by one of the two Ryzen chips. So, for example, I have two 16GB RAM cards plugged into the two memory controllers on my system, and when running "numactl -H", I can see that 16GB of RAM are assigned to each of the two NUMA nodes. It seems that the memory allocator (or maybe the scheduler?) in Linux wasn't properly allocating memory (or maybe processes) to just one of the two physical chips/just one of the RAM cards, and this resulted in stuttering (perhaps due to needing to transfer some memory from one RAM card to the other for use by another process on the other Ryzen chip?)
The stuttering can be prevented by using numactl like this:
numactl --cpunodebind=0 --membind=0 ./dolphin-emu

Does https://patchwork.freedesktop.org/patch/222558/ fix the background rendering for you?

Bas,
Your patch does fix the issue with incorrect colors :) Thank you very much for your hard work.
The black screen issue will still be present on versions of dolphin before the workaround was applied, and even with the workaround, black-screen flickering can be seen when resizing the window. Given this, I would recommend closing this bug ticket and, if it seems worth exploring on the driver side at some point, opening a separate ticket with importance of "low" or "lowest" for the minor issue of black-screen flickering when resizing dolphin's window.

I'll go ahead and mark this issue as RESOLVED/FIXED. Bas, I'll leave it to you if you want to open a separate ticket for black-screen flickering when resizing the window.
Again, thank you very much for the bugfix! |