Summary: | [apitrace] Graphical artifacts in Civilization VI on RX Vega | ||
---|---|---|---|
Product: | Mesa | Reporter: | Zach Tibbitts <zachtib> |
Component: | Drivers/Gallium/radeonsi | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | normal | ||
Priority: | medium | CC: | gediminas, jason, matombo, michael.mansell, stevenvandenbrandenstift, t_arceri, zachtib |
Version: | 17.3 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | |||
Bug Blocks: | 77449 | ||
Attachments: | image showing the bad triangles; Hack around issue; Renderdoc capture | ||
Description
Zach Tibbitts
2018-01-12 15:58:09 UTC
I am seeing the same with my vega64 on KDE Neon with the oibaf ppa.

Packages are marked as 17.4~git1801200730.436ed6~oibaf~x
Kernel: 4.15.0-041500rc5-generic

You can easily reproduce this by running the graphics benchmark.

I tried to capture an API trace but the game slowed down so much that the benchmark did not get far enough along to show it :( (it is entirely possible that I am holding it wrong too)

Created attachment 136866 [details]
image showing the bad triangles
These triangles are not static, they flicker around the screen a fair bit
So I was able to get an API trace of the problem! (I am rather happy to now be holding apitrace correctly!)

It does weigh in at 1.5GiB compressed, but it does show all the badly rendered triangles and how zooming in stops them.

It is available here: https://jasonplayne.com/share/Civ6.trace.bz2

I hope this is helpful.

*** Bug 105353 has been marked as a duplicate of this bug. ***

I have the same issue. I captured a video showing the flickering triangles, which I had posted in the duplicate bug 105353: https://bugs.freedesktop.org/attachment.cgi?id=137806

Can confirm that this problem still persists on kernel 4.16.1 and Mesa 18.

This issue also persists with the latest update to Civ VI, including the Rise and Fall expansion.

Just want to comment that this issue is still occurring on Mesa 18.2, Arch Linux, kernel 4.18.6.

I'm not sure why yet, but both the black triangles and the incorrect rendering behind the Chinese emperor on the loading screen go away when I run the trace on the NIR backend.

Until we figure out what is going on here you can try running the game with the following environment variable:

R600_DEBUG=nir

(In reply to Timothy Arceri from comment #9)
> I'm not sure why yet but both the black triangles and the incorrect
> rendering behind the chinese emperor on the loading screen go away when I
> run the trace on the NIR backend.
>
> Until we figure out what is going on here you can try running the game with
> the following environment variable:
>
> R600_DEBUG=nir

Can confirm! The initial red+black triangles in the game seem to have disappeared using the nir backend.

(I may be a little excited!)

(In reply to Jason Playne from comment #10)
> (In reply to Timothy Arceri from comment #9)
> > I'm not sure why yet but both the black triangles and the incorrect
> > rendering behind the chinese emperor on the loading screen go away when I
> > run the trace on the NIR backend.
> >
> > Until we figure out what is going on here you can try running the game with
> > the following environment variable:
> >
> > R600_DEBUG=nir
>
> Can confirm! The initial red+black triangles in the game seem to have
> disappeared using the nir backend.
>
> (I may be a little excited!)

After a couple of hours of game time, I have not seen the triangles. nir solves the problem.

Thanks Timothy! This fixes the issue for me as well. It's great to finally have a work-around!

As this is fixed, and published in mesa 18.2.2, I'm closing it.

(In reply to Juan A. Suarez from comment #13)
> As this is fixed, and published in mesa 18.2.2, I'm closing it.

This isn't fixed yet :) Another bug was found using the trace from this bug; as per the commit message, that fix does not fix the primary issue of black triangles from this bug.

Created attachment 142313 [details] [review]
Hack around issue

I've found the source of the problem. It seems that the tgsi indirect indexing optimisation is causing issues on Vega for some reason. I've attached a hack which disables it, resulting in correct rendering.

Created attachment 142314 [details]
Renderdoc capture
Also attaching a renderdoc capture of the issue.
After talking this over with Marek, here is a summary of the problem.

LLVM's VGPR indexing code on gfx9+ is broken for immediate arrays. Usually this is not a problem, as GLSL IR in mesa will lower these to Uniforms via lower_const_arrays_to_uniforms(). However, this does not work for the shaders in Civ6 because these arrays are not actually defined as constant arrays. For example, the original shader looks like this:

    vec4 x0[3];
    vec4 x1[6];
    vec4 x2[6];
    vec4 x3[6];
    x0[0].xy = vec2(0.031250, 0.500000);
    x0[1].xy = vec2(0.968750, 0.031250);
    x0[2].xy = vec2(0.968750, 0.968750);
    x1[0].xy = vec2(1.000000, 1.000000);
    x1[1].xy = vec2(0.000000, 1.000000);
    x1[2].xy = vec2(-1.000000, 0.000000);
    x1[3].xy = vec2(-1.000000, -1.000000);
    x1[4].xy = vec2(0.000000, -1.000000);
    x1[5].xy = vec2(1.000000, 0.000000);
    x2[0].xy = vec2(1.000000, -1.000000);
    x2[1].xy = vec2(2.000000, 1.000000);
    x2[2].xy = vec2(1.000000, 2.000000);
    x2[3].xy = vec2(-1.000000, 1.000000);
    x2[4].xy = vec2(-2.000000, -1.000000);
    x2[5].xy = vec2(-1.000000, -2.000000);

Without SSA I don't see any way for GLSL IR to easily recognise this as a constant array. Unfortunately, by the time LLVM is done it is recognised and is exposed to the buggy indexing support.

Video of problem happening: https://www.youtube.com/watch?v=E4oy8tqaYs0

My set up: Ubuntu 18.10, AMD Ryzen 2700X, RX Vega 56

This is the workaround that worked for me:
* Beyond adding "R600_DEBUG=nir" to game launcher options (in steam client), also start "Steam Launcher" in a terminal with "R600_DEBUG=nir" environment variable set. And it will work.

When bugfix is published let me know to test it.

I also confirm success on Ryzen 2200g system. Steam launch property `R600_DEBUG=nir %command%` fixed texture flickering/artifacts in CIV6 (that appeared on strategic map).
olly@ryzen-pc1:~$ uname -a
Linux ryzen-pc1 5.0.0-050000-generic #201903032031 SMP Mon Mar 4 01:33:18 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
olly@ryzen-pc1:~$ glxinfo | grep "OpenGL version"
OpenGL version string: 4.5 (Compatibility Profile) Mesa 18.3.3
olly@ryzen-pc1:~$ lspci -nnk | grep -i VGA -A2
38:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev c8)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Vega [Radeon Vega 8 Mobile] [1462:7a39]
    Kernel driver in use: amdgpu
olly@ryzen-pc1:~$ lsb_release -a
No LSB modules are available.
Description: Ubuntu 18.04.2 LTS

I wanted to make sure that improving the NIR path to reach parity with TGSI in local variable handling wouldn't break things, so I investigated this a bit more. It seems this is triggered by the fact that on Vega the TGSI path always uses scratch, even for smaller local arrays. This bloats the scratch space used by the VS in question. There are three back-to-back draw calls with this VS (used to build up the map), each using scratch, and it seems that radeonsi doesn't properly wait for each call to be done before starting the next and reuses the same scratch buffer, resulting in the threads from one draw call overwriting the scratch of the previous call. Hacking si_update_spi_tmpring_size() to always allocate a new scratch buffer "fixes" the black triangles.

(In reply to Connor Abbott from comment #20)
> I wanted to make sure that improving the NIR path to reach parity with TGSI
> in local variable handling wouldn't break things, so I investigated this a
> bit more. It seems this is triggered by the fact that on Vega the TGSI path
> always uses scratch, even for smaller local arrays. This bloats the scratch
> space used by the VS in question. There are three back-to-back draw calls
> with this VS (used to build up the map), each using scratch, and it seems
> that radeonsi doesn't properly wait for each call to be done before starting
> the next and reuses the same scratch buffer, resulting in the threads from
> one draw call overwriting the scratch of the previous call. Hacking
> si_update_spi_tmpring_size() to always allocate a new scratch buffer "fixes"
> the black triangles.

Thanks heaps for looking into the issue, Connor. Looking at the explanation of what was happening makes it sound simple - I am sure the debugging effort was far greater! <3

Connor, the hardware manages the scratch buffer alloc/dealloc. You don't have to allocate more than one. The problem with Civ VI is that VGPR indexing has never been properly implemented for gfx9 in LLVM.

Connor, there is indeed an issue with how we set SPI_TMPRING_SIZE and same for compute.

(In reply to Marek Olšák from comment #23)
> Connor, there is indeed an issue with how we set SPI_TMPRING_SIZE and same
> for compute.

I wonder if this is the issue reported in bug #108194

This might help:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1714/diffs?commit_id=b991b7dd54a899d0df89c809c936401baa341d9d

A similar issue occurs when running Civ6 on Virgl. Is there any way to disable TGSI indirect indexing for testing purposes?
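
For anyone wanting to try the R600_DEBUG=nir workaround reported in the comments above, here is a minimal sketch of setting the variable for a single process only. The helper name run_with_nir is hypothetical (it is not part of Mesa or Steam), and the `sh -c` child merely stands in for the game binary; only the variable name and value come from this thread.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a Mesa or Steam
# tool): run one command with the workaround variable from this thread set
# for that process only.
run_with_nir() {
    # env(1) passes the variable to the child without touching the
    # environment of the calling shell.
    env R600_DEBUG=nir "$@"
}

# Stand-in for the game: print what the child process sees.
run_with_nir sh -c 'echo "child sees R600_DEBUG=$R600_DEBUG"'

# The calling shell is unaffected by the helper.
echo "parent sees R600_DEBUG=${R600_DEBUG:-<unset>}"
```

Inside Steam, commenters here used the launch property `R600_DEBUG=nir %command%` instead. One commenter also had to start Steam itself from a terminal with the variable set, so results may vary by setup.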