Created attachment 144153 [details]
the bug itself on dota2 with vulkan enabled
I am experiencing a RADV bug on both Arch and Gentoo Linux with LLVM 7 and newer (I can't test with LLVM 6).
I can't reproduce it on Ubuntu 18.10 on the same system with the same hardware; strangely, it works flawlessly there. I didn't test with 19.10, but I can give it a go if needed. I nuked Gentoo and installed Arch to see whether the bug was Gentoo-specific, and it wasn't. I also reinstalled Gentoo twice while testing (with different USE flags) and wasn't able to stop it from happening.
Things I tried, without success:
- switching LLVM versions
- switching to Arch
- changing compiler flags
- toggling the debug flag on Mesa
- downgrading glibc a bit
- forcing the game to use Wayland directly through SDL (in the case of Artifact)
The games I can reproduce it in are mainly Source 2 ones, but I can also reproduce it in Skyrim (the DX11 version via DXVK on Proton).
I'm setting this to minor severity, as nobody else I asked who has the same GPU hardware can reproduce the issue.
RenderDoc capture (quite old at this point; I might make a new one tomorrow)
What LLVM/Mesa versions are you using?
Can you attach the output of glxinfo (or vulkaninfo) please?
Please ignore my previous comment, you posted it already. :-)
Created attachment 144172 [details]
Attached screenshot with:
OpenGL renderer string: AMD Radeon (TM) RX 480 Graphics (POLARIS10, DRM 3.27.0, 4.20.0-rc3-58450-g9698024e8a19, LLVM 8.0.1)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.1.0-devel (git-b98955e128)
What's the problem actually? It looks good to me.
(Also tried with mesa-git/llvm-git, same output)
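For comparing setups across reporters, the version details can be pulled out of the glxinfo text automatically. This is only a minimal sketch: the regular expressions and the `parse_glxinfo` helper are my own assumptions, modelled on the two lines quoted above, and other drivers or Mesa versions may format these strings differently.

```python
import re

# Patterns modelled on the two glxinfo lines quoted in this report.
# They are assumptions, not an official format description.
RENDERER_RE = re.compile(
    r"OpenGL renderer string:\s*(?P<gpu>.+?)\s*"
    r"\((?P<chip>[A-Z0-9]+), DRM (?P<drm>[\d.]+), "
    r"(?P<kernel>[^,]+), LLVM (?P<llvm>[\d.]+)\)"
)
VERSION_RE = re.compile(
    r"OpenGL core profile version string:\s*(?P<gl>[\d.]+) "
    r"\(Core Profile\) Mesa (?P<mesa>\S+)"
)


def parse_glxinfo(text):
    """Return a dict with whichever of the known fields could be found."""
    info = {}
    for pattern in (RENDERER_RE, VERSION_RE):
        match = pattern.search(text)
        if match:
            info.update(match.groupdict())
    return info
```

Feeding it the output of `glxinfo` (for example read from a saved text file) would give a dict with keys such as `chip`, `llvm` and `mesa`, which is easier to diff between a working and a broken machine than the raw output.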
Created attachment 144180 [details]
Screenshot taken with scrot
The strange thing for me is that the RenderDoc capture looks 100% fine on Ubuntu, and on a friend's computer that has similar hardware and runs Arch, but on my PC, on both Arch and Gentoo, both the RenderDoc capture AND the game itself produce that blocky, 100% black corruption.
Here's a screenshot I've just taken of how it looks on my side. I'm now on the latest Mesa commit and the latest LLVM commit.
I'm trying to make another RenderDoc capture, but I can't compile RenderDoc right now because of a random DNS issue.
This is indeed weird.
I experienced a visually similar issue in The Witcher 3 after updating my Gentoo system with an RX 570. Notable changes:
- installed Mesa 19.1 (from 19.0.x, not sure which exact version)
- updated Wine to 4.10 (from 4.6, IIRC)
- migrated the Gentoo profile from 17.0 to 17.1 (including all recommended rebuilds)
Switching back to mesa 19.0.6 does not fix the issue, but setting the RADV_DEBUG environment variable to "nohiz" does. Maybe that helps to identify the issue?
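In case it helps others test the workaround, here is a rough sketch of launching a program with the flag set. `RADV_DEBUG` takes a comma-separated list of flags, so the sketch appends `nohiz` rather than overwriting whatever is already there. The helper names `radv_env` and `run_with_nohiz` are hypothetical, and the command passed in is whatever game or test binary you want to start.

```python
import os
import subprocess


def radv_env(flag="nohiz", base=None):
    """Return a copy of the environment with `flag` added to RADV_DEBUG."""
    env = dict(os.environ if base is None else base)
    # RADV_DEBUG is a comma-separated list; keep any flags already set.
    flags = [f for f in env.get("RADV_DEBUG", "").split(",") if f]
    if flag not in flags:
        flags.append(flag)
    env["RADV_DEBUG"] = ",".join(flags)
    return env


def run_with_nohiz(cmd):
    """Run `cmd` (a list of arguments) with RADV_DEBUG containing nohiz."""
    return subprocess.run(cmd, env=radv_env("nohiz"), check=False)


# Example (not executed here): run_with_nohiz(["vkcube"])
```

Running the same program once through `run_with_nohiz` and once normally should make it easy to confirm whether the corruption only appears without the flag.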
Can you record a renderdoc capture of the problem please?
Created attachment 144898 [details]
Some notable parts of a renderdoc capture
I tried to get a renderdoc capture, but the results are way too large for my internet connection to upload. I thus took a capture with and without nohiz, and compared the two for obvious differences. I still have the captures, so if there is anything specific I should look at I can do that.
The things that stood out for me the most (refer to the attached zip for the images I refer to):
1. DS=Store / DS=Load inconsistencies
In the first pass yielding visually different results, there are a few instances of the following sequence:
> vkCmdEndRenderPass(DS=Store) | a
> ... | b
> vkCmdBeginRenderPass(DS=Load) | c
> vkCmdDrawIndexed | d
In the good render (nohiz set), the DS keeps its content throughout a-c, and is updated in d. In the bad render (nohiz not set), the DS gets weird fragments in b, which usually go away in d – but in at least one case they did *not* go away, causing the resulting texture to become visually corrupted (many pixels become white, see 0_depth_attachment_after_load_*).
2. The final depth buffer is obviously wrong
The blocks visible on the final output are visible as white (near) in the final depth pass. This seems to block the skybox/background from being added later on (see 1_final_depth_*).
3. It seems that the good and bad render use slightly different render paths
While I could map most parts of the captures onto each other, there were some additional render passes here and there that did not have clear equivalents in the other capture – and are not obviously related to the different camera position, etc.
4. The captures seem to be very hardware-dependent
For some reason, the nohiz capture only works if I set nohiz, otherwise renderdoc claims that I use different hardware and I thus cannot view the capture. Not sure if that is intentional, or if that is some clue that something is wrong.
The final render results are 2_final_result_*. Note that even without nohiz, some frames render properly – suggesting that the issue could be some kind of data race, maybe between DS=Store/DS=Load?
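Since I can't upload the captures, a script like the following could be used to locate every instance of the store/load/draw sequence from point 1 in a list of event names. It is a sketch under assumptions: the event names are taken to be plain strings copied or exported from the capture's event list, and `find_store_load_draw` is a hypothetical helper, not part of RenderDoc.

```python
def find_store_load_draw(events):
    """Find (store, load, draw) index triples in a list of event names.

    A triple is a render pass event with DS=Store, a later render pass
    event with DS=Load, and the first draw call after that load.
    """
    matches = []
    store_idx = None
    load_idx = None
    for i, name in enumerate(events):
        if "DS=Store" in name:
            # A new store restarts the search for the following load.
            store_idx, load_idx = i, None
        elif "DS=Load" in name and store_idx is not None:
            load_idx = i
        elif name.startswith("vkCmdDraw") and load_idx is not None:
            matches.append((store_idx, load_idx, i))
            store_idx = load_idx = None
    return matches
```

The returned indices point at the events where the depth/stencil contents are worth inspecting in both the nohiz and the non-nohiz capture.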
Hope these comments help at least a little, sorry that I'm unable to provide the whole captures.