Bug 105464 - Reading per-patch outputs in Tessellation Control Shader returns undefined values
Summary: Reading per-patch outputs in Tessellation Control Shader returns undefined va...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/radeon (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: mesa-dev
QA Contact: mesa-dev
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-12 19:14 UTC by Philip Rebohle
Modified: 2018-05-25 11:28 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Patch for tessellation demo to reproduce the issue (1.29 KB, patch)
2018-03-12 19:14 UTC, Philip Rebohle
Details | Splinter Review
Witcher 3 hull shader which may suffer from the same issue (22.10 KB, application/octet-stream)
2018-03-12 19:38 UTC, Philip Rebohle
Details
Tessellation demo screenshot (352.68 KB, image/png)
2018-03-12 20:50 UTC, Philip Rebohle
Details
Tessellation demo screenshot with LLVM-svn (610.33 KB, image/png)
2018-03-14 09:58 UTC, Philip Rebohle
Details
radv testing patch (5.69 KB, patch)
2018-05-24 12:03 UTC, Nicolai Hähnle
Details | Splinter Review

Description Philip Rebohle 2018-03-12 19:14:40 UTC
Created attachment 138037 [details] [review]
Patch for tessellation demo to reproduce the issue

As mentioned in the title, reading a per-patch output variable that was written by a different invocation produces undefined results even after a barrier() call. 

The attached patch changes the tessellation control shader of the 'tessellation' demo from Sascha Willems' Vulkan samples in that it reads the tessellation levels from a per-patch output array. Note that the shader needs to be recompiled manually in order to reproduce the issue.
Comment 1 Philip Rebohle 2018-03-12 19:38:09 UTC
Created attachment 138038 [details]
Witcher 3 hull shader which may suffer from the same issue

FWIW, the tessellation demo works correctly with AMDVLK with the patched shader. On RADV, the tessellation levels are seemingly random, and its behaviour changes by just changing the location number.

A similar issue occurs in The Witcher 3 when run with DXVK, where water surfaces have incorrect tessellation factors applied on RADV. It reportedly renders correctly on Nvidia.

In this case however, there is only one single invocation writing per-patch outputs. A workaround that makes this particular shader work correctly on RADV is to write all per-patch outputs to an array with storage class Private first, reading the tessellation factors from that array, and finally copying the contents of the temporary array to the output array, which tells me that reading from the output array again returns incorrect results.
Comment 2 Philip Rebohle 2018-03-12 20:50:44 UTC
Created attachment 138044 [details]
Tessellation demo screenshot
Comment 3 Dave Airlie 2018-03-14 00:26:33 UTC
https://patchwork.freedesktop.org/series/39918/

Should fix this, thanks for the report and reproducer.
Comment 4 Philip Rebohle 2018-03-14 09:58:59 UTC
Created attachment 138092 [details]
Tessellation demo screenshot with LLVM-svn

Can confirm that this fixes the tessellation demo on Polaris with LLVM 5.0.1. There are still issues when building against LLVM-svn (see attached screenshot).

Not sure if this is in any way related, but it looks like the TES is receiving incorrect/random data now, which can be reproduced even with the original TCS (without my patch) *and* without the TCS. This may not be reproducible on Vega.
Comment 5 Philip Rebohle 2018-03-14 09:59:45 UTC
Typo: *and* without the TCS fix.
Comment 6 Clément Guérin 2018-03-15 05:08:26 UTC
I can confirm that the tessellation demo is broken without Philip's patch on mesa 03e37ec6d7 and llvm-svn 327550 on R9 Fury.
Comment 7 Samuel Pitoiset 2018-03-28 09:12:26 UTC
A possible fix https://reviews.llvm.org/D44974
Comment 8 Samuel Pitoiset 2018-04-09 17:12:51 UTC
Should be fixed with SVN r329591.
Comment 9 Samuel Pitoiset 2018-04-13 07:29:19 UTC
Note that r329591 has been reverted because it broke two AMDGPU tests. Fixed version has been pushed, r329764 fixes the issue.
Comment 10 FarhanaAleen 2018-04-18 23:15:47 UTC
Hi Samuel,

We are observing performance regression on SGEMM on fiji with https://reviews.llvm.org/D44974. Shortening the vector length from 128 to 64 is causing the regression.

I wanted to analyze the stability regression that you observed with 128bit vector length. Is there any reproducer? Or can you please tell me where I can get tessellation shader and the steps to run it?

Thanks-
Farhana
Comment 11 Samuel Pitoiset 2018-04-19 07:22:27 UTC
Hi,

Well, you can still re-enable the previous behaviour with "enable-ds128" and I expect performance to be similar.

Otherwise, if you want to reproduce the problem you have to clone https://github.com/SaschaWillems/Vulkan and to run the tessellationshader demo using RADV.
Comment 12 Nicolai Hähnle 2018-05-24 12:03:30 UTC
Created attachment 139738 [details] [review]
radv testing patch

Hey Samuel,

I re-tested this (on Tonga) with the latest LLVM trunk and the attached patch. Using RADV_DEBUG=nocache,shaders,ds128 shows assembly that uses the 128-bit instructions without rendering artifacts.

Looks like we could try enabling this in LLVM by default again, any objections?
Comment 13 Samuel Pitoiset 2018-05-24 13:56:35 UTC
The tessellation demo looks good on vega here. What changed in LLVM since last time?

If you really want to enable this by default, I would like to run CTS before.
Comment 14 Nicolai Hähnle 2018-05-24 14:18:02 UTC
I don't know. Farhana asked me to test this since I have a Mesa setup.

I'd appreciate if you could give a CTS run a shot.
Comment 15 Samuel Pitoiset 2018-05-24 14:37:50 UTC
Yeah, I will do.
Comment 16 Samuel Pitoiset 2018-05-25 11:20:32 UTC
No CTS regressions, I'm fine with it.
Comment 17 Nicolai Hähnle 2018-05-25 11:28:40 UTC
Great, thanks for testing!


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.