The CTS test KHR-GL45.tessellation_shader.tessellation_shader_tessellation.gl_InvocationID_PatchVerticesIn_PrimitiveID fails intermittently, at a rate of about 1.4%.
The test emits a number of tessellation patches of varying sizes, including the maximum of 32 vertices, and does an instanced draw call. There are no vertex buffers bound and all of the vertices have the same position. The tessellation shaders write to 5 integer output varyings, which are captured to a transform feedback buffer. The test then scans the buffer to verify that it has the expected values. When it fails, there are one or two vertices in this buffer that get written with all zeroes instead of the expected values. These failed slots are small and in the middle of the buffer, so it doesn’t seem likely that this could be a buffer configuration issue. I suspect it might be some hardware quirk, but I haven’t been able to find anything relevant in the workarounds in the hardware docs.
The hardware where I can replicate it is a NUC with Intel(R) Iris Pro 6200 (Broadwell GT3e). However, I am unable to replicate it on my laptop, which has Intel(R) HD Graphics 5500 (Broadwell GT2).
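The buffer check described above can be sketched offline, for anyone who dumps the transform feedback buffer to a file. This is only a rough illustration, not the CTS code: it assumes each captured vertex is a record of 5 little-endian 32-bit integers (the count comes from the description, the byte order and the function name find_zeroed_records are assumptions), and it reports the indices of records that are entirely zero.

```python
import struct

# Five integer varyings per captured vertex, as described in the report.
INTS_PER_VERTEX = 5


def find_zeroed_records(buffer_bytes, ints_per_vertex=INTS_PER_VERTEX):
    """Return the indices of records whose integers are all zero.

    Assumes tightly packed little-endian 32-bit signed integers; this
    layout is an assumption, not taken from the CTS source.
    """
    record_size = 4 * ints_per_vertex
    if len(buffer_bytes) % record_size != 0:
        raise ValueError("buffer size is not a whole number of records")
    fmt = "<%di" % ints_per_vertex
    zeroed = []
    for index, record in enumerate(struct.iter_unpack(fmt, buffer_bytes)):
        if not any(record):
            zeroed.append(index)
    return zeroed


if __name__ == "__main__":
    # Made-up values for illustration only; the third record is zeroed.
    good = struct.pack("<5i", 31, 32, 7, 1, 2)
    dump = good + good + bytes(20) + good
    print(find_zeroed_records(dump))
```

Listing the zeroed indices across several failing runs would show whether the bad slots cluster around particular patch sizes or instances.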
Created attachment 135663
Here is a test for Piglit which does something similar to the CTS test and runs it 3000 times. Sadly it still doesn’t seem to fail every time, so the failure rate is even lower than that of the CTS test, but I’m posting it anyway in case the code is a bit easier to follow.
I am still experiencing this failure. Adding the error output:
Invalid gl_InvocationID value (0) was found for result coordinate at index 0 instead of expected value (31).
Fail (Invalid gl_InvocationID value used in TC stage at esextcTessellationShaderTessellation.cpp:1440)
I'm still able to reproduce with current mesa master:
9a21c96126d ("egl/x11: Send invalidate to driver on copy_region path in swap_buffer")
Tested with BDW GT3 in Debian. System specs:
Maybe related to bug 107675?
I've run both tests on my BDW (Intel® HD Graphics 5500) machine on Ubuntu 18.04, with kernel versions 4.18.0 and 5.0.0, Xorg 1.20.1 and the latest mesa master.
I've run each test around 100,000 times (I've been running them for 4 nights, 25,000 times per night) and none of them failed.
Also, I saw that the VK-CTS test passed on the CI: https://mesa-ci.01.org/mesa_master_daily/builds/4946/group/2137555ccb43fc785550447fe4246772.
I think we can close this issue.
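As a rough sanity check on these numbers, the sketch below works out what 100,000 clean runs would imply if the runs were independent (an assumption). It uses the 1.4% rate reported for the Iris Pro 6200; the function names are made up for illustration.

```python
import math


def prob_all_pass(failure_rate, runs):
    """Probability that `runs` independent runs all pass."""
    return math.exp(runs * math.log1p(-failure_rate))


def runs_for_detection(failure_rate, confidence=0.95):
    """Smallest run count expected to show at least one failure
    with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-failure_rate))


def rate_upper_bound(runs, confidence=0.95):
    """Upper bound on the failure rate that is still consistent with
    zero failures in `runs` runs."""
    return 1.0 - (1.0 - confidence) ** (1.0 / runs)


if __name__ == "__main__":
    print(prob_all_pass(0.014, 100000))
    print(runs_for_detection(0.014))
    print(rate_upper_bound(100000))
```

Under that assumption, a 1.4% failure rate would be essentially impossible to miss in 100,000 runs, so the clean result says the HD Graphics 5500 machine does not show the problem at anything like that rate. It says nothing about the Iris Pro 6200, where the failure was originally seen.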
Paul: thanks for the additional testing! The test is disabled in CI on BDW for the i965 driver. You are right that it seems to be passing reliably on the BDW iris driver.
I've submitted an MR to re-enable the test on BDW. We should verify that it is reliable in our environment (running parallel tests on each core).
I just tried to run this test through CI on BDW/i965 and saw it fail.
stdout from test (matches what was reported previously):
Invalid gl_InvocationID value (0) was found for result coordinate at index 5 instead of expected value (31).
Invalid gl_InvocationID value used in TC stage at esextcTessellationShaderTessellation.cpp:1440
As I already said, I have Intel® HD Graphics 5500.
But Neil said that he couldn't replicate the issue on this GPU.
Clayton, could you please clarify which BDW GPU is used on the CI?
I saw it fail intermittently on an Iris Pro Graphics 6200.