In my tests I have found this to have a low fail between 1% and 2%. If the test is executed in a loop, then chances of it failing can be significantly increased if we scroll the terminal fast for periods of 10+ seconds.
The problem seems somehow related to gl_InstanceID since every time I got this to fail, the failure happens in subtests that use gl_InstanceID in the shaders (with and without actual instancing draw calls being used).
Interestingly, replacing gl_InstanceID with 0 in the shader code seems to make the problem go away entirely, which is quite surprising seeing that this test has otherwise a very high pass rate and does actual instanced draw calls that woudl rely on gl_InstanceID taking appropriate values.
FWIW, we have tested these in 4 different machines (SKL and KBL), and found that on two of these which had 4.9 kernels we were unable to reproduce the problem, where the other two where the problem existed the kernel was 4.10.
I have confirmed that when this test fails it does because we read completely bogus values of gl_instanceID in the shader.
(In reply to Iago Toral from comment #0)
> Interestingly, replacing gl_InstanceID with 0 in the shader code seems to
> make the problem go away entirely, which is quite surprising seeing that
> this test has otherwise a very high pass rate and does actual instanced draw
> calls that woudl rely on gl_InstanceID taking appropriate values.
I have tracked this part where we can make the test pass consistently when we replace gl_InstanceID by 0 down to a bug in CTS that was only validating the first instance worth of results (See gerrit CL#2037 for a patch).
Another thing I noticed is that if I modify the vertex shader in these tests to also use gl_VertexID besides gl_InstanceID, for example replacing gl_InstanceID in gl4cVertexAttrib64bitTest.cpp:1826 with this expression:
(gl_InstanceID + gl_VertexID * 2 - gl_VertexID - gl_VertexID)
so the end result is the same but it makes the compiler read gl_VertexID, then I can no longer reproduce the problem, even if I do the scrolling thing I mention in the first comment which most definitely makes the issue reproducible much more easily.
Based on this, I thought there may be some piece of state that is only set when we resad gl_VertexID that could be affecting the test but setting prog_data->uses_vertexid when we set prog_data->uses_instanceid doesn't seem to fix the problem and I did not identify any other place in the code where reading gl_VertexID affects state setup...
It seems that programming the hardware to include a dummy SGVS vertex element fixes the problem.
In theory, this is not required since gen8 because 3DSTATE_VF_SGVS allows us to store VertexID and InstanceID in an element beyond the last vertex element programmed, but just doing this seems to get rid of the spurious fails completely, at least on my KBL laptop.
I'll do some more testing and send an RFC patch for review if I don't find any issues. It could be that this is just papering over something else though.
Author: Iago Toral Quiroga <email@example.com>
Date: Thu Jan 4 03:55:13 2018 +0100
i965: allocate a SGVS element when VertexID or InstanceID are read
Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SVGS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakyness goes
away as soon as we start allocating the SVGS element.
- Do this for gen8+, not just gen9+, and pull the boolean
outside the #if block (Jason)
Fixes flaky test:
Reviewed-by: Jason Ekstrand <firstname.lastname@example.org>