Created attachment 121653
INTEL_DEBUG=cs shared atomic add

All 48 dEQP tests matching '*shared_var*atomic*' currently fail (with kernel 4.4.1, mesa git + Ken's compute state fixing series). This is how I run it:

MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/home/ilia/install/lib ./deqp-gles31 --deqp-visibility=hidden --deqp-case='*shared_var*atomic*'

This accounts for almost all of the shared_var failures. There are also these 2, which are probably unrelated, but figured I'd mention just in case:

Test case 'dEQP-GLES31.functional.compute.basic.shared_var_multiple_invocations'..
Compute shader compile time = 0.448000 ms
Link time = 2.081000 ms
Test case duration in microseconds = 4210 us
Fail (Comparison failed for Output.values[1])

Test case 'dEQP-GLES31.functional.compute.basic.shared_var_multiple_groups'..
Compute shader compile time = 0.446000 ms
Link time = 2.528000 ms
Test case duration in microseconds = 4595 us
Fail (Comparison failed for Output.values[0])

Included is the disassembly of one of the atomic fails, in case it's useful.
One additional observation: the (wrong) count of group 0 (after which it stops comparing) is different every time; it tends to cycle between a few different values.

My suspicion is that there's something execmask-related going on. Right now we always use 0xffff as the execmask arg for all the untyped surface reads/writes/atomics, as supplied by fs_builder::sample_mask_reg(), but e.g. the HSW PRM has a very difficult-to-understand explanation of how the exec mask should be computed (page 832, Execution Masks). I wonder if data is being picked up from threads that are logically "off". The shader in question does a 4x4x4 grid of 3x2x1 blocks.
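To make the execmask suspicion concrete, here is a rough Python model (not i965 code, and not the PRM's formula) of which SIMD channels would be live for one workgroup. It assumes invocations are packed into the low channels of each hardware thread and that the dispatch is SIMD16; both are assumptions for illustration only, and the name active_channel_masks is made up.

```python
def active_channel_masks(local_size, simd_width=16):
    """Return one channel-enable mask per hardware thread of a single workgroup.

    Hypothetical model: invocations fill the low channels of each thread in
    order, so only the last thread of a workgroup can be partially enabled.
    """
    x, y, z = local_size
    remaining = x * y * z  # total invocations in the workgroup
    masks = []
    while remaining > 0:
        live = min(remaining, simd_width)
        masks.append((1 << live) - 1)  # low `live` bits set
        remaining -= live
    return masks


# The failing shader uses 3x2x1 blocks: 6 invocations, so under this model
# only the low 6 channels are logically "on", not all 16.
print([hex(m) for m in active_channel_masks((3, 2, 1))])  # ['0x3f']
```

If that model is anywhere near right, a hardcoded 0xffff would cover 10 channels that have no invocation behind them, which is the case being wondered about above.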
Created attachment 121849
simple shader test that exposes the issue

New theory: the shared memory isn't actually per-workgroup (even though it should be). Play around with the attached shader test, varying the local size as well as the grid dimensions. [The result of the counter should be == the product of the grid dimensions.]
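For anyone who wants to reason about this theory without the hardware, here is a small Python sketch of it. It is not the attached shader test; it assumes a test of this shape: each workgroup zeroes a shared counter, every invocation adds 1 to it, and one invocation bumps a global counter only if the shared value equals the local size. The function run_dispatch, the concurrent parameter (how many groups are in flight at once), and the lockstep scheduling are all hypothetical.

```python
from math import prod


def run_dispatch(grid, local_size, per_group_slm, concurrent=4):
    """Return the final global counter for a simulated compute dispatch.

    per_group_slm=True models correct behaviour (each workgroup has its own
    shared-memory slot); False models every in-flight group aliasing slot 0.
    """
    num_groups = prod(grid)
    invocations = prod(local_size)
    counter = 0
    for start in range(0, num_groups, concurrent):
        batch = range(start, min(start + concurrent, num_groups))
        slot = {g: (g if per_group_slm else 0) for g in batch}
        slm = {}
        for g in batch:          # phase 1: each group zeroes its shared var
            slm[slot[g]] = 0
        for g in batch:          # phase 2: every invocation adds 1
            slm[slot[g]] += invocations
        for g in batch:          # phase 3: group passes only if the sum is right
            if slm[slot[g]] == invocations:
                counter += 1
    return counter


grid, local = (4, 4, 4), (3, 2, 1)
print(run_dispatch(grid, local, per_group_slm=True))   # 64
print(run_dispatch(grid, local, per_group_slm=False))  # 0
```

With per-group slots the counter comes out as the product of the grid dimensions (64 here). With aliased slots it only does so when a single group is in flight at a time, which would also fit results that vary with local size and grid dimensions.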
Patch "i965/hsw: Initialize SLM index in state register" sent. https://patchwork.freedesktop.org/patch/74671/
*** Bug 94255 has been marked as a duplicate of this bug. ***
Fixed on master: commit a100a57e30010da49c96f84a661cec9c57f9eebe i965/hsw: Initialize SLM index in state register