Summary: | glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x | ||
---|---|---|---|
Product: | Mesa | Reporter: | Stephane Chevigny <stephane.chevigny> |
Component: | Drivers/Gallium/radeonsi | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | normal | ||
Priority: | medium | ||
Version: | 17.2 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | Difference between the result of gl_GlobalInvocationID and gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x; |
Description
Stephane Chevigny
2017-10-21 17:32:14 UTC
I wanted to test this out on nouveau -- I ran the program, and it created output. What's the expected output? The expected output: value[i].x (= gl_GlobalInvocationID.x ;) should be equal to value[i].y and value[i].w (= gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x;) for all indices i according to spec. With the parameter I gave to glDispatchComputeGroupSizeARB, the divergence of the result happen after the indice i = 63 on my computer. So value[i].x = value[i].z (=gl_LocalInvocationID.x;) instead. So on my computer, gl_GlobalInvocationID.x appear to be only equal to gl_LocalInvocationID.x notgl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x or I misread the spec. OK, I see. The relevant shader is fill_cs_450.glsl. It puts stuff into the array with values (gl_GlobalInvocationID.x, t, gl_LocalInvocationID.x, t) where t = gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x Ideally the values printed should be (x, x, l, x) -- i.e. t == gl_GlobalInvocationID.x. With nouveau this outputs a bunch of BS... sometimes gl_GlobalInvocationID.x == gl_LocalInvocationID.x, other times gl_GlobalInvocationID.x == randomish value. I suspect this is a combination of a nouveau bug and a core bug [the latter of which is affecting the radeonsi output]. Going to try to take a look. OK yeah, this is totally broken... for a simplified shader (which I will submit to piglit), this is generating code like COMP PROPERTY CS_FIXED_BLOCK_WIDTH 0 PROPERTY CS_FIXED_BLOCK_HEIGHT 0 PROPERTY CS_FIXED_BLOCK_DEPTH 0 DCL SV[0], THREAD_ID DCL SV[1], BLOCK_ID DCL SV[2], BLOCK_SIZE DCL BUFFER[16] DCL TEMP[0..3], LOCAL IMM[0] UINT32 {8, 0, 0, 0} 0: UMAD TEMP[0].x, SV[1].xyzz, TEMP[1].xyzz, SV[0].xyzz 1: MOV TEMP[1].x, TEMP[0].xxxx 2: MOV TEMP[1].y, SV[0].xxxx 3: UMAD TEMP[2].x, SV[1].xxxx, SV[2].xxxx, SV[0].xxxx 4: UMUL TEMP[3].x, TEMP[2].xxxx, IMM[0].xxxx 5: STORE BUFFER[16].xy, TEMP[3].xxxx, TEMP[1].xyyy 6: END Note the first op which tries to read the uninitialized TEMP[1].xyzz. Oops. Pretty sure this used to work, so probably got broken at some point with the various refactors. Thank you very much for checking that. S.C. https://patchwork.freedesktop.org/patch/184119/ This fixes both your application as well as a small piglit test I wrote on nouveau. The issue was in the core logic. If radeonsi still doesn't work with this, then it has additional issues. An updated patch fixes the issue in a more complete way (with multiple shaders being linked together, even non-variable local sizes were broken). This is now upstream in commit 4d24a7cb97641cacecd371d1968f6964785822e4 Author: Ilia Mirkin <imirkin@alum.mit.edu> Date: Sat Oct 21 15:15:41 2017 -0400 glsl: fix derived cs variables Thanks for reporting and providing a repro! As the original issue was filed against radeonsi, I'm going to leave this open until someone can confirm that this does indeed fix the problem there. values[i].x == values[i].y == values[i].w for all indices i with radeonsi. The issue appears to be fixed, thanks guys! Closing. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.