Certain uses of atomic operations, e.g. open-coding atomicAdd as a loop of loads and atomicCompSwap calls, do not work when executed in helper invocations and lead to infinite loops.

    layout(binding = 0) buffer bufblock {
        int value;
    };
    ...
    int f;

    /* This is an open-coded atomicAdd. */
    do {
        f = value;
    } while (f != atomicCompSwap(value, f, f + 4));

This code can hang: a helper invocation will loop indefinitely, because atomicCompSwap doesn't write anything and might return junk, while the "f = value" load reads a real value.

"Atomic operations to image, buffer, or atomic counter variables performed by helper invocations have no effect on the underlying image or buffer memory. The values returned by such atomic operations are undefined."
https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/gl_HelperInvocation.xhtml

Full shader code is available in the attachment to bug https://bugs.freedesktop.org/show_bug.cgi?id=106774.
Thanks for submitting the bug (at my request). I've done some more research on this topic, and I grudgingly agree with Jason that the driver is fine and the test case is broken. Issue #22 of the GL_ARB_shader_image_load_store extension discusses this very issue, and it suggests a workaround. The only problem is that the workaround requires either GL_ARB_gpu_shader5 or GLSL 4.00, because it needs gl_SampleMaskIn. This doesn't help vertex shaders or other non-fragment shaders. GLSL 4.50 adds gl_HelperInvocation, but that is still limited to the fragment shader stage.

(22) If implementations run fragment shaders for fragments that aren't covered by the primitive or fail early depth tests (e.g., "helper pixels"), how does that interact with stores and atomics?

RESOLVED: The current OpenGL specification has no formal notion of "helper" pixels. In practice, implementations may run fragment shaders for pixels near the boundaries of rasterized primitives to allow derivatives to be approximated by differencing. Typically, these shader invocations have no effect. While they may produce outputs, the outputs for these pixels will be discarded without affecting the framebuffer. The spec basically treats these shader invocations as though they don't exist.

If such a shader invocation performs store or atomic operations, we need to define what happens. In our definition, stores will have no effect, atomics will not update memory, and the values returned by atomics will be undefined. The fact that these invocations don't affect memory is consistent with the notion of helper pixel shader invocations not existing.

However, it is possible to write a fragment shader where flow control depends on the (undefined) values returned by the atomic. In this case, the undefined values returned for helper pixels could result in very long execution time (appearing to be hang) or an infinite loop.
To avoid hangs in such cases, it is possible to use the fragment shader input sample mask to identify helper pixels:

    // If the input sample mask is non-zero, at least one sample is
    // covered and the invocation should be treated as a real invocation.
    // If the sample mask is zero, nothing is covered and this should be
    // treated as a helper pixel. If more than 32 samples are supported,
    // additional words of gl_SampleMaskIn would need to be checked.
    if (gl_SampleMaskIn[0] != 0) {
        // "real" pixel, perform atomic operations
    } else {
        // "helper" pixel, skip atomics
    }

It may be desirable to formalize the notion of helper pixels in a future addition to the shading language.
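
For fragment shaders that can target GLSL 4.50, the same guard can be written with gl_HelperInvocation instead of the sample mask. The following is only a minimal sketch and not the shader attached to the bug: the #version line and the output variable "color" are assumptions added to make it a complete fragment shader, while the buffer block and the loop are taken from the report above.

```glsl
#version 450

layout(binding = 0) buffer bufblock {
    int value;
};

/* Hypothetical output, only here so the shader is complete. */
layout(location = 0) out vec4 color;

void main()
{
    /* Helper invocations skip the loop entirely. Their atomics never
     * update memory and return undefined values, so the loop condition
     * cannot be relied on to terminate for them. */
    if (!gl_HelperInvocation) {
        int f;

        /* Open-coded atomicAdd, as in the original report. */
        do {
            f = value;
        } while (f != atomicCompSwap(value, f, f + 4));
    }

    color = vec4(1.0);
}
```

As noted above, this only helps in the fragment stage. On GLSL 4.00 or with GL_ARB_gpu_shader5, the condition would be the gl_SampleMaskIn[0] != 0 test from the extension issue instead.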