Bug 106902 - Certain uses of atomic operations causes infinite loop due to helper invocations
Summary: Certain uses of atomic operations causes infinite loop due to helper invocations
Status: RESOLVED INVALID
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Reported: 2018-06-12 18:57 UTC by Caio Marcelo de Oliveira Filho
Modified: 2018-06-13 00:00 UTC

Description Caio Marcelo de Oliveira Filho 2018-06-12 18:57:18 UTC
Certain uses of atomic operations, e.g. open coding atomicAdd by using a loop with loads and atomicCompSwaps, will not work when executed in helper invocations, leading to infinite loops.

    layout(binding = 0) buffer bufblock {
        int value;
    };

...

        int f;

        /* This is an open-coded atomicAdd. */
        do {
            f = value;
        } while (f != atomicCompSwap(value, f, f + 4));

This code can lead to hangs, since a helper invocation will loop indefinitely: atomicCompSwap doesn't write anything and might return junk, while the "f = value" load reads a real value.

"Atomic operations to image, buffer, or atomic counter variables performed by helper invocations have no effect on the underlying image or buffer memory. The values returned by such atomic operations are undefined."

https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/gl_HelperInvocation.xhtml

Full shader code is available in an attachment to bug https://bugs.freedesktop.org/show_bug.cgi?id=106774.
Comment 1 Ian Romanick 2018-06-13 00:00:43 UTC
Thanks for submitting the bug (at my request).  I've done some more research on this topic, and I grudgingly agree with Jason that the driver is fine, and the test case is broken.

Issue #22 of the GL_ARB_shader_image_load_store extension discusses this very issue, and it suggests a workaround.  The only problem is that the workaround requires either GL_ARB_gpu_shader5 or GLSL 4.00 because it needs gl_SampleMaskIn.  This doesn't help vertex shaders or other non-fragment shaders.  GLSL 4.50 adds gl_HelperInvocation, but that is still limited to the fragment shader stage.

    (22) If implementations run fragment shaders for fragments that aren't
         covered by the primitive or fail early depth tests (e.g., "helper
         pixels"), how does that interact with stores and atomics?

      RESOLVED:  The current OpenGL specification has no formal notion of
      "helper" pixels.  In practice, implementations may run fragment shaders
      for pixels near the boundaries of rasterized primitives to allow
      derivatives to be approximated by differencing.  Typically, these shader
      invocations have no effect.  While they may produce outputs, the outputs
      for these pixels will be discarded without affecting the framebuffer.
      The spec basically treats these shader invocations as though they don't
      exist.

      If such a shader invocation performs store or atomic operations, we need
      to define what happens.  In our definition, stores will have no effect,
      atomics will not update memory, and the values returned by atomics will
      be undefined.  The fact that these invocations don't affect memory is
      consistent with the notion of helper pixel shader invocations not
      existing.

      However, it is possible to write a fragment shader where flow control
      depends on the (undefined) values returned by the atomic.  In this case,
      the undefined values returned for helper pixels could result in very
      long execution time (appearing to be hang) or an infinite loop.  To
      avoid hangs in such cases, it is possible to use the fragment shader
      input sample mask to identify helper pixels:

        // If the input sample mask is non-zero, at least one sample is
        // covered and the invocation should be treated as a real invocation.
        // If the sample mask is zero, nothing is covered and this should be
        // treated as a helper pixel.  If more than 32 samples are supported,
        // additional words of gl_SampleMaskIn would need to be checked.
        if (gl_SampleMaskIn[0] != 0)  {
          // "real" pixel, perform atomic operations
        } else {
          // "helper" pixel, skip atomics
        }

      It may be desirable to formalize the notion of helper pixels in a future
      addition to the shading language.
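
For illustration, the guard described above could be applied to the open-coded atomicAdd from the description roughly as follows. This is only a sketch, not the change made to the test case in bug 106774: it assumes a GLSL 4.50 fragment shader so that gl_HelperInvocation is available, and the output variable "color" and the way f is used afterwards are hypothetical.

```glsl
#version 450

layout(binding = 0) buffer bufblock {
    int value;
};

/* Hypothetical output, present only so the shader is complete. */
layout(location = 0) out vec4 color;

void main()
{
    int f = 0;

    /* Helper invocations skip the loop entirely.  Their atomics never
     * update memory and return undefined values, so the exit condition
     * below might never become true for them.
     */
    if (!gl_HelperInvocation) {
        /* This is an open-coded atomicAdd, as in the description. */
        do {
            f = value;
        } while (f != atomicCompSwap(value, f, f + 4));
    }

    /* Hypothetical use of the result. */
    color = vec4(float(f & 0xff) / 255.0);
}
```

On GLSL 4.00 through 4.40, the condition would instead be the "gl_SampleMaskIn[0] != 0" test quoted in the issue text. Neither form is available outside the fragment shader stage.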

