Bug 41787 - [llvmpipe] stencil broken
Summary: [llvmpipe] stencil broken
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium minor
Assignee: mesa-dev
QA Contact:
Depends on:
Reported: 2011-10-14 04:27 UTC by Lauri Kasanen
Modified: 2013-05-22 20:59 UTC
CC List: 1 user

See Also:
i915 platform:
i915 features:

git llvmpipe (407.28 KB, image/png)
2011-10-14 04:27 UTC, Lauri Kasanen
git softpipe (286.74 KB, image/png)
2011-10-14 04:27 UTC, Lauri Kasanen
glut test program (6.47 KB, text/plain)
2011-10-17 16:56 UTC, Brian Paul
0000000837.fglrx.png (367.72 KB, image/png)
2011-10-20 06:32 UTC, Jose Fonseca
0000000837.llvmpipe.png (367.72 KB, image/png)
2011-10-20 06:32 UTC, Jose Fonseca
0000000837.diff.png (397.33 KB, image/png)
2011-10-20 06:33 UTC, Jose Fonseca
stencil-802-llvmpipe (14.75 KB, image/png)
2012-10-30 19:54 UTC, Jose Fonseca
stencil-802-intel (37.11 KB, image/png)
2012-10-30 19:55 UTC, Jose Fonseca

Description Lauri Kasanen 2011-10-14 04:27:25 UTC
Created attachment 52330
git llvmpipe

In current git (5dddeb7776c), stencil appears to be somewhat broken in llvmpipe.

This can be seen in the stencil shadows in Irrlicht, such as example 08 (screenshot included). Softpipe has correct rendering.

It's a regression from 7.11. Initial bisect showed 7d39ff44a2256a08fac725ae0ee8a4475fbf9de as the culprit, but I'm not sure that's the correct one.
Comment 1 Lauri Kasanen 2011-10-14 04:27:52 UTC
Created attachment 52331
git softpipe
Comment 2 Lauri Kasanen 2011-10-14 05:09:55 UTC
Apologies, this is not a regression from 7.11 after all: llvmpipe is broken on 7.11 too.
Comment 3 Jose Fonseca 2011-10-14 06:57:49 UTC

Could you please try to obtain a trace of this issue with apitrace (https://github.com/apitrace/apitrace), following the instructions in its INSTALL and README files?
Comment 4 Brian Paul 2011-10-14 09:01:36 UTC
Just FYI, I tried some conformance and piglit stencil tests and they all passed with llvmpipe.  If it's a stencil bug, it's probably something slightly obscure.
Comment 5 Lauri Kasanen 2011-10-14 10:47:56 UTC
Attaching the apitrace output. I replayed it with glretrace on both softpipe and llvmpipe; the results are the same as with the actual app.
Comment 6 Lauri Kasanen 2011-10-14 10:50:40 UTC
hm, hit the size limit. Link:

Comment 7 Jose Fonseca 2011-10-14 11:16:26 UTC
It looks like some sort of depth fighting.

It might be related to fbo-depth-sample-compare.

I know that llvmpipe, at least, should snap the interpolated fragment Z to the depth buffer precision (i.e., 24 bits) so that the Z values from the interpolator and the depth buffer are consistent. I had a patch that did that, but it didn't fully fix fbo-depth-sample-compare, so I never completed it.

I'll search my git stash to see whether I can find it and whether it fixes this issue.
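
To make the Z snapping mentioned above concrete, here is a minimal sketch in Python. It is not llvmpipe code: the helper names and the sample depth 0.3 are assumptions chosen for illustration. It only shows why a float32 interpolated Z and the value read back from a 24-bit unorm depth buffer can disagree for the same surface, and how snapping both sides to buffer precision removes that disagreement.

```python
import struct

Z24_MAX = (1 << 24) - 1  # largest integer a 24-bit unorm depth buffer can store


def to_float32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_unorm24(z):
    """Quantize a depth in [0, 1] to the integer stored in a 24-bit buffer."""
    return int(round(z * Z24_MAX))


def snap_z24(z):
    """Snap z to the nearest depth representable in a 24-bit buffer."""
    return to_unorm24(z) / Z24_MAX


# Hypothetical fragment depth, as a float32 interpolator might produce it.
z = to_float32(0.3)

# What the depth buffer holds after the first pass wrote this fragment.
stored = snap_z24(z)

# Second pass over the same surface: compare against the stored depth.
unsnapped_equal = (z == stored)          # raw interpolated Z vs. buffer value
snapped_equal = (snap_z24(z) == stored)  # both sides at buffer precision

print("unsnapped equal:", unsnapped_equal)
print("snapped equal:  ", snapped_equal)
```

With the unsnapped comparison, a second pass over identical geometry can pass or fail the depth test depending on which way the quantization rounded, which is the kind of inconsistency the snapping is meant to avoid.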
Comment 8 Jose Fonseca 2011-10-16 07:13:18 UTC
I did find my old patch, but it fixed nothing. I'd really like to fix this, but there seems to be something very broken with depth in llvmpipe that goes beyond the precision lost in unorm24 <-> float32 conversions.

I've also looked at the trace more closely:

One odd thing is that it starts using the stencil buffer without clearing it first (initially it clears only the color and depth buffers). I'm not sure that's the problem, though.

Also, the issue can be seen as early as draw call 837.

Ironically, the first stencil clear is at call 838.
Comment 9 Jose Fonseca 2011-10-16 07:14:23 UTC
Oh, and I'm not sure whether this is depth fighting or culling, because the edges of the shadows are visible and apparently correct.
Comment 10 Brian Paul 2011-10-17 16:56:13 UTC
Created attachment 52451
glut test program

Jose, I extracted some of the commands from the trace into a glut test harness and I think I've reproduced the bug.

A yellow floor quad is drawn first. Then a sphere (a stand-in for a shadow volume) is drawn into the stencil buffer. Finally, the shadow itself is drawn as a window-sized quad with blending and the stencil test enabled.

If you look carefully, there are stray red pixels just below the round shadow. If you rotate the scene (cursor keys, z, Z), you'll see the stray pixels move around randomly.

If you decrease the sphere tessellation, the number of artifacts decreases. In the previously attached trace, the shadow volume appears to be very complicated (23424 vertices). Making the window smaller also reduces the number of artifacts.

It seems as if we're writing stray pixels in the stencil code when drawing very small triangles, but only when polygon culling is enabled.

Adding an initial glClear(GL_STENCIL_BUFFER_BIT) doesn't make any difference.

If I have time this week I'll try to debug further.
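
For readers unfamiliar with the technique, the three passes described above can be modeled per pixel as below. This is a sketch under assumptions, not the test program's code: it assumes the common depth-pass scheme (front faces increment, back faces decrement), uses a signed integer instead of a wrapping 8-bit stencil value, and the depths are made up. The last case is a hypothetical showing how one missing fragment from a tiny triangle would leave a stray shadowed pixel; the thread has not established that this is the actual cause.

```python
def stencil_after_volume(scene_z, faces):
    """Return one pixel's stencil count after the shadow volume is drawn.

    scene_z: depth already in the depth buffer (the floor quad).
    faces:   (z, is_front_facing) pairs for the volume fragments that
             cover this pixel.
    """
    stencil = 0
    for z, is_front in faces:
        if z < scene_z:  # depth test passes (GL_LESS-style comparison)
            stencil += 1 if is_front else -1
    return stencil


def in_shadow(stencil):
    """The final window-sized quad only touches pixels with non-zero stencil."""
    return stencil != 0


FLOOR_Z = 0.8  # hypothetical floor depth at this pixel

# Volume entirely in front of the floor: +1 and -1 cancel, pixel stays lit.
lit = stencil_after_volume(FLOOR_Z, [(0.4, True), (0.6, False)])

# Floor lies inside the volume: only the front face passes, pixel is shadowed.
shadowed = stencil_after_volume(FLOOR_Z, [(0.4, True), (0.9, False)])

# Hypothetical failure: the back-face fragment is never rasterized,
# so a pixel that should be lit ends up with a non-zero count.
stray = stencil_after_volume(FLOOR_Z, [(0.4, True)])

print(in_shadow(lit), in_shadow(shadowed), in_shadow(stray))
```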
Comment 11 Jose Fonseca 2011-10-20 06:32:09 UTC
Created attachment 52585
Comment 12 Jose Fonseca 2011-10-20 06:32:47 UTC
Created attachment 52586
Comment 13 Jose Fonseca 2011-10-20 06:33:13 UTC
Created attachment 52587
Comment 14 Jose Fonseca 2011-10-20 06:40:29 UTC

I see the stray pixels with your example here, but I'm not sure if it is the same issue.

See the attached screenshots of the trace at call 837, with llvmpipe and fglrx, and a difference map between them.

It looks like only the edges of the shadows were drawn.
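
As an illustration of what a difference map is, the sketch below computes a per-pixel absolute difference between two small grayscale grids. How the attached diff image was actually produced is not stated in this thread, so the function names and pixel values here are assumptions.

```python
def difference_map(a, b):
    """Per-pixel absolute difference of two equally sized grayscale images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def count_differing(diff):
    """Number of pixels where the two images disagree."""
    return sum(1 for row in diff for value in row if value != 0)


# Hypothetical 4x3 images: 255 is lit floor, 64 is shadow.
reference = [
    [255, 255, 255, 255],
    [255, 64, 64, 255],
    [255, 64, 64, 255],
]
candidate = [
    [255, 255, 255, 255],
    [255, 64, 255, 255],  # one shadow pixel is missing here
    [255, 64, 64, 255],
]

diff = difference_map(reference, candidate)
print(count_differing(diff), "pixel(s) differ")
```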
Comment 15 Brian Paul 2011-10-20 07:14:10 UTC
I'm not 100% sure it's the same issue either, actually.  But note that I didn't try to reproduce the trace's projection parameters or object coordinates.  If I duplicated those params/coords used in the trace, perhaps the stray pixels would more distinctly correspond to the triangle edges like in the trace.  That might point toward a z-fighting or z-precision type of issue.

In any case, I think it's worth digging into the smaller test case first and see what's going on there.
Comment 16 Jose Fonseca 2012-10-30 19:53:07 UTC
I spent some time today looking at this.

I modified glretrace to dump the stencil buffer instead of the color buffer:
$ git diff
diff --git a/retrace/glstate_images.cpp b/retrace/glstate_images.cpp
index e534a65..75664e6 100644
--- a/retrace/glstate_images.cpp
+++ b/retrace/glstate_images.cpp
@@ -739,6 +739,7 @@ getFramebufferAttachmentDesc(Context &context, GLenum target, GLenum attachment,
 image::Image *
 getDrawBufferImage() {
     GLenum format = GL_RGB;
+    format = GL_STENCIL_INDEX;
     GLint channels = _gl_format_channels(format);
     if (channels > 4) {
         return NULL;

Then I ran the retracediff.py script to compare the stencil buffer contents, on every draw call, between llvmpipe and Intel:

$ ./scripts/retracediff.py -r ./glretrace --src-env LD_LIBRARY_PATH=/home/jfonseca/projects/opengl/mesa/build/linux-x86_64-debug/gallium/targets/graw-xlib:/home/jfonseca/projects/opengl/mesa/build/linux-x86_64-debug/gallium/targets/libgl-xlib /home/jfonseca/projects/opengl/traces/stencil.trace
call	precision
387	36.802488
531	36.802488
622	36.802488
708	36.802488
776	36.802488
802	7.947939
  GL_COLOR_ARRAY_POINTER: 59042800 -> 62751856,
    "GL_TRUE" -> "GL_FALSE",
    "GL_TRUE" -> "GL_FALSE",
    "GL_TRUE" -> "GL_FALSE",
    "GL_TRUE" -> "GL_FALSE"
    1 -> 1.961161,
    0 -> 0.07548513,
    0 -> 0.3849002,
    1 -> 1.962614,
    0 -> -0.3849002,
    0 -> -0.3922323,
    0 -> 0.3774257,
    1 -> 1.924501,
    0 -> -17.65045,
    0 -> -12.45505,
    0 -> 92.37605,
  GL_DEBUG_CALLBACK_USER_PARAM: 36090016 -> 39956640,
  GL_NORMAL_ARRAY_POINTER: 61292368 -> 65157424,
  GL_TEXTURE_COORD_ARRAY_POINTER: 61081232 -> 64931200,
    1 -> 1.961161,
    0 -> 0.07548513,
    0 -> 0.3849002,
    1 -> 1.962614,
    0 -> -0.3849002,
    0 -> -0.3922323,
    0 -> 0.3774257,
    1 -> 1.924501,
    0 -> -17.65045,
    0 -> -12.45505,
    0 -> 92.37605,
  GL_VERTEX_ARRAY_POINTER: 61337392 -> 65378608,

806	5.831233
837	5.831233

So the differences start on call 802.

The attached stencil-802-intel.png and stencil-802-llvmpipe.png show the stencil contents on that call.

It's still not clear whether this is truly a depth/stencil issue or simply a depth test (polygon offset) problem.
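
The step of reading the first divergent call off the table above can be sketched as follows. The (call, precision) pairs are copied from the output; the threshold of 12.0 and the function name are assumptions for illustration, and treating a lower precision value as a worse match is inferred from the drop at call 802 rather than from the script itself.

```python
# (call number, precision) pairs copied from the retracediff.py output above.
RESULTS = [
    (387, 36.802488),
    (531, 36.802488),
    (622, 36.802488),
    (708, 36.802488),
    (776, 36.802488),
    (802, 7.947939),
    (806, 5.831233),
    (837, 5.831233),
]


def first_divergence(results, min_precision):
    """Return the first call whose precision falls below min_precision.

    Returns None when every call meets the threshold.
    """
    for call, precision in results:
        if precision < min_precision:
            return call
    return None


# 12.0 is a hypothetical cut-off, not a value taken from this thread.
print(first_divergence(RESULTS, 12.0))
```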
Comment 17 Jose Fonseca 2012-10-30 19:54:46 UTC
Created attachment 69327
Comment 18 Jose Fonseca 2012-10-30 19:55:21 UTC
Created attachment 69328
Comment 19 Roland Scheidegger 2013-05-22 20:59:35 UTC
Fixed by 82d7733b52e7c124a268c68395de140641b50c05. Replay still looks a bit odd to me but seems to be the same as with softpipe now, so if it's still wrong it must be another bug.
