Created attachment 65066
program to reproduce the bug
Detailed bug description and steps to reproduce:
When an MSAA framebuffer is used for rendering, the output is incorrect if the alpha values of both the framebuffer and the source color are not 1.
I have attached a program that easily reproduces the bug. You just need to compile and run it on Linux. It renders a green rectangle on a black background, but rendering errors appear along the diagonal.
Thanks for the test case. I can reproduce this bug on my Ivy Bridge machine. I'll start investigating.
Ok, here's what I've figured out so far. Mostly these are notes to myself, since I may have to put this bug on the back burner for a few days while I work on other tasks.
1. The bug only occurs on Ivy Bridge when compressed multisampling format (CMS) is used. Disabling CMS works around the problem. Sandy Bridge is unaffected.
2. Both 4x and 8x oversampling are affected.
3. The attached program draws a square using GL_POLYGON mode. The hardware rasterizes this as two triangles. If I alter the test to draw the two triangles separately, the bug still occurs (even if I force the batch buffer to be flushed in between). This suggests that we're not looking at any sort of caching problem or race condition.
4. The incorrect rendering occurs while drawing the first triangle. Pixels that are completely covered by the triangle are always rendered correctly. Pixels that are incompletely covered are rendered incorrectly about 50% of the time. The incorrectly rendered pixels are too bright: instead of alpha blending between the old pixel color (0, 0, 0) and the new pixel color (0, 1, 0) to produce an intermediate color (0, 0.7, 0), the hardware seems to be blending between the intermediate color and the new color, producing (0, 0.91, 0). A possible explanation of this behaviour is that the CC unit isn't correctly reading source color values from CMS layer 1 (but CMS layer 0 works ok).
5. Pixels whose Y coordinate (mod 4) is 0 or 1 always seem to be rendered correctly. Pixels whose Y coordinate (mod 4) is 2 or 3 always seem to be rendered incorrectly, with the exception of the first row of the triangle, which has a Y coordinate (mod 4) of 3, yet is rendered correctly. A possible explanation of this behaviour is that when reading source color values from CMS layer 1, bit 1 of the Y coordinate is dropped, so the
6. Disabling SIMD16 rendering doesn't change the output at all.
7. Swapping the order in which the triangles are drawn has a dramatic effect. If the test draws the lower right triangle first, many pixels are wrong. If the test draws the upper left triangle first, only a few pixels are wrong.
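The color values in note 4 can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the test uses a conventional "source over" blend (source alpha times source plus one minus source alpha times destination) with a source alpha of 0.7, which is inferred from the observed colors rather than taken from the attached program. The `blend` helper is hypothetical.

```python
def blend(src_rgb, dst_rgb, src_alpha):
    """Per-channel 'source over' blend: src * a + dst * (1 - a).

    Assumption: this is the blend equation the test case effectively uses.
    """
    return tuple(round(s * src_alpha + d * (1.0 - src_alpha), 4)
                 for s, d in zip(src_rgb, dst_rgb))

ALPHA = 0.7                 # assumed, inferred from the observed (0, 0.7, 0)
BLACK = (0.0, 0.0, 0.0)     # old pixel color
GREEN = (0.0, 1.0, 0.0)     # new pixel color

# Correct result: blend the new color once against the black background.
expected = blend(GREEN, BLACK, ALPHA)

# Faulty result: blend the new color against the already-blended value,
# as if the hardware had read the wrong value back as the old pixel color.
observed = blend(GREEN, expected, ALPHA)

print(expected)  # (0.0, 0.7, 0.0)
print(observed)  # (0.0, 0.91, 0.0)
```

Under these assumptions, blending a second time against the intermediate color gives exactly the too-bright value seen in the bad pixels, which is consistent with the blend reading the wrong value for some samples.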
In the worst case, we can always work around the problem by disabling compressed multisampling, but since that has performance implications I'd rather save that for a last resort. I'm hoping that we're either programming a bit wrong somewhere (leading to nondeterministic behaviour), or there's a minor hardware bug with an easy workaround.
I will continue investigating over the next several days. The next thing I plan to do is to add this test case to Piglit, and try it out under the simulator. That should give some clues as to whether we're looking at a hardware or a software bug.
- I've sent a patch to the Piglit mailing list that replicates this bug (msaa: Verify proper operation of alpha blending). The patch is based on the original bug report, but tweaked a little to use fewer deprecated GL features. It reliably reproduces the failure on Ivy Bridge systems.
- I've verified that the problem does *not* occur in the simulator.
- I'm in the process of getting in touch with some other folks at Intel to try to track down exactly what is going wrong.
Thanks Paul for your update.
Yes, I have checked that disabling CMS works around the problem.
But as you said, disabling CMS has performance implications, so we
are looking forward to your final fix.
*** Bug 54529 has been marked as a duplicate of this bug. ***
*** Bug 71195 has been marked as a duplicate of this bug. ***
Patch sent to mesa-dev for review: http://lists.freedesktop.org/archives/mesa-dev/2013-November/047685.html
Great to see a patch so soon! Is the buffer read/write issue a hardware fault? Will the patch result in a performance hit for AA applications?
(In reply to comment #8)
> Great to see a patch so soon! Is the buffer read/write issue a hardware
> fault? Will the patch result in a performance hit for AA applications?
I believe this is a hardware fault. Fortunately I have better contacts in the hardware architecture group than I had when this bug was first discovered last year, so I've pinged them to see if I can get confirmation of that.
Yes, the patch will result in a performance hit for MSAA applications, but only for the MSAA buffers in which they perform alpha blending.
I see from the mailing list that a new patch has been developed, relating to an incorrect parameter having been set?
Fixed in commit b4c3b833ec8ec6787658ea90365ff565cd8846c7
Author: Paul Berry <firstname.lastname@example.org>
Date: Tue Nov 12 10:55:18 2013 -0800
i965: Fix vertical alignment for multisampled buffers.
From the Sandy Bridge PRM, Vol 1 Part 1 126.96.36.199 (Alignment Unit
j [vertical alignment] = 4 for any render target surface is
From the Ivy Bridge PRM, Vol 4 Part 1 188.8.131.52 (SURFACE_STATE for most
messages), under the "Surface Vertical Alignment" heading:
This field is intended to be set to VALIGN_4 if the surface was
rendered as a depth buffer, for a multisampled (4x) render target,
or for a multisampled (8x) render target, since these surfaces
support only alignment of 4.
Back in 2012 when we added multisampling support to the i965 driver,
we forgot to update the logic for computing the vertical alignment, so
we were often using a vertical alignment of 2 for multisampled
buffers, leading to subtle rendering errors.
Note that the specs also require a vertical alignment of 4 for all
Y-tiled render target surfaces; I plan to address that in a separate patch.
Reviewed-by: Kenneth Graunke <email@example.com>
Reviewed-by: Eric Anholt <firstname.lastname@example.org>
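The rule described in the commit message can be sketched roughly as follows. This is a simplified illustration and not the actual i965 code: the function name and parameters are hypothetical, and it ignores the other cases the driver has to handle (for example the Y-tiled render target requirement mentioned above).

```python
VALIGN_2 = 2
VALIGN_4 = 4

def vertical_alignment(num_samples, is_depth=False):
    """Pick the vertical alignment j (in rows) for a surface.

    Hypothetical, simplified model of the PRM text quoted above:
    depth buffers and multisampled (4x or 8x) render targets only
    support an alignment of 4.
    """
    if is_depth or num_samples > 1:
        return VALIGN_4
    # Assumption: everything else falls back to 2 in this sketch.
    return VALIGN_2

for samples in (1, 4, 8):
    print(samples, vertical_alignment(samples))
```

Before the fix, the driver's logic was effectively missing the multisample check, so multisampled buffers could end up with an alignment of 2.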
According to the spec, the alignment should be 4 for both Ivy Bridge and Sandy Bridge. Why, then, does the problem show up only on Ivy Bridge hardware? Did the code set the parameter incorrectly only for Ivy Bridge?
(In reply to comment #12)
> According to the spec, the alignment should be 4 for both Ivy Bridge and
> Sandy Bridge. Why, then, does the problem show up only on Ivy Bridge
> hardware? Did the code set the parameter incorrectly only for Ivy Bridge?
The code was setting the alignment incorrectly for both Ivy Bridge and Sandy Bridge, and my fix causes it to set it correctly for both Ivy Bridge and Sandy Bridge.
I suspect that the reason we weren't seeing problems on Sandy Bridge is that the only effect of the vertical alignment on MSAA buffers is in the determination of where buffer layers other than layer 0 are located. On Sandy Bridge, the only multisampled buffers that have layers are those created using TexImage3DMultisample. On Ivy Bridge, all multisampled buffers are layered, since there is one layer per sample.
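To illustrate why only layers other than layer 0 would be affected, here is a deliberately simplified model. It assumes layers are stacked vertically and each one is padded to a multiple of the vertical alignment. The real layout rules include additional padding, so the helper names and numbers below are hypothetical and only show the shape of the problem.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def layer_start_row(layer, height, valign):
    """Row at which a layer begins in this simplified model.

    Assumption: each layer occupies align(height, valign) rows.
    """
    return layer * align(height, valign)

height = 6  # arbitrary example height that is not a multiple of 4
for layer in range(4):
    wrong = layer_start_row(layer, height, 2)   # alignment the driver used
    right = layer_start_row(layer, height, 4)   # alignment the spec requires
    print(layer, wrong, right)
```

In this model layer 0 starts at row 0 either way, while every later layer starts at a different row depending on the alignment, so software and hardware that disagree about the alignment would disagree about where those layers live.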
Also, on Ivy Bridge, the problem was only showing up with compressed multisample buffers. It's possible that there were also problems with uncompressed multisample buffers, but they were more subtle. In any case, compressed multisample buffers aren't supported on Sandy Bridge, so this may explain why we didn't see a problem on Sandy Bridge.