Bug 70601 - [SNB Bisected]Piglit spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float fails
[SNB Bisected]Piglit spec/ARB_texture_float/multisample-formats 2 GL_ARB_text...
Status: VERIFIED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
unspecified
All Linux (All)
: high major
Assigned To: Paul Berry
Intel 3D Bugs Mailing List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-10-18 08:19 UTC by lu hua
Modified: 2013-12-02 07:13 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Hack that forces I32F and L32F to be treated as R32F (739 bytes, patch)
2013-11-27 08:04 UTC, Kenneth Graunke
Details | Splinter Review
Test output for I32F -> R32F blit (SNB) - blocky and bad (19.48 KB, image/png)
2013-11-27 08:05 UTC, Kenneth Graunke
Details
Test output with I32F forced to R32F (SNB) - correct (12.65 KB, text/plain)
2013-11-27 08:05 UTC, Kenneth Graunke
Details

Note You need to log in before you can comment on or make changes to this bug.
Description lu hua 2013-10-18 08:19:41 UTC
System Environment:
--------------------------
Platform: Sandybridge
Libdrm:		(master)2.4.47
Mesa:		(master)4ef1c8fb4c0ae8d0fa2a0e4311ef63255d8485f0
Xserver:	(master)xorg-server-1.14.99.2-14-g7cf1b595c8c8f9776a39559d2878cf90af3f2859
Xf86_video_intel:(master)2.99.904-32-gec0866e86d365ae3fd9790b1b263d49fc4981220
Cairo:		(master)6f05ecf488314e4b0c6c6b0110963c449bebe7d7
Libva:		(staging)1a011ce5bb0b80506797a25a988854f3f81ce909
Libva_intel_driver:(staging)1cee858036a87837deddc87586701ed869f96261
Kernel:	(drm-intel-nightly) 164a4cb4c1431a0689f85507868356fae24da638

Bug detailed description:
-----------------------------
It fails on sandybridge with mesa master branch, and works well on 9.2 branch.
spec_ARB_texture_float_multisample-formats_4_GL_ARB_texture_float also fail with same bisect commit.

Bisect shows:72aade48fe7d4c4099ef713887c06b3aaacf1acd is the first bad commit.
commit 72aade48fe7d4c4099ef713887c06b3aaacf1acd
Author:     Kenneth Graunke <kenneth@whitecape.org>
AuthorDate: Mon Oct 7 12:44:01 2013 -0700
Commit:     Kenneth Graunke <kenneth@whitecape.org>
CommitDate: Wed Oct 9 16:36:50 2013 -0700

    i965/blorp: Rework sRGB override behavior.

    The previous code for sRGB overrides assumes that the source and
    destination formats are equal, other than the color space.  This won't
    be feasible when we add support for format conversions.

    Here are a few cases, and how the old code handled them:

    1.  RGB8 -> SRGB8, MSAA     ==>   SRGB8 -> SRGB8
    2.  RGB8 -> SRGB8, single   ==>    RGB8 -> RGB8
    3. SRGB8 ->  RGB8, MSAA     ==>    RGB8 -> RGB8
    4. SRGB8 ->  RGB8, single   ==>   SRGB8 -> SRGB8

    Apparently, preserving the behavior of #1 is important.  When doing a
    multisample to single-sample resolve, blending the samples together in
    an sRGB correct fashion results in a noticably higher quality image.
    It also is necessary to pass Piglit's EXT_framebuffer_multisample
    accuracy color tests.

    Paul, Eric, Anuj, and I talked about this, and aren't sure that it
    matters in the other cases.

    This patch preserves the behavior of #1, but otherwise reverts to
    doing everything in linear space, changing the behavior of case #4.

    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
    Reviewed-by: Eric Anholt <eric@anholt.net>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

output:
Using test set: GL_ARB_texture_float
Testing GL_RGB16F_ARB
Unsupported framebuffer combination
Testing GL_RGBA16F_ARB
Testing GL_ALPHA16F_ARB
Unsupported framebuffer combination
Testing GL_LUMINANCE16F_ARB
Testing GL_LUMINANCE_ALPHA16F_ARB
Unsupported framebuffer combination
Testing GL_INTENSITY16F_ARB
Testing GL_RGB32F_ARB
Unsupported framebuffer combination
Testing GL_RGBA32F_ARB
Unsupported framebuffer combination
Testing GL_ALPHA32F_ARB
Unsupported framebuffer combination
Testing GL_LUMINANCE32F_ARB
Probe at (181,17)
  Expected: 0.015686
  Observed: 0.001963
Testing GL_LUMINANCE_ALPHA32F_ARB
Unsupported framebuffer combination
Testing GL_INTENSITY32F_ARB
Probe at (181,17)
  Expected: 0.015686
  Observed: 0.001963
PIGLIT: {'result': 'fail' }


Reproduce steps:
----------------------------
1. xinit&
2. bin/ext_framebuffer_multisample-formats -auto 2 GL_ARB_texture_float
Comment 1 Kenneth Graunke 2013-11-27 05:39:02 UTC
I can confirm this on Sandybridge.  (It appears to work on Ivybridge, which is strange...)

Apparently, the other thing the deleted code did was override LUMINANCE and INTENSITY floating point formats to RED.

The failing case appears to be a L32_FLOAT -> R32_FLOAT blit.  I don't understand why that should matter...sampling from L32 will return <R, R, R, 1> and writing to a single channel R32 render target ought to ignore the other channels...

I'll keep looking.
Comment 2 Kenneth Graunke 2013-11-27 08:04:09 UTC
Created attachment 89888 [details] [review]
Hack that forces I32F and L32F to be treated as R32F
Comment 3 Kenneth Graunke 2013-11-27 08:05:14 UTC
Created attachment 89889 [details]
Test output for I32F -> R32F blit (SNB) - blocky and bad
Comment 4 Kenneth Graunke 2013-11-27 08:05:55 UTC
Created attachment 89890 [details]
Test output with I32F forced to R32F (SNB) - correct
Comment 5 Kenneth Graunke 2013-11-27 08:13:19 UTC
Hey Paul,

I'm really stumped by this bug.  On Sandybridge, performing a BLORP blit from either I32_FLOAT or L32_FLOAT to R32_FLOAT appears to subtly break - the rendered image looks mostly correct, but it's all blocky, and fails at the edges.  (See the images attached to the bug.)  The 16-bit formats appear to work fine.

Forcing the source and destination buffers to both be R32_FLOAT appears (via the attached patch) makes it work correctly as well.  This makes no sense to me, as sampling from I32_FLOAT should just return <R, R, R, R>, and writing to a R32_FLOAT buffer should ignore all but the R component anyway...

Do you have any ideas?
Comment 6 Paul Berry 2013-11-27 15:50:40 UTC
(In reply to comment #5)
> Hey Paul,
> 
> I'm really stumped by this bug.  On Sandybridge, performing a BLORP blit
> from either I32_FLOAT or L32_FLOAT to R32_FLOAT appears to subtly break -
> the rendered image looks mostly correct, but it's all blocky, and fails at
> the edges.  (See the images attached to the bug.)  The 16-bit formats appear
> to work fine.
> 
> Forcing the source and destination buffers to both be R32_FLOAT appears (via
> the attached patch) makes it work correctly as well.  This makes no sense to
> me, as sampling from I32_FLOAT should just return <R, R, R, R>, and writing
> to a R32_FLOAT buffer should ignore all but the R component anyway...
> 
> Do you have any ideas?

I saw these sorts of blocky artifacts when I was first developing MSAA on Sandy Bridge.  They happen when something in the software (or hardware) implements the IMS multisample layout incorrectly.

IMS layout is organized into 2x2 blocks.  If the texels in a 2x2 block are:

A B

C D

And 4x multisampling is in use, then the data is encoded like this:

A0 B0 A1 B1
        *
C0 D0 C1 D1

A2 B2 A3 B3

C2 D2 C3 D3

Sandy Bridge has dedicated hardware to assist in doing multisample resolves: if, for example, the "sample" message is used to sample from the position marked with a "*" in the image above (at the corner shared by samples A1, B1, C1, and D1), then instead of the sampler doing its normal linear blending (which would cause it to average together samples A1, B1, C1, and D1), some special bit twiddling logic kicks in which causes it to instead average together samples B0, B1, B2, and B3, which is exactly what is needed for a multisample resolve.  This dedicated hardware doesn't exist in Ivy Bridge (presumably because this clever trick wouldn't work with Ivy Bridge's UMS and CMS formats).  Blorp has special case logic to use this dedicated hardware to do multisample resolves on Gen6 (grep for single_to_blend()).  On Gen7 it does the averaging manually.

I suspect what's going on is that the special bit twiddling logic isn't kicking in properly when the surface format is I32_FLOAT or L32_FLOAT, so the samples that are being averaged are A1, B1, C1, and D1, and that's causing the blocky artifacts.

Why the special bit twiddling logic isn't kicking in is anyone's guess.  It could be a hardware bug, or it could be due to some restriction that we've never noticed in the bspec (though I spent some time digging this morning and didn't find anything).  In any case, there's an easy solution: just do the blit as R32_FLOAT -> R32_FLOAT.

I'll send a patch out to mesa-dev as soon as I've run it through a full piglit run on Sandy Bridge.
Comment 7 Paul Berry 2013-11-27 16:44:33 UTC
Patch sent to mesa-dev for review: http://lists.freedesktop.org/archives/mesa-dev/2013-November/049247.html
Comment 8 lu hua 2013-11-29 05:54:08 UTC
(In reply to comment #7)
> Patch sent to mesa-dev for review:
> http://lists.freedesktop.org/archives/mesa-dev/2013-November/049247.html

Fixed by this patch.
Comment 9 Paul Berry 2013-11-30 06:17:49 UTC
Fixed by commit c4cf487315f1f5375534f1677177983fa496d577
Author: Paul Berry <stereotype441@gmail.com>
Date:   Wed Nov 27 07:43:03 2013 -0800

    i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats
    
    On gen6, multisamble resolve blits use the SAMPLE message to blend
    together the 4 samples for each texel.  For some reason, SAMPLE
    doesn't blend together the proper samples when the source format is
    L32_FLOAT or I32_FLOAT, resulting in blocky artifacts.
    
    To work around this problem, sample from the source surface using
    R32_FLOAT.  This shouldn't affect rendering correctness, because when
    doing these resolve blits, the destination format is R32_FLOAT, so the
    channel replication done by L32_FLOAT and I32_FLOAT is unnecessary.
    
    Fixes piglit tests on Sandy Bridge:
    - spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float
    - spec/ARB_texture_float/multisample-formats 4 GL_ARB_texture_float
    
    No piglit regressions on Sandy Bridge.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70601
    
    Cc: Kenneth Graunke <kenneth@whitecape.org>
    Cc: mesa-stable@lists.freedesktop.org
    
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Comment 10 lu hua 2013-12-02 07:13:52 UTC
Verified.Fixed.