Bug 99817

Summary:	[softpipe] piglit glsl-fs-tan-1 regression
Product:	Mesa	Reporter:	Vinson Lee <vlee>
Component:	Mesa core	Assignee:	mesa-dev
Status:	RESOLVED FIXED	QA Contact:	mesa-dev
Severity:	normal
Priority:	medium	CC:	currojerez, idr
Version:	git	Keywords:	bisected, regression
Hardware:	x86-64 (AMD64)
OS:	Linux (All)
Whiteboard:
i915 platform:		i915 features:
Attachments:	0001-st-glsl_to_tgsi-Translate-ir_unop_neg-into-MOV-with-.patch

Description Vinson Lee 2017-02-14 22:43:51 UTC

mesa: 924a8cbb408e7dd110d172093908f99926b7970d (master 17.1.0-devel)

$ ./bin/shader_runner tests/shaders/glsl-fs-atan-1.shader_test -auto
Probe color at (0,0)
  Expected: 0.125000 0.125000 0.125000 0.125000
  Observed: 0.211765 0.211765 0.211765 0.211765
Test failure on line 27
Probe color at (125,0)
  Expected: 0.250000 0.250000 0.250000 0.250000
  Observed: 0.043137 0.043137 0.043137 0.043137
Test failure on line 28
Probe color at (249,0)
  Expected: 0.375000 0.375000 0.375000 0.375000
  Observed: 0.000000 0.000000 0.000000 0.000000
Test failure on line 29
PIGLIT: {"result": "fail" }

Comment 1 Francisco Jerez 2017-02-15 00:35:11 UTC

This seems to be caused by the atan() built-in implementation now using ir_triop_csel instead of control flow, which is translated incorrectly by the mesa state tracker on drivers with native integer support when one of the arguments is floating point and has a source modifier enabled, because glsl_to_tgsi emits an integer UCMP instruction in that case, which has different (integer) modifier semantics.

Comment 2 Francisco Jerez 2017-02-15 00:37:03 UTC

Created attachment 129616 [details] [review]
0001-st-glsl_to_tgsi-Translate-ir_unop_neg-into-MOV-with-.patch

Here's a (largely untested) potential fix.

Comment 3 Vinson Lee 2017-02-15 23:18:37 UTC

e9ffd12827ac11a2d2002a42fa8eb1df847153ba is the first bad commit
commit e9ffd12827ac11a2d2002a42fa8eb1df847153ba
Author: Francisco Jerez <currojerez@riseup.net>
Date:   Sat Jan 21 13:41:08 2017 -0800

    glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
    
    This addresses several issues of the current atan2 implementation:
    
     - Negative zero (and negative denorms which end up getting flushed to
       zero) isn't handled correctly by the current implementation.  The
       reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
       on which side of the branch cut the argument is, which causes us to
       return incorrect results (off by up to 2π) for very small negative
       values.
    
     - There is a serious precision problem for x values of large enough
       magnitude introduced by the floating point division operation being
       implemented as a mul+rcp sequence.  This can lead to the quotient
       getting flushed to zero in some cases introducing an error of over
       8e6 ULP in the result -- Or in the most catastrophic case will
       cause us to return NaN instead of the correct value ±π/2 for y=±∞
       and x very large.  We can fix this easily by scaling down both
       arguments when the absolute value of the denominator goes above
       certain threshold.  The error of this atan2 implementation remains
       below 25 ULP in most of its domain except for a neighborhood of y=0
       where it reaches a maximum error of about 180 ULP.
    
     - It emits a bunch of instructions including no less than three
       if-else branches per scalar component that don't seem to get
       optimized out later on.  This implementation uses about 13% less
       instructions on Intel SKL hardware and doesn't emit any control
       flow instructions.
    
    v2: Fix up argument scaling to take into account the range and
        precision of exotic FP24 hardware.  Flip coordinate system for
        arguments along the vertical line as if they were on the left
        half-plane in order to avoid division by zero which may give
        unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
        some more comments.
    
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

:040000 040000 acb11a161d3a4b78c246efd2d3720e8d66c8772a 89658b9bde1aa03b542528f4cae464516e8db300 M	src
bisect run success

Comment 4 Vinson Lee 2017-02-15 23:46:51 UTC

attachment 129616 [details] [review] fixes the regression.

Tested-by: Vinson Lee <vlee@freedesktop.org>

Comment 5 Vinson Lee 2017-03-06 02:56:14 UTC

These piglit tests have also regressed.

glsl-fs-atan-2
glsl-fs-atan-3
fs-atan-float-float 
fs-atan-vec2-vec2
fs-atan-vec3-vec3
fs-atan-vec4-vec4
vs-atan-float-float
vs-atan-vec2-vec2
vs-atan-vec3-vec3
vs-atan-vec4-vec4
gs-atan-float-float
gs-atan-vec2-vec2
gs-atan-vec3-vec3
gs-atan-vec4-vec4

Comment 6 Vinson Lee 2017-03-23 22:13:29 UTC

commit e6469ec43b25898e99766a30aa8f54cc64c3bc04
Author: Francisco Jerez <currojerez@riseup.net>
Date:   Mon Mar 13 17:31:39 2017 -0700

    gallium/tgsi: Treat UCMP sources as floats to match the GLSL-to-TGSI pass expectations.
    
    Currently the GLSL-to-TGSI translation pass assumes it can use
    floating point source modifiers on the UCMP instruction.  See the bug
    report linked below for an example where an unrelated change in the
    GLSL built-in lowering code for atan2 (e9ffd12827ac11a2d2002a42fa8eb1)
    caused the generation of floating-point ir_unop_neg instructions
    followed by ir_triop_csel, which is translated into UCMP with a negate
    modifier on back-ends with native integer support.
    
    Allowing floating-point source modifiers on an integer instruction
    seems like rather dubious design for a transport IR, since the same
    semantics could be represented as a sequence of MOV+UCMP instructions
    instead, but supposedly this matches the expectations of TGSI
    back-ends other than tgsi_exec, and the expectations of the DX10 API.
    I take no responsibility for future headaches caused by this
    inconsistency.
    
    Fixes a regression of piglit glsl-fs-tan-1 on softpipe introduced by
    the above-mentioned glsl front-end commit.  Even though the commit
    that triggered the regression doesn't seem to have made it to any
    stable branches yet, this might be worth back-porting since I don't
    see any reason why the bug couldn't have been reproduced before that
    point.
    
    Suggested-by: Roland Scheidegger <sroland@vmware.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99817
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.