Bug 87887

Summary: [i965 Bisected] ES2-CTS.gtf.GL.cos.cos_float_vert_xvary fails
Product: Mesa
Reporter: lu hua <huax.lu>
Component: Drivers/DRI/i965
Assignee: Matt Turner <mattst88>
Status: VERIFIED FIXED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: high
CC: idr, kondapallykalyancontribute, michael.w.mason
Version: unspecified
Hardware: All
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments: patch

Description lu hua 2014-12-31 02:20:09 UTC
System Environment:
--------------------------
Platform: IVB
Libdrm:             (master)libdrm-2.4.58-19-gf99522e678dbbaffeca9462a8edcbe900574dc12
Mesa:               (master)64dcb2bb0a64258b895222fcf89da35bb8aa4338
Xserver:            (master)xorg-server-1.16.99.901-102-g826e7c2b36f192fbbe7ddff37eb559f4d6301146
Xf86_video_intel:   (master)2.99.917-1-g0d42b0ed25d4112e0b3e3218e5c42947bbeb9e27
Libva:              (master)e97ac9e78cd475a13e722c455e34d5d39d0f059d
Libva_intel_driver: (master)43bd81abdde40b50ac71f6f44eb04e4eaf5af5f6
Kernel:             (drm-intel-nightly)74e2b20ddb3898757ecc64534d6f3c3141ef5a31

Bug detailed description:
---------------------------
The test fails on i965 with the Mesa master branch; it passes on the 10.4 branch.
The following cases also fail with the same bisected commit:
ES2-CTS.gtf.GL.mix.mix_float_vert_xvary_yconsthalf_aconsthalf
ES2-CTS.gtf.GL.mix.mix_vec2_vert_xvary_yconsthalf_aconsthalf
ES2-CTS.gtf.GL.sin.sin_float_vert_xvary
Bisect shows: 44573458bdc52acc304fb75d6df502312b8e149c is the first bad commit
commit 44573458bdc52acc304fb75d6df502312b8e149c
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Sat Dec 20 11:50:31 2014 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Mon Dec 29 10:08:18 2014 -0800

    i965/vec4: Add pass to gather constants into a vector-float MOV.

    Currently only handles consecutive instructions with the same
    destination that collectively write all channels.

    total instructions in shared programs: 5879798 -> 5869011 (-0.18%)
    instructions in affected programs:     465236 -> 454449 (-2.32%)

    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

output:
dEQP Core GL-CTS-2.0 (0x0052484b) starting..
  target implementation = 'X11'

Test case 'ES2-CTS.gtf.GL.cos.cos_float_vert_xvary'..
#+ GTF/GL/cos/cos_float_vert_xvary.shader1.ppm and GTF/GL/cos/cos_float_vert_xvary.shader2.ppm are different
  Fail (Fail)

DONE!

Test run totals:
  Passed:        0/1 (0.00%)
  Failed:        1/1 (100.00%)
  Not supported: 0/1 (0.00%)
  Warnings:      0/1 (0.00%)

Reproduce steps:
-------------------------
1. xinit
2. ./glcts --deqp-case=ES2-CTS.gtf.GL.cos.cos_float_vert_xvary
Comment 1 lu hua 2014-12-31 03:03:52 UTC
The following webglc cases also fail with the same bisected commit:
conformance/glsl/functions/glsl-function-ceil.html
conformance/glsl/functions/glsl-function-clamp-float.html
conformance/glsl/functions/glsl-function-clamp-gentype.html
conformance/glsl/functions/glsl-function-cos.html
conformance/glsl/functions/glsl-function-floor.html
conformance/glsl/functions/glsl-function-normalize.html
conformance/glsl/functions/glsl-function-sign.html
conformance/glsl/functions/glsl-function-sin.html
conformance/glsl/functions/glsl-function-step-float.html
conformance/glsl/functions/glsl-function-step-gentype.html
Comment 2 Kenneth Graunke 2015-01-01 01:05:28 UTC
It looks like opt_vector_float is just deleting writes to m4:

-mov m4.yz:F, 0.000000F
-mov m4.w:F, 1.000000F
-mov vgrf4.0.x:F, 0.500000F
+mov vgrf4.0:F, [0.5F, 0F, 0F, 1F]
 mov vgrf5.0.x:F, 0.500000F
 mul vgrf7.0.x:F, attr17.xxxx:F, 6.283185F
 cos vgrf6.0.x:F, vgrf7.xxxx:F

I haven't looked at the code to determine why (I figure I'll let Matt fix it).
I do have patches on the list to make INTEL_DEBUG=optimizer work for debugging this.
Comment 3 Matt Turner 2015-01-01 01:36:31 UTC
(In reply to Kenneth Graunke from comment #2)
> It looks like opt_vector_float is just deleting writes to m4:
> 
> -mov m4.yz:F, 0.000000F
> -mov m4.w:F, 1.000000F
> -mov vgrf4.0.x:F, 0.500000F
> +mov vgrf4.0:F, [0.5F, 0F, 0F, 1F]
>  mov vgrf5.0.x:F, 0.500000F
>  mul vgrf7.0.x:F, attr17.xxxx:F, 6.283185F
>  cos vgrf6.0.x:F, vgrf7.xxxx:F
> 
> I haven't looked at the code to determine why (I figure I'll let Matt fix
> it).
> I do have patches on the list to make INTEL_DEBUG=optimizer work for
> debugging this.

Oh, that's awesome. It's seeing writes to all 4 channels of register 4. Register 4 just happens to be in different register files!
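
To make the collision concrete, here is a toy Python model of the pass. It is only a sketch and not the Mesa code: `Mov`, `gather_constants`, and the `check_file` flag are hypothetical names, and it ignores whether each immediate is actually representable in a vector-float. With `check_file=False` it compares register numbers only, which reproduces the rewrite in comment #2; with `check_file=True` it also compares register files, and the three MOVs are left alone.

```python
from dataclasses import dataclass

CHANNELS = "xyzw"

@dataclass(frozen=True)
class Mov:
    file: str    # register file: "m" (message register) or "vgrf"
    reg: int     # register number within that file
    mask: str    # channels written, a subset of "xyzw"
    imm: object  # a float, or a 4-tuple for a gathered vector MOV

def gather_constants(insts, check_file):
    """Fold a run of consecutive immediate MOVs that together write all
    four channels of one destination into a single vector MOV."""
    out, run = [], []
    for inst in insts:
        same_dest = bool(run) and run[-1].reg == inst.reg and (
            not check_file or run[-1].file == inst.file)
        if not same_dest:
            out.extend(run)  # flush an incomplete run unchanged
            run = []
        run.append(inst)
        written = {c for m in run for c in m.mask}
        if written == set(CHANNELS):
            values = {c: m.imm for m in run for c in m.mask}
            # The gathered MOV takes the destination of the last instruction.
            out.append(Mov(inst.file, inst.reg, CHANNELS,
                           tuple(values[c] for c in CHANNELS)))
            run = []
    out.extend(run)
    return out

# The three instructions from the dump in comment #2.
shader = [
    Mov("m", 4, "yz", 0.0),
    Mov("m", 4, "w", 1.0),
    Mov("vgrf", 4, "x", 0.5),
]

buggy = gather_constants(shader, check_file=False)  # m4 writes disappear
fixed = gather_constants(shader, check_file=True)   # nothing is gathered
```

In the buggy run the only surviving instruction is a single MOV to vgrf4 carrying (0.5, 0.0, 0.0, 1.0), so the writes to m4 are lost.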
Comment 4 lu hua 2015-01-07 01:40:45 UTC
*** Bug 88115 has been marked as a duplicate of this bug. ***
Comment 5 Matt Turner 2015-01-12 18:48:59 UTC
Created attachment 112136
patch

Patch sent to the mailing list. Please test.
Comment 6 lu hua 2015-01-13 05:47:28 UTC
(In reply to Matt Turner from comment #5)
> Created attachment 112136
> patch
> 
> Patch sent to the mailing list. Please test.

Tested this patch on IVB/ILK; it works well.
Comment 7 Matt Turner 2015-01-15 18:11:58 UTC
Fixed with

commit 41d9f232b6a7f53086b9c428cca30e45905abd48
Author: Matt Turner <mattst88@gmail.com>
Date:   Mon Jan 12 10:48:04 2015 -0800

    i965/vec4: Make sure that imm writes are to registers in the same file.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887
Comment 8 Matt Turner 2015-01-15 18:12:43 UTC
*** Bug 88247 has been marked as a duplicate of this bug. ***
Comment 9 lu hua 2015-01-19 07:29:34 UTC
Verified. Fixed.
