Bug 91970

Summary: [BSW regression] dEQP-GLES3.functional.shaders.precision.int.highp_mul_vertex
Product: Mesa Reporter: Mike Mason <michael.w.mason>
Component: Drivers/DRI/i965Assignee: Kristian Høgsberg <krh>
Status: RESOLVED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium CC: gavin.hindman, james.ausmus, mark.a.janes
Version: 10.6   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: patch
Fix

Description Mike Mason 2015-09-11 01:07:56 UTC
Found this regression while testing 10.6.6 on a bsw Chromebook (cyan) with additional Chrome OS patches. The same test passes on 10.6.6 on snd, byt, hsw, bdw, ivb and skl Chrome OS devices. I reran the test 5 times on bsw and it consistently failed.

Failure message:

"Number of accurate bits assumed = 32",
"iter 0, test 0: in0 = -1056263790, in1 = -1493720033, ref out = 971704494 / 0x39eb08ae",
"Comparison failed (at 0, 0): got 1119553710 / 0x42bb08ae"

When the test passes, the output has the 'accurate bits' line, followed by 128 lines similar to this:

"iter 0, test 0: in0 = -1056263790, in1 = -1493720033, ref out = 971704494 / 0x39eb08ae",
"iter 0, test 1: in0 = 1938634019, in1 = -1133464026, ref out = 2114766130 / 0x7e0cc532",
"iter 0, test 2: in0 = 677512731, in1 = 867076180, ref out = -624797476 / 0xdac258dc",
Comment 1 Kaveh 2015-09-14 19:29:30 UTC
Mark Janes is bisecting.
Comment 2 Mark Janes 2015-09-16 14:52:57 UTC
I bisected to this to 

1e4e17fbd9296cc5064aabdb351a894d10190cb6
i965/fs: Lower integer multiplication after optimizations.

I noticed that the test did not fail reliably, so I will repeat the bisect.

Matt, does the commit seems like a plausible source of this failure?
Comment 3 Matt Turner 2015-09-16 15:28:14 UTC
Yes, that seems very plausible.
Comment 4 Matt Turner 2015-09-16 15:46:20 UTC
Created attachment 118317 [details] [review]
patch

The relevant instructions generated are

mul(8)          g124<1>D        g6<8,8,1>D      g10<8,8,1>UW    { align1 1Q };
mul(8)          g7<1>D          g6<8,8,1>D      g10<8,8,1>UW    { align1 1Q };
...
add(8)          g124.1<2>UW     g124.1<16,8,2>UW g7<16,8,2>UW   { align1 1Q };

I could imagine that the D * UW multiplies don't work as expected (maybe we should do D * W for signed multiplication?). Attached is a patch that gives that a try.
Comment 5 Mark Janes 2015-09-16 20:16:16 UTC
My second bisect confirmed the commit that regressed this test.  I am testing Matt's patch.

BTW, that patch doesn't apply cleanly to 10.6.
Comment 6 Mark Janes 2015-09-16 21:18:52 UTC
The patch provided breaks a bunch of tests, and doesn't fix the one listed in the bug.
Comment 7 Kristian Høgsberg 2015-09-23 22:30:08 UTC
(In reply to Matt Turner from comment #4)
> Created attachment 118317 [details] [review] [review]
> patch
> 
> The relevant instructions generated are
> 
> mul(8)          g124<1>D        g6<8,8,1>D      g10<8,8,1>UW    { align1 1Q
> };
> mul(8)          g7<1>D          g6<8,8,1>D      g10<8,8,1>UW    { align1 1Q

We're missing a .1 subreg offset on one of these.

> };
> ...
> add(8)          g124.1<2>UW     g124.1<16,8,2>UW g7<16,8,2>UW   { align1 1Q
> };
> 
> I could imagine that the D * UW multiplies don't work as expected (maybe we
> should do D * W for signed multiplication?). Attached is a patch that gives
> that a try.
Comment 8 Kristian Høgsberg 2015-09-23 22:31:16 UTC
If you enable the imul lowering on BDW, it fails there too.
Comment 9 Kristian Høgsberg 2015-09-23 23:26:26 UTC
g10 is from from ATTR file and we dont respect the subreg offset and stride for those.
Comment 10 Kristian Høgsberg 2015-09-23 23:56:15 UTC
Created attachment 118418 [details] [review]
Fix

We need to carry the ATTR stride and subreg information over to the hw reg when we assign hw regs to attributes.
Comment 11 Kristian Høgsberg 2015-09-24 18:44:36 UTC
Pushed to master:

commit 2ea16966ae66d4dd5c5dcb996d7996d9c734bbee
Author: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Date:   Wed Sep 23 16:57:47 2015 -0700

    i965: Respect stride and subreg_offset for ATTR registers
    
    When we assign hw regs to attributes, we don't incorporate the stride
    and subreg_offset from the fs_reg. It's rarely used, but the integer
    multiplication lowering uses unusual stride and subreg_offset
    combination breaks when one source is an attribute.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
    Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
    Reviewed-by: Matt Turner <mattst88@gmail.com>

 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct.