Bug 92595

Summary: [HSW,BDW,SKL][GLES 3.1 CTS] Big difference in the results for the ES31-CTS.shader_bitfield_operation.* tests
Product: Mesa Reporter: Marta Löfstedt <marta.lofstedt>
Component: Drivers/DRI/i965Assignee: Matt Turner <mattst88>
Status: RESOLVED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium CC: currojerez, idr, marta.lofstedt
Version: 11.0   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 92778    
Attachments: bitfield failing
hack fix for BDW
Hack that fix the difference in results between HSW and BDW/SKL

Description Marta Löfstedt 2015-10-22 12:43:22 UTC
Created attachment 119066 [details]
bitfield failing

Software versions:
    4.2.0-1-generic
    OpenGL version string: 3.0 Mesa 11.1.0-devel (git-13a5805)

GPU hardware:
    OpenGL renderer string: Mesa DRI Intel(R) Haswell Desktop 
    00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [8086:0412] (rev 06)

CPU hardware:
    x86_64
    Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
---------------
Software versions:
    4.3.0-rc3+
    OpenGL version string: 3.0 Mesa 11.1.0-devel (git-13a5805)

GPU hardware:
    OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 5300 (Broadwell GT2)
    00:02.0 VGA compatible controller [0300]: Intel Corporation Broadwell-U Integrated Graphics [8086:161e] (rev 06)

CPU hardware:
    x86_64
    Genuine Intel(R) CPU 0000 @ 0.60GHz
----------------
Software versions:
    4.3.0-rc5-mainline+
    OpenGL version string: 3.0 Mesa 11.1.0-devel (git-13a5805)

GPU hardware:
    OpenGL renderer string: Mesa DRI Intel(R) Skylake ULX GT2
    00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:191e] (rev 07)

CPU hardware:
    x86_64
    Intel(R) Core(TM) m7-6Y75 CPU @ 1.20GHz
-----------------
CTS version:
git@67ae88f31295

command: 
./glcts --deqp-case=ES31-CTS.shader_bitfield_operation*

Environment:
Mesa built with: --enable-debug
export MESA_GLES_VERSION_OVERRIDE=3.1
export MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader
------------------
The shader bitfield test of the OpenGL ES 3.1 CTS, run builtin GLSL functions on various types set up in std430 layout. For example the: 
ES31-CTS.shader_bitfield_operation.frexp.vec4_0, the will run frexp(vec4, vec4), then dispatch compute, mapbufferrange out the result and compare with an expected result.
Unfortunately there is a big gap between the numbers of passing test for different HW.I attach a list of failing test for HSW, BDW and SKL.
Comment 1 Marta Löfstedt 2015-10-23 11:28:27 UTC
With Mesa: git@3994ef the number of failing tests on SKL-Y is not consistent. Over 3 runs I failed 73, 68, 54 tests.
Comment 2 Marta Löfstedt 2015-12-10 11:41:55 UTC
The number of passing test on SKL are stable at 307 with ubuntu 15.10 kernel 4.2.0, with kernel 4.4.rc4 the number of passing tests are different on each run.
Comment 3 Marta Löfstedt 2015-12-16 17:16:11 UTC
Created attachment 120550 [details]
hack fix for BDW
Comment 4 Marta Löfstedt 2015-12-16 17:16:57 UTC
Attach a patch that fixes BDW difference to HSW, but does not help with SKL.
Comment 5 Marta Löfstedt 2015-12-17 10:58:05 UTC
Forget about the patch I sent up. I can't reproduce my results any more.
Comment 6 Marta Löfstedt 2015-12-17 12:04:54 UTC
Created attachment 120561 [details]
Hack that fix the difference in results between HSW and BDW/SKL

Sorry, I attached the wrong version of the patch and did the same mistake to myself when testing on SKL. It is when I hack BRW_REGISTER_TYPE_UW for both signed and unsigned that I pass the imulExtended shader_bitfield tests on BDW and SKL(kernel 4.2.0).
Comment 7 Ian Romanick 2015-12-18 15:21:40 UTC
(In reply to Marta Löfstedt from comment #6)
> Created attachment 120561 [details]
> Hack that fix the difference in results between HSW and BDW/SKL
> 
> Sorry, I attached the wrong version of the patch and did the same mistake to
> myself when testing on SKL. It is when I hack BRW_REGISTER_TYPE_UW for both
> signed and unsigned that I pass the imulExtended shader_bitfield tests on
> BDW and SKL(kernel 4.2.0).

This code was originally added by 3b48a0ee (authored by Curro, reviewed by Matt).

For Gen8, it's not obvious to me why there is any modification of the source types of the MUL.  It doesn't help that the BDW public PRM has *no* documentation for the MUL or MACH instructions.

I'm also a little disappointed that our piglit tests didn't catch this.  I think we need a new test for [ui]mulExtended that tries more values.  I think making a big GL_RGBA32I texture that puts the sources in R and G and the expected results (calculated by C code) in B and A should be easy enough.
Comment 8 Matt Turner 2015-12-18 18:18:02 UTC
(In reply to Marta Löfstedt from comment #6)
> Created attachment 120561 [details]
> Hack that fix the difference in results between HSW and BDW/SKL
> 
> Sorry, I attached the wrong version of the patch and did the same mistake to
> myself when testing on SKL. It is when I hack BRW_REGISTER_TYPE_UW for both
> signed and unsigned that I pass the imulExtended shader_bitfield tests on
> BDW and SKL(kernel 4.2.0).

Wow, nice. I suspect that is in fact the correct fix -- the MUL intends to multiply the double word of src[0] with the low word of src[1], not the low word of src[1] with a reinterpreted sign bit.

Assuming that patch has no regressions, it'll get my Reviewed-by. Thanks Marta!
Comment 9 Marta Löfstedt 2015-12-29 15:18:48 UTC
I sent up my patch:
http://patchwork.freedesktop.org/patch/69228/

But this is only a prtial fix for this bug. We will also need to fix:

* the remaining 3 failing tests
* the odd behaviour on SKL on kernels later than 4.2.0, where a random number of the shader bitfield tests pass.
Comment 10 Marta Löfstedt 2015-12-30 08:36:32 UTC
My patch is pushed.

I agree with Ian that we should look into piglit tests for this issue. However, maybe we should solve the remaining issues first...
Comment 11 Marta Löfstedt 2015-12-30 14:40:58 UTC
For the two remaining bitfieldInsert failing tests:

ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2
ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3

I have noticed that offset is 0 and bits 32. 

None of the other passing bitfieldInsert.uint_N or bitfieldInsert.uvec4_N test has offset 0 or 32 bits. Nor does any of passing int_N or ivec4_N tests.
Comment 12 Matt Turner 2015-12-30 18:22:05 UTC
(In reply to Marta Löfstedt from comment #11)
> For the two remaining bitfieldInsert failing tests:
> 
> ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2
> ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3
> 
> I have noticed that offset is 0 and bits 32. 
> 
> None of the other passing bitfieldInsert.uint_N or bitfieldInsert.uvec4_N
> test has offset 0 or 32 bits. Nor does any of passing int_N or ivec4_N tests.

I think I see the problem. I'll work on a patch.
Comment 13 Matt Turner 2015-12-30 20:28:47 UTC
Patches sent:

[PATCH 1/2] glsl: Fix undefined shifts.
[PATCH 2/2] glsl: Handle bits=32 case in bitfieldInsert/bitfieldExtract.

All --deqp-case=ES31-CTS.shader_bitfield_operation* tests now pass.
Comment 14 Marta Löfstedt 2016-01-04 10:02:11 UTC
Thanks Matt!

I confirm that with Matts patches:
http://patchwork.freedesktop.org/patch/69262/
http://patchwork.freedesktop.org/patch/69263/

All shader bitfield test pass na HSW,BDW and SKL(4.2.0.22 Ubuntu kernel).

So, what remains for this BUG is to solve the SKL bad behaviour on later kernels.
Comment 15 Marta Löfstedt 2016-01-07 10:36:28 UTC
I can't reproduce the SKL "synchronization" issues with 
kernel 4.4.0.rc8+, drm-intel-nightly, 2016-01-07, git@b54e0f487f15213.

I.e. I believe this BUG could be closed once Matts patches are merged.
Comment 16 Matt Turner 2016-01-07 19:04:10 UTC
Excellent. Thank you, Marta. The second patch needs some more work on my side, but I will take care of it.
Comment 17 Matt Turner 2016-01-11 22:48:05 UTC
I've sent a new 9 patch series that fixes this.
Comment 18 Marta Löfstedt 2016-01-13 12:00:20 UTC
I confirm that the 3 remaining remaining tests also pass without CTS regressions on HSW,BDW,SKL with Matts new patches:

http://patchwork.freedesktop.org/series/2364/
Comment 19 Matt Turner 2016-01-13 19:28:50 UTC
First seven patches are committed. Two new patches on the list to replace the last two of the previous series.
Comment 20 Marta Löfstedt 2016-01-14 10:55:17 UTC
Thats great Matt,

I confirm that with mesa git@f4ab7340c, and
http://patchwork.freedesktop.org/patch/70428
http://patchwork.freedesktop.org/patch/70429

All 3 remaining tests pass.
Comment 21 Matt Turner 2016-01-14 17:31:25 UTC
Thanks Marta. I've added your Tested-by and pushed the two patches.

commit 15640ee77ae601cba33cbbc72256e55e03a363e5
Author: Matt Turner <mattst88@gmail.com>
Date:   Wed Jan 13 11:08:37 2016 -0800

    nir: Handle <bits>=32 case in bitfield_insert lowering.

commit b82e26a6a4d6baf121f44c61c862bfa79ba0d172
Author: Matt Turner <mattst88@gmail.com>
Date:   Wed Jan 13 11:09:11 2016 -0800

    nir: Lower bitfield_extract.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.