106393 – glsl-fs-shader-stencil-export hangs forever

Bug 106393 - glsl-fs-shader-stencil-export hangs forever

Summary: glsl-fs-shader-stencil-export hangs forever

Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/DRI/i965 (show other bugs)
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Iago Toral
QA Contact:	Intel 3D Bugs Mailing List

URL:
Whiteboard:
Keywords:	bisected, regression

Depends on:
Blocks:

Reported:	2018-05-03 21:41 UTC by Mark Janes
Modified:	2018-05-05 10:53 UTC (History)
CC List:	0 users

See Also:
i915 platform:
i915 features:

Attachments

Description Mark Janes 2018-05-03 21:41:03 UTC

This test, and fbo-depth-array, intermittently spin forever.

Bisected to:
96b51537908cd2aace85f54b437eeb72e6346b7e
Author:     Iago Toral Quiroga <itoral@igalia.com>

i965/compiler: handle conversion to smaller type in the lowering pass for that

The lowering pass was specialized to act on 64-bit to 32-bit conversions only,
but the implementation is valid for other cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Comment 1 Mark Janes 2018-05-03 22:34:13 UTC

I reverted the bisected patch, and the subsequent patch enabling shaderInt16.  Mesa CI was inoperable with the spinning piglit tests.

Comment 2 Iago Toral 2018-05-04 08:15:14 UTC

(In reply to Mark Janes from comment #1)
> I reverted the bisected patch, and the subsequent patch enabling
> shaderInt16.  Mesa CI was inoperable with the spinning piglit tests.

We had never seen this reported by Jenkins but I can confirm that we can reproduce the problem locally, was this test temporarily disabled or something? or could it be silently skipped due to the long run time? We sent the series to Jenkins prior to landing it in master.

Anyway, we have looked into it and it seems the problem is a MOV with a Byte type:

mov(8)          g127<1>UB       g2<32,8,4>UB

These need special handling in the compiler, Chema had already found about this while adding support for 8-bit integers, we just weren't aware that the driver already had a case where Byte types were already being used... I'll talk to Chema about this but we probably already have the fix available.

Comment 3 Iago Toral 2018-05-04 11:26:56 UTC

A series that fixes this is available for review here:
https://lists.freedesktop.org/archives/mesa-dev/2018-May/194049.html

Comment 4 Mark Janes 2018-05-04 17:00:24 UTC

Unfortunately, this behavior reproduced intermittently.  By chance, your test runs did not trigger it.  When it was pushed to master, it occurred on a handful of runs over the course of a day.  Unfortunately, the result was that machines were taken off line (or *fortunately*, because the issue was easy to spot).

It was easy to reproduce when I was bisecting, so I'm not sure why it passed at all in CI.

At any rate, the Igalia devs did everything right with their testing of this series.

Comment 5 Iago Toral 2018-05-05 10:53:14 UTC

Fixed in master with:

commit 5a12bdac09496e00b746421f4c10d77f9b7a8e66
Author: Iago Toral Quiroga <itoral@igalia.com>
Date:   Fri May 4 11:33:07 2018 +0200

    i965/compiler: handle conversion to smaller type in the lowering pass for that
    
    This rollbacks the revert of this same patch introduced in
    commit 7b9c15628aae8729118b648f5f473e6ac926b99b.
    
    And also squahes the following patch to prevent a piglit regression caused
    by this change:
    
    intel/compiler: Fix lower_conversions for 8-bit types.
    Author: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
    
    For 8-bit types the execution type is word. A byte raw MOV has 16-bit
    execution type and 8-bit destination and it shouldn't be considered
    a conversion case. So there is no need to change alignment and enter
    in lower_conversions for these instructions.
    
    Fixes a regresion in the piglit test "glsl-fs-shader-stencil-export"
    that is introduced with this patch from the Vulkan shaderInt16 series:
    'i965/compiler: handle conversion to smaller type in the lowering
    pass for that'. The problem is caused because there is already a case
    in the driver that injects Byte instructions like this:
    
    mov(8)          g127<1>UB       g2<32,8,4>UB
    
    And the aforementioned pass was not accounting for the special
    handling of the execution size of Byte instructions. This patch
    fixes this.
    
    v2: (Jason Ekstrand)
       - Simplify is_byte_raw_mov, include reference to PRM and not
       consider B <-> UB conversions as raw movs.
    
    v3: (Matt Turner)
       - Indentation style fixes.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
    Tested-by: Mark Janes <mark.a.janes@intel.com>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.