Bug 102855 - [OpenGL CTS] Multiple crashes in enhanced layout tests due to incorrect vue setup
Summary: [OpenGL CTS] Multiple crashes in enhanced layout tests due to incorrect vue s...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 102590
  Show dependency treegraph
 
Reported: 2017-09-19 09:32 UTC by Iago Toral
Modified: 2017-10-09 07:47 UTC (History)
6 users (show)

See Also:
i915 platform:
i915 features:


Attachments

Description Iago Toral 2017-09-19 09:32:09 UTC
The following tests are affected:

KHR-GL45.enhanced_layouts.varying_array_components
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_components
KHR-GL45.enhanced_layouts.varying_locations

The crash is a consequence of this commit:

commit 99df02ca26f6127c8fa24d38a8a069ac6159356a
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Wed Sep 9 16:21:56 2015 -0700

    i965: Don't re-layout varyings for separate shader programs.

That commit changed the VUE setup so we use fixed layouts for SSO. For that, slots in the VUE are not assigned consecutively, but instead are assigned based on their location:

slot = first_generic_slot + varying - VARYING_SLOT_VAR0;

Where first_generic_slot is the first slot available for user varyings after assigning builtins.

The crash happens because with this method it is possible to assign slots outside the valid range (0..31). For example, shaders can assign location=31, which is valid, but since that will be offset by the number of builtin slots, we end up with a slot >= 32, and we trigger in assertion later on.

The crashes go away if we assign locations consecutively of course, but the tests still fail, so there are probably more issues to fix besides this.

Specifically for the crashes, I think we only have two options:

1. Revert the commit.

2. Reduce the number of varyings exposed by the driver to account for the maximum number of builtin slots we can have so we ensure we never run into this problem.

I think that 2) would be the worst option. Ken, what are your thoughts?
Comment 1 Kenneth Graunke 2017-09-19 16:34:43 UTC
The problem seems to be that FS inputs / SF outputs go beyond 32, which is the limit imposed by the SF attribute index.

I think we can use URB Entry Read Offset to work around this...currently we hardcode it to 0 or 1...but we could probably skip over a few more.  In particular, we have clip distance in the VUE slots, but the FS shouldn't care, which gives us 2 more slots (1 offset)...

We'd still run into problems if someone tries to use an explicit max location and read gl_ViewportIndex / gl_LayerID in the FS, but...that's at least much less common of a bug.
Comment 2 Iago Toral 2017-09-20 06:38:50 UTC
(In reply to Kenneth Graunke from comment #1)
> The problem seems to be that FS inputs / SF outputs go beyond 32, which is
> the limit imposed by the SF attribute index.

I see, for some reason I always thought that was a limitation for the entire pipeline and not just the FS inputs... good to know.
 
> I think we can use URB Entry Read Offset to work around this...currently we
> hardcode it to 0 or 1...but we could probably skip over a few more.  In
> particular, we have clip distance in the VUE slots, but the FS shouldn't
> care, which gives us 2 more slots (1 offset)...

Yes, that's a good point... it may just be enough to make the tests pass although it is still not a proper fix. I'll give it a try and comment on the results.

> We'd still run into problems if someone tries to use an explicit max
> location and read gl_ViewportIndex / gl_LayerID in the FS, but...that's at
> least much less common of a bug.

Yeah... it is not ideal but maybe it is unlikely that real applications would run into this issue anyway. If this is enough to make the tests happy I guess we can wait to see if we get bug reports to decide if we need to take more actions.

Thanks for the quick feedback Ken!
Comment 3 Kenneth Graunke 2017-09-20 16:50:13 UTC
(In reply to Iago Toral from comment #2)
> (In reply to Kenneth Graunke from comment #1)
> Yeah... it is not ideal but maybe it is unlikely that real applications
> would run into this issue anyway. If this is enough to make the tests happy
> I guess we can wait to see if we get bug reports to decide if we need to
> take more actions.
> 
> Thanks for the quick feedback Ken!

If we cared about that, we could always read those fields from the FS thread payload...the only reason we read it from the VUE is that the spec requires you to be able to pass an arbitrary 32-bit integer from one stage to the next...while the thread payload field has a smaller bit-width.  So, in-range values would work fine, and only out-of-range values would break in this case.  Nobody's ever going to hit that outside of a test, IMO.
Comment 4 Iago Toral 2017-09-22 08:14:26 UTC
(In reply to Iago Toral from comment #2)
> > I think we can use URB Entry Read Offset to work around this...currently we
> > hardcode it to 0 or 1...but we could probably skip over a few more.  In
> > particular, we have clip distance in the VUE slots, but the FS shouldn't
> > care, which gives us 2 more slots (1 offset)...
> 
> Yes, that's a good point... it may just be enough to make the tests pass
> although it is still not a proper fix. I'll give it a try and comment on the
> results.

I looked into it and this is what I found so far:

- gl_ClipDistance is defined as an input for fragment shaders, so we can't blindly skip it, only if we know the FS is not reading it and only if they are the fist builtin varying slots that come after a vue header that we are already skipping.

- There is a lowering pass in NIR that can remap cull distances to clip distance slots, so we can see clip distances slots in use in a FS that doesn't read gl_ClipDistance but reads gl_CullDistance and we can't skip them in that case (in practice this doesn't change anything for us, but I guess it is worth mentioning)

- Even with the above caveats, the fix is enough to make the tests not crash any more, so it is sufficient as a work around for these crashes.

I am running the patch through jenkins, if I don't see anything weird I'll send it for review.
Comment 5 Kenneth Graunke 2017-09-22 20:10:21 UTC
Oh, I missed that gl_ClipDistance is an FS input. :(  Still, should cover a lot of cases...

Thanks for working on this, Iago!
Comment 6 Iago Toral 2017-09-26 07:53:43 UTC
I have sent a patch for review here:
https://lists.freedesktop.org/archives/mesa-dev/2017-September/170587.html
Comment 7 Kenneth Graunke 2017-10-07 03:08:01 UTC
Iago fixed the crashes with:

commit 5e584a9db7f2900a8ed13575098940b79c84d98b
Author: Iago Toral Quiroga <itoral@igalia.com>
Date:   Wed Sep 20 09:22:51 2017 +0200

    i965: skip reading unused slots at the begining of the URB for the FS

Not all of the tests pass, but the crashes are gone at least.
Comment 8 Iago Toral 2017-10-09 07:47:33 UTC
(In reply to Kenneth Graunke from comment #7)
> Iago fixed the crashes with:
> 
> commit 5e584a9db7f2900a8ed13575098940b79c84d98b
> Author: Iago Toral Quiroga <itoral@igalia.com>
> Date:   Wed Sep 20 09:22:51 2017 +0200
> 
>     i965: skip reading unused slots at the begining of the URB for the FS
> 
> Not all of the tests pass, but the crashes are gone at least.

Yes, the original issue reported here have been addressed.

For the record, I have been looking into the failures and sent a few patches to CTS that fix a couple of tests for us and almost fixes two others (there are some issues with the cases that test doubles that I still need to investigate).


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.