Bug 61773 - abort is an incredibly not-smart way to handle IR validation
abort is an incredibly not-smart way to handle IR validation
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: glsl-compiler
9.1
All All
: medium critical
Assigned To: Ian Romanick
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-03-04 02:54 UTC by gmad
Modified: 2013-05-02 18:46 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Fix for spurious aborts during GLSL compile. (2.17 KB, text/plain)
2013-03-04 02:54 UTC, gmad
Details
backtrace and shader code causing ir_validation failure (4.52 KB, text/plain)
2013-03-05 05:46 UTC, gmad
Details

Note You need to log in before you can comment on or make changes to this bug.
Description gmad 2013-03-04 02:54:18 UTC
Created attachment 75878 [details]
Fix for spurious aborts during GLSL compile.

Asking GLSL to compile a shader should NOT involve the abort of the calling application. PERIOD.

It is completely unreasonable to expect that only perfect code will be submitted to glsl_compile() and it is perfectly possible to handle IR validation failure more gracefully.

This becomes even more important when the validation is broken and failing on code that does not contain an IR error of any kind.

I've attached a patch which makes the use of abort() occur only in debug builds and fixed a crash site where a call through a NULL pointer was being made in parameter_lists_match.

I discovered these errors while running SecondLife on Gallium so the crash can be replicated with that driver using the latest version of SL available here:

http://download.cloud.secondlife.com/Viewer-3/SecondLife-i686-3.4.5.270263.tar.bz2
Comment 1 Ian Romanick 2013-03-04 19:39:26 UTC
(In reply to comment #0)
> Created attachment 75878 [details]
> Fix for spurious aborts during GLSL compile.
> 
> Asking GLSL to compile a shader should NOT involve the abort of the calling
> application. PERIOD.
> 
> It is completely unreasonable to expect that only perfect code will be
> submitted to glsl_compile() and it is perfectly possible to handle IR
> validation failure more gracefully.

That's a little like saying assert shouldn't call abort.  That validation pass is a big assertion check.  The check isn't that the shader code is valid, the check is that the internal representation is valid after various optimization passes.  Just removing the abort will almost always lead to a different kind of crash somewhere else... there's invalid IR, and the backend will probably choke on it.

Cases of invalid shader code should be rejected long before ir_validate could ever be called.

> This becomes even more important when the validation is broken and failing
> on code that does not contain an IR error of any kind.

Could you provide an apitrace of a specific failure or provide the code of the shader that triggers the failure?

> I've attached a patch which makes the use of abort() occur only in debug
> builds and fixed a crash site where a call through a NULL pointer was being
> made in parameter_lists_match.

I'm especially interested in this crash.  Could you provide a backtrace?

> I discovered these errors while running SecondLife on Gallium so the crash
> can be replicated with that driver using the latest version of SL available
> here:
> 
> http://download.cloud.secondlife.com/Viewer-3/SecondLife-i686-3.4.5.270263.
> tar.bz2
Comment 2 gmad 2013-03-05 05:46:07 UTC
Created attachment 75938 [details]
backtrace and shader code causing ir_validation failure

Here's a backtrace of the GLSL compile fail along with what I believe is the code being compiled at the time of the failure.
Comment 3 Kenneth Graunke 2013-05-01 01:56:41 UTC
I just committed a patch to Mesa master which fixed very similar looking crashes:

commit 6c5cf8baa10f0bfdf1e61d944f24fdd7947b2a82
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Tue Apr 30 00:58:09 2013 -0700

    glsl: Ignore redundant prototypes after a function's been defined.

Could you build the latest Mesa master and give it a try?  Thanks!
Comment 4 Kenneth Graunke 2013-05-02 18:35:36 UTC
Reported as fixed in #39251.  Closing.
Comment 5 gmad 2013-05-02 18:46:57 UTC
I'll give it a try when I get a chance...sounds plausible since the var it was crashing on validating was likely from a "forward decl" as described in the fix.