Bug 105065

Summary: Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)
Product: Mesa Reporter: Mark Janes <mark.a.janes>
Component: Drivers/DRI/i965Assignee: Scott D Phillips <scott.d.phillips>
Status: RESOLVED NOTOURBUG QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium CC: devurandom
Version: 17.3   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 104757    
Attachments: Patch to disable ARB_get_program_binary for compat profiles

Description Mark Janes 2018-02-12 23:38:39 UTC
In testing Mesa 18.0rc3, I noticed that Qt programs failed to render at all.  Lionel Landwerlin isolated this to Qt's binary shader cache.

Jason Ekstrand found an easy reproduction:

 1) run Qt program with Mesa master
 2) change a source file, without changing the git commit, and recompile Mesa to generate a new Build ID.
 3) run Qt program again

Laszlo Agocs pointed out that the Qt cache does check the link status appropriately when using cached binaries.

We need to step into Qt's mechanism to understand why/how the cached binaries are being used.
Comment 1 Scott D Phillips 2018-02-13 02:12:51 UTC
I've traced through the failure. Here is what I see happen
(line numbers are from the qt 5.10 branch):

1) qopenglshaderprogram.cpp:3825: if (binCache.load(cacheKey, q->programId())) {

does glProgramBinary(). Mesa doesn't like the binary and sets the link
status to failure.

2) qopenglshaderprogram.cpp:3825: bool ok = q->link();

To check link status

3) qopenglshaderprogram.cpp:1297: d->glfuncs->glGetProgramiv(program, GL_LINK_STATUS, &value);

Sees that glProgramBinary set link status to failed.

4) qopenglshaderprogram.cpp:1303: d->glfuncs->glLinkProgram(program);

Links the program, however no shaders have been attached. Because a
compat context is being used, this is valid and linking succeeds.

5) qopenglshaderprogram.cpp:1324: return d->linked;

returns true;

6) qopenglshaderprogram.cpp:3831: needsCompile = false;

Won't try to compile, have a program with no shaders at this point.


It seems like either checking link status in
QOpenGLProgramBinaryCache::setProgramBinary or not linking a program
with no shaders in QOpenGLShaderProgram::link would fix this.
Comment 2 Scott D Phillips 2018-02-14 20:37:10 UTC
I filed QTBUG-66420 for this issue.
https://bugreports.qt.io/browse/QTBUG-66420
Comment 3 Kenneth Graunke 2018-02-15 00:53:53 UTC
Although it sounds like this is not our bug, I think we ought to revert it from 18.0 to give the Qt developers time to patch it and have the fix reach distros.  This has the potential to impact a ton of KDE users, and completely breaks KMail rendering, which is painful for me personally.  Breaking people's basic desktop functionality is something we usually try to avoid, if at all possible...
Comment 4 Jordan Justen 2018-02-15 01:13:16 UTC
(In reply to Kenneth Graunke from comment #3)
> Although it sounds like this is not our bug,

I think we can upgrade from 'sounds like this is not' to
'it is not'. :)

> I think we ought to revert it
> from 18.0 to give the Qt developers time to patch it and have the fix reach
> distros.  This has the potential to impact a ton of KDE users, and
> completely breaks KMail rendering, which is painful for me personally. 
> Breaking people's basic desktop functionality is something we usually try to
> avoid, if at all possible...

Reverting it will cause a few games to break. Those games were also
not following the spec. (Like QT.)

The impact from QT users will probably be more significant.

But, end users are less likely to be affected because if the version
string changes, then I think they won't try to use the binary. But,
if the distro releases the same version of Mesa more than once, then
there could be issues.

Scott also found that the one app he tested was using compatibility
profile. If basically all QT apps also use compatibility profile,
then not exposing it for compatibility profiles might be another
way to work around the QT bug.
Comment 5 Mark Janes 2018-02-15 05:28:49 UTC
By "revert it" are we considering disabling glGetProgramBinary?
Comment 6 Jordan Justen 2018-02-15 08:15:50 UTC
Created attachment 137370 [details] [review]
Patch to disable ARB_get_program_binary for compat profiles
Comment 7 Jordan Justen 2018-02-15 08:17:13 UTC
(In reply to Kenneth Graunke from comment #3)
> This has the potential to impact a ton of KDE users, and
> completely breaks KMail rendering, which is painful for me personally. 

Does this patch fix KMail for you?
Comment 8 Scott D Phillips 2018-02-15 18:09:27 UTC
(In reply to Jordan Justen from comment #4)
> Scott also found that the one app he tested was using compatibility
> profile. If basically all QT apps also use compatibility profile,
> then not exposing it for compatibility profiles might be another
> way to work around the QT bug.

Specifically, this issue in QT will only happen with compat contexts. With a non-compat context, Step 4 in my list above will result in LINK_STATUS=FALSE and QT will go down the recompiling path.
Comment 9 Scott D Phillips 2018-02-21 01:55:14 UTC
(In reply to Jordan Justen from comment #6)
> Created attachment 137370 [details] [review] [review]
> Patch to disable ARB_get_program_binary for compat profiles

Tested-by: Scott D Phillips <scott.d.phillips@intel.com>

(~/.cache/qtshadercache never gets made as expected)
Comment 10 Jordan Justen 2018-02-23 23:16:24 UTC
For the 18.0 release, we added a work-around to the QT bug for i965:

commit 719f2c934030f74ce0a4892233f494f168852698
i965: Support 0 ARB_get_program_binary formats for compat profiles

Thanks Scott for debugging/testing/reviewing this!

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.