Bug 106126 - eglMakeCurrent does not always ensure dri_drawable->update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first
Summary: eglMakeCurrent does not always ensure dri_drawable->update_drawable_info has ...
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/llvmpipe (show other bugs)
Version: 18.0
Hardware: x86-64 (AMD64) Linux (All)
: medium major
Assignee: Johan Helsing
QA Contact: mesa-dev
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-18 14:33 UTC by Johan Helsing
Modified: 2019-09-18 18:32 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Apitrace of the failing test (54.58 KB, application/octet-stream)
2018-04-18 14:33 UTC, Johan Helsing
Details

Description Johan Helsing 2018-04-18 14:33:46 UTC
Created attachment 138907 [details]
Apitrace of the failing test

This is pretty timing sensitive, and I haven't been able to reproduce on
anything except a Qt unit test running against a headless Weston
compositor.

Start Weston with: weston --backend=headless-backend.so -i 0

Qt unit test: https://codereview.qt-project.org/#/c/225522/2//ALL

The symptom is that glReadPixels will only return 1 pixel instead of 16
which is the surface size.

Why it fails:

In the test, a temporary EGLSurface will be created as part of the window
setup. After that surface has been destroyed, the real window surface is
created. When that happens the malloc in driCreateNewDrawable may return
the same address the first surface's drawable had. Consequently, when
dri_make_current later tries to determine if it should update the
texture_stamp it compares the surface's drawable pointer against the
drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset then dri_st_framebuffer_validate thinks it
has already called update_drawable_info for that drawable, which is why, in
the test above, the size of the surface is not updated to match the Wayland
window and glReadPixels later is only going to return only one pixel.

The solution is to clear the dangling pointer to the destroyed drawable,
dri_context::dPriv. I've confirmed that this fixes the flakiness of my test
and will post a patch on the mailing list shortly.

I've tagged this bug with Drivers/Gallium/llvmpipe, but it really affects all
gallium dri drivers (and I've reproduced the issue with softpipe as well).

Related bug report in Qt: https://bugreports.qt.io/browse/QTBUG-67678
Comment 1 Timothy Arceri 2018-04-23 09:40:35 UTC
Fixed by:

author	Johan Klokkhammer Helsing <johan.helsing@qt.io>
commit	dab02dea3411d325a5aee6cda5b581e61396ecc6

st/dri: Fix dangling pointer to a destroyed dri_drawable

If an EGLSurface is created, made current and destroyed, and then a second
EGLSurface is created. Then the second malloc in driCreateNewDrawable may
return the same pointer address the first surface's drawable had.
Consequently, when dri_make_current later tries to determine if it should
update the texture_stamp it compares the surface's drawable pointer against
the drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset, then dri_st_framebuffer_validate thinks
it has already called update_drawable_info for that drawable, leaving it
unvalidated and this is when bad things starts to happen. In my case it
manifested itself by the width and height of the surface being unset.

This is fixed this by setting the pointer to NULL before freeing the
surface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126
Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Comment 2 Johan Helsing 2018-04-24 08:19:52 UTC
The patch was reverted.
Comment 3 GitLab Migration User 2019-09-18 18:32:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/246.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.