Bug 69622 - eglTerminate then eglMakeCurrent crahes
Summary: eglTerminate then eglMakeCurrent crahes
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: Other All
: medium critical
Assignee: Chad Versace
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-20 19:18 UTC by Chad Versace
Modified: 2016-07-29 03:24 UTC (History)
4 users (show)

See Also:
i915 platform:
i915 features:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Chad Versace 2013-09-20 19:18:00 UTC
According to the EGL 1.4 spec, eglTerminate(dpy) will mark the current context for pending deletion. Later, then the context is unbound with eglMakeCurrent(dpy, NULL, NULL, NULL), the context gets flushed then deleted.

The above call to eglMakeCurrent crashes the Intel driver due
due to a use-after-free of intel_bufmgr. eglTerminate destroyed the intel_screen and its bufmgr, so the bufmgr is no longer available to execute the glFlush implied by eglMakeCurrent.

This is the underlying reason why `./$test -fbo` crashes on Intel for each GLES1 test.

Here's the relevant spec language from EGL 1.4 spec (2011.04.06), Section
3.2 Initialization:

 *
 *     EGLBoolean eglTerminate(EGLDisplay dpy);
 *
 *     Termination marks all EGL-specific resources, such as contexts and
 *     surfaces, associated with the specified display for deletion. Handles
 *     to all such resources are invalid as soon as eglTerminate returns, but
 *     the dpy handle itself remains valid. [...] Applications should not try
 *     to perform useful work with such resources following eglTerminate; only
 *     eglMakeCurrent or eglReleaseThread should be called, to complete
 *     deletion of these resources.
 *
 *     If contexts or surfaces created with respect to dpy are current (see
 *     section 3.7.3) to any thread, then they are not actually destroyed
 *     while they remain current. Such contexts and surfaces will be destroyed
 *     as soon as eglReleaseThread is called from the thread they are bound
 *     to, or eglMakeCurrent is called from that thread with the current
 *     rendering API (see section 3.7) set such that the current context is
 *     affected. [...]
Comment 1 Chad Versace 2014-10-24 13:41:59 UTC
Bug is reproducible with the following:

  piglit test: egl-terminate-then-unbind-context [http://cgit.freedesktop.org/piglit/tree/tests/egl/spec/egl-1.4/egl-terminate-then-unbind-context.c?id=28505ec885274a0451f8ca831e12f5839ce79305]
  mesa@mesa-10.3.1
  piglit@28505ec8
  hw: CHIPSET(0x0166, ivb_gt2, "Intel(R) Ivybridge Mobile")

  
Backtrace
=========
#0  0x00007ffff5d330d7 in ?? ()
#1  0x00007ffff7a646d2 in dri2_make_current (drv=0x603db0, disp=0x603010, dsurf=0x0, rsurf=0x0, ctx=0x0)
    at /home/chadv/proj/hh/default/src/mesa/src/egl/drivers/dri2/egl_dri2.c:981
#2  0x00007ffff7a56e91 in eglMakeCurrent (dpy=0x603010, draw=0x0, read=0x0, ctx=0x0) at /home/chadv/proj/hh/default/src/mesa/src/egl/main/eglapi.c:542
#3  0x0000000000400e06 in main (argc=1, argv=0x7fffffffcec8)
    at /home/chadv/proj/hh/default/src/piglit/tests/egl/spec/egl-1.4/egl-terminate-then-unbind-context.c:105
Comment 2 Rhys Kidd 2015-06-20 02:45:46 UTC
Confirming bug still reproducible with the following:

  piglit test: egl-terminate-then-unbind-context
  piglit@c9f2ee5e75ce5a325dffcd764f989e1bb343340e
  mesa@afeb92220690c8f27cdc56c30e109ca175d51d83
  hw: CHIPSET(0x0046, "Intel(R) Ironlake Mobile")


Backtrace
=========
(gdb) bt
#0  0x00007ffff50e0150 in ?? ()
#1  0x00007ffff7bc39be in dri2_make_current (drv=0x6036a0, disp=0x603010, dsurf=0x0, rsurf=0x0, ctx=0x0)
    at ../../../../../mesa/src/egl/drivers/dri2/egl_dri2.c:1043
#2  0x00007ffff7bbef3d in eglMakeCurrent (dpy=0x603010, draw=0x0, read=0x0, ctx=0x0) at ../../../../mesa/src/egl/main/eglapi.c:705
#3  0x0000000000400d6b in main (argc=1, argv=0x7fffffffdee8)
    at /home/usera/Coding/piglit/tests/egl/spec/egl-1.4/egl-terminate-then-unbind-context.c:105


This failing test causes piglit to hang, i.e. unable to run through a full clean test set from the command line using this hardware, piglit and Git Mesa.
Comment 3 Rhys Kidd 2016-07-15 11:18:56 UTC
With Nicolas Boichat's patches [0][1] I now see a working piglit:

$ mesa-debug bin/egl-terminate-then-unbind-context -auto
PIGLIT: {"result": "pass" }

[0] https://patchwork.freedesktop.org/patch/98873/
[1] https://patchwork.freedesktop.org/patch/98874/
Comment 4 Rhys Kidd 2016-07-29 03:24:39 UTC
Resolved with:

commit 9ee683f877b283020c6f24776236f1145cb7a4ea
Author: Nicolas Boichat <drinkcat@chromium.org>
Date:   Fri Jul 22 11:27:41 2016 +0800

    egl/dri2: Add reference count for dri2_egl_display
    
    android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls
    eglTerminate, followed by eglReleaseThread. A similar case is
    observed in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=69622,
    where the test calls eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL).
    
    With the current code, dri2_dpy structure is freed on eglTerminate
    call, so the display is not initialized when eglReleaseThread calls
    MakeCurrent with NULL parameters, to unbind the context, which
    causes a a segfault in drv->API.MakeCurrent (dri2_make_current),
    either in glFlush or in a latter call.
    
    eglTerminate specifies that "If contexts or surfaces associated
    with display is current to any thread, they are not released until
    they are no longer current as a result of eglMakeCurrent."
    
    However, to properly free the current context/surface (i.e., call
    glFlush, unbindContext, driDestroyContext), we still need the
    display vtbl (and possibly an active dri dpy connection). Therefore,
    we add some reference counter to dri2_egl_display, to make sure
    the structure is kept allocated as long as it is required.
    
    One drawback of this is that eglInitialize may not completely reinitialize
    the display (if eglTerminate was called with a current context), however,
    this seems to meet the EGL spec quite well, and does not permanently
    leak any context/display even for incorrectly written apps.
    
    Cc: "12.0" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
    Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
    Reviewed-by: Emil Velikov <emil.velikov@collabora.com>


bug/show.html.tmpl processed on Feb 28, 2017 at 05:31:27.
(provided by the Example extension).