Bug 60942

Summary: Threaded wayland EGL clients stall
Product: Wayland
Reporter: Scott Moreau <oreaus>
Component: wayland
Assignee: Wayland bug list <wayland-bugs>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
  Patch to make simple-egl use a separate thread for rendering
  Backtrace from gdb with simple-egl threading patch applied

Description Scott Moreau 2013-02-16 05:09:21 UTC
Created attachment 74924
Patch to make simple-egl use a separate thread for rendering

I am trying to port a threaded client to wayland but I have run across what I believe to be a bug. The scenario is:

1) Client starts a rendering thread that handles EGL init
2) Client starts a main loop back in the main thread that calls wl_display_dispatch()

Expected: EGL calls return and succeed

Actual: EGL calls stall and do not return


After toying with this a bit, I tried making weston's simple-egl threaded and saw the same problem. So I am posting a patch that makes simple-egl run EGL setup in a separate thread, along with the backtrace I get from gdb. I'm not sure whether this is a bug in wayland or in my code.

I am running the latest wayland/mesa/weston stack from git on x86_64, with an Intel Sandy Bridge GPU and kernel 3.5.0-24-generic.

Thanks.
Comment 1 Scott Moreau 2013-02-16 05:11:48 UTC
Created attachment 74925
Backtrace from gdb with simple-egl threading patch applied
Comment 2 Rob Bradford 2013-07-08 11:48:41 UTC
I think this should be resolved with the changes that have been made following this commit (and the appropriate changes in Mesa):


commit 3c7e8bfbb4745315b7bcbf69fa746c3d6718c305
Author: Kristian Høgsberg <krh@bitplanet.net>
Date:   Sun Mar 17 14:21:48 2013 -0400

    client: Add wl_display_prepare_read() API to relax thread model assumptions
    
    The current thread model assumes that the application or toolkit will have
    one thread that either polls the display fd and dispatches events or just
    dispatches in a loop.  Only this main thread will read from the fd while
    all other threads will block on a pthread condition and expect the main
    thread to deliver events to them.
    
    This turns out to be too restrictive.  We can't assume that there
    always will be a thread like that.  Qt QML threaded rendering will
    block the main thread on a condition that's signaled by a rendering
    thread after it finishes rendering.  This leads to a deadlock when the
    rendering thread blocks in eglSwapBuffers(), and the main thread is
    waiting on the condition.  Another problematic use case is with games
    that have a rendering thread for a splash screen while the main thread
    is busy loading game data or compiling shaders.  The main thread isn't
    responsive and ends up blocking eglSwapBuffers() in the rendering thread.
    
    We also can't assume that there will be only one thread polling on the
    file descriptor.  A valid use case is a thread receiving data from a
    custom wayland interface as well as a device fd or network socket.
    The thread may want to wait on either events from the wayland
    interface or data from the fd, in which case it needs to poll on both
    the wayland display fd and the device/network fd.
    
    The solution seems pretty straightforward: just let all threads read
    from the fd.  However, the main-thread restriction was introduced to
    avoid a race.  Simplified, main loops will do something like this:
    
        wl_display_dispatch_pending(display);
    
        /* Race here if other thread reads from fd and places events
         * in main event queue.  We go to sleep in poll while sitting on
         * events that may stall the application if not dispatched. */
    
        poll(fds, nfds, -1);
    
        /* Race here if other thread reads and doesn't queue any
         * events for main queue. wl_display_dispatch() below will block
         * trying to read from the fd, while other fds in the mainloop
         * are ignored. */
    
        wl_display_dispatch(display);
    
    The restriction that only the main thread can read from the fd avoids
    these races, but has the problems described above.
    
    This patch introduces new API to solve both problems.  We add
    
        int wl_display_prepare_read(struct wl_display *display);
    
    and
    
        int wl_display_read_events(struct wl_display *display);
    
    wl_display_prepare_read() registers the calling thread as a potential
    reader of events.  Once data is available on the fd, all reader
    threads must call wl_display_read_events(), at which point one of the
    threads will read from the fd and distribute the events to event
    queues.  When that is done, all threads return from
    wl_display_read_events().
    
    From the point of view of a single thread, this ensures that between
    calling wl_display_prepare_read() and wl_display_read_events(), no
    other thread will read from the fd and queue events in its event
    queue.  This avoids the race conditions described above, and we avoid
    relying on any one thread to be available to read events.
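
To make the reader scheme described in the commit message concrete, here is a minimal toy model of it using a pipe and pthreads. It is only a sketch, not libwayland's code: every toy_* name is hypothetical, the pipe stands in for the display fd, and a byte counter stands in for the event queues. The choice that the last registered thread to arrive performs the read is an assumption of this sketch; the commit message only says that one of the threads reads.

```c
#include <poll.h>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

/* Toy stand-in for a display connection; NOT libwayland's struct wl_display. */
struct toy_display {
    pthread_mutex_t lock;
    pthread_cond_t read_done;
    int fd;          /* read end of a pipe, standing in for the display fd */
    int readers;     /* threads registered via toy_prepare_read() */
    unsigned serial; /* bumped once per completed read */
    long queued;     /* bytes read so far, standing in for queued events */
};

/* Register the calling thread as a potential reader. */
static void toy_prepare_read(struct toy_display *d)
{
    pthread_mutex_lock(&d->lock);
    d->readers++;
    pthread_mutex_unlock(&d->lock);
}

/* Every registered thread calls this once the fd is readable.  The last
 * one to arrive reads from the fd; the others sleep until it is done. */
static int toy_read_events(struct toy_display *d)
{
    int ret = 0;

    pthread_mutex_lock(&d->lock);
    if (--d->readers == 0) {
        char buf[256];
        ssize_t n = read(d->fd, buf, sizeof buf);
        if (n < 0)
            ret = -1;
        else
            d->queued += n;
        d->serial++;
        pthread_cond_broadcast(&d->read_done);
    } else {
        unsigned seen = d->serial;
        while (d->serial == seen)
            pthread_cond_wait(&d->read_done, &d->lock);
    }
    pthread_mutex_unlock(&d->lock);
    return ret;
}

/* Each worker registers, waits for data with poll(), then joins the read. */
static void *toy_worker(void *arg)
{
    struct toy_display *d = arg;
    struct pollfd pfd = { .fd = d->fd, .events = POLLIN };

    toy_prepare_read(d);
    poll(&pfd, 1, -1);
    toy_read_events(d);
    return NULL;
}

/* Run nthreads workers against one pipe and return the bytes queued.
 * Error handling for thread creation is omitted for brevity. */
static long toy_demo(int nthreads)
{
    struct toy_display d = { .readers = 0, .serial = 0, .queued = 0 };
    pthread_t tid[16];
    int p[2];

    if (nthreads < 1 || nthreads > 16 || pipe(p) != 0)
        return -1;
    pthread_mutex_init(&d.lock, NULL);
    pthread_cond_init(&d.read_done, NULL);
    d.fd = p[0];

    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, toy_worker, &d);

    /* Only send data once every worker has registered as a reader. */
    for (;;) {
        pthread_mutex_lock(&d.lock);
        int ready = (d.readers == nthreads);
        pthread_mutex_unlock(&d.lock);
        if (ready)
            break;
        sched_yield();
    }
    ssize_t written = write(p[1], "events", 6);

    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    close(p[0]);
    close(p[1]);
    return written == 6 ? d.queued : -1;
}
```

Calling toy_demo(3) starts three workers that all poll the same fd. Because the data is consumed only by the last registered reader to arrive, every worker sees the fd as readable, exactly one read() happens, and no worker goes to sleep in poll() after the data has already been drained by another thread.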
