Bug 79684 - subsurfaces: crash when destroying nested subsurfaces
Summary: subsurfaces: crash when destroying nested subsurfaces
Status: RESOLVED FIXED
Alias: None
Product: Wayland
Classification: Unclassified
Component: weston (show other bugs)
Version: 1.5.0
Hardware: Other All
: medium normal
Assignee: Wayland bug list
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-06-05 15:08 UTC by Arnaud Vrac
Modified: 2014-06-25 12:40 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
simple program that creates many nested subsurfaces (6.05 KB, text/plain)
2014-06-05 15:08 UTC, Arnaud Vrac
Details
debut output for subsurfaces creation/destruction (76.93 KB, text/plain)
2014-06-10 10:04 UTC, Arnaud Vrac
Details
A simpler program that can control the depth of nested surfaces (5.70 KB, text/plain)
2014-06-10 12:05 UTC, Boyan Ding
Details
proposed patch (1.04 KB, patch)
2014-06-11 08:48 UTC, George Kiagiadakis
Details | Splinter Review
proposed patch (1.49 KB, patch)
2014-06-13 12:50 UTC, Arnaud Vrac
Details | Splinter Review

Description Arnaud Vrac 2014-06-05 15:08:38 UTC
Created attachment 100467 [details]
simple program that creates many nested subsurfaces

The simple attached program makes weston crash when it exits (press ctrl-c). This seems to be happening when the top subsurface of nested subsurfaces is destroyed.

compile with gcc -lwayland-client -D_GNU_SOURCE subsurfaces.c -o subsurfaces-test
Comment 1 Pekka Paalanen 2014-06-06 11:48:24 UTC
Ooh, that is an excellent test case indeed. Thank you for the report, I'll try to get a look at it some day, unless someone else beats me to it.
Comment 2 George Kiagiadakis 2014-06-09 17:18:13 UTC
I had a look at this bug. The backtrace looks like:

#0  0x00007ffff7bd7b47 in wl_list_remove (elm=elm@entry=0xbbc1a8) at src/wayland-util.c:54
#1  0x000000000040cba0 in weston_view_destroy (view=0xbbc180) at src/compositor.c:1711
#2  0x000000000040cc6b in surface_free_unused_subsurface_views (surface=0xac6610) at src/compositor.c:2018
#3  0x000000000040cc37 in surface_free_unused_subsurface_views (surface=0xac5da0) at src/compositor.c:2015
#4  0x000000000040ce20 in weston_compositor_build_view_list (compositor=0x63a250) at src/compositor.c:2104
#5  0x000000000040cb97 in weston_view_destroy (view=0xbbe280) at src/compositor.c:1708
#6  0x000000000040cc6b in surface_free_unused_subsurface_views (surface=0xacd4f0) at src/compositor.c:2018
#7  0x000000000040cc37 in surface_free_unused_subsurface_views (surface=0xac6610) at src/compositor.c:2015
#8  0x000000000040cc37 in surface_free_unused_subsurface_views (surface=0xac5da0) at src/compositor.c:2015
#9  0x000000000040ce20 in weston_compositor_build_view_list (compositor=0x63a250) at src/compositor.c:2104
#10 0x000000000040cb97 in weston_view_destroy (view=0xbbeac0) at src/compositor.c:1708

and continues for ~500 frames. It looks to me as if weston_compositor_build_view_list() is not meant to be called recursively, as it operates on the same object (compositor) all the time and just corrupts the lists on which it works on.

I did a quick hack that works, but it's probably not the best idea:

diff --git a/src/compositor.c b/src/compositor.c                                                                             
index 2c33297..6691e49 100644
--- a/src/compositor.c
+++ b/src/compositor.c
@@ -1699,13 +1699,18 @@ weston_surface_discard_queue(struct weston_surface *surface)
 WL_EXPORT void
 weston_view_destroy(struct weston_view *view)
 {
+       static int recursing = 0;
+       recursing++;
+
        wl_signal_emit(&view->destroy_signal, view);
 
        assert(wl_list_empty(&view->geometry.child_list));
 
        if (weston_view_is_mapped(view)) {
                weston_view_unmap(view);
-               weston_compositor_build_view_list(view->surface->compositor);
+               if (recursing == 1)
+                       weston_compositor_build_view_list(view->surface->compositor);
        }
 
        wl_list_remove(&view->link);
@@ -1719,6 +1724,8 @@ weston_view_destroy(struct weston_view *view)
        wl_list_remove(&view->surface_link);
 
        free(view);
+       recursing--;
 }
 
 WL_EXPORT void
Comment 3 Boyan Ding 2014-06-10 02:56:59 UTC
I can verify the hack does eliminate the symptom.

But it seems that the reason why weston crashes isn't stack overflow caused by infinite recursion but some other reasons. If you change the width and height limit from 4 to 64 (on line 176) weston will still crash and the depth of frames is less than 30.
Comment 4 George Kiagiadakis 2014-06-10 09:27:18 UTC
(In reply to comment #3)
> But it seems that the reason why weston crashes isn't stack overflow caused
> by infinite recursion but some other reasons. If you change the width and
> height limit from 4 to 64 (on line 176) weston will still crash and the
> depth of frames is less than 30.

No, it's not stack overflow. It's a list corruption, because the recursive calls of the same method loop over the same lists and delete elements.
Comment 5 vivek 2014-06-10 09:37:51 UTC
I checked by changing the value from 4 to 64. Its not crashing.
Comment 6 Arnaud Vrac 2014-06-10 10:04:23 UTC
Created attachment 100794 [details]
debut output for subsurfaces creation/destruction

I have added some debug prints in weston and I get the attached trace, which might help debug this issue.
Comment 7 Boyan Ding 2014-06-10 12:05:33 UTC
Created attachment 100804 [details]
A simpler program that can control the depth of nested surfaces

Here is a program similar to the original one but produces considerably less subsurfaces, and can control the nesting depth.

with the same flags specified above

./subsurfaces [nesting depth]

weston will crash when depth>=2 on my machine.
Comment 8 George Kiagiadakis 2014-06-11 08:48:55 UTC
Created attachment 100869 [details] [review]
proposed patch

This patch should solve the issue more properly.
Comment 9 Arnaud Vrac 2014-06-13 12:50:33 UTC
Created attachment 100977 [details] [review]
proposed patch

I still get valgrind warnings and crashes with the proposed patch, one unmap call was missing in weston_subsubsurface_destroy(). This is an updated patch which fixes all valgrind warnings and crashes for me.
Comment 10 Pekka Paalanen 2014-06-25 12:40:53 UTC
I see the patch was already merged into Weston master.
Assuming it is fixed now.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.