Bug 95337 - XWayland crashes during startup if output data is received
Summary: XWayland crashes during startup if output data is received
Status: RESOLVED FIXED
Alias: None
Product: Wayland
Classification: Unclassified
Component: XWayland (show other bugs)
Version: unspecified
Hardware: All All
: medium major
Assignee: Wayland bug list
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-10 14:34 UTC by Mike Blumenkrantz
Modified: 2016-06-14 06:53 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Possible fix (1.24 KB, patch)
2016-05-11 07:01 UTC, Olivier Fourdan
Details | Splinter Review

Description Mike Blumenkrantz 2016-05-10 14:34:25 UTC
Backtrace with v1.18.3

#0  0x00000000004dadc4 in RREditConnectionInfo (pScreen=0xaee390) at rrscreen.c:48
#1  RRScreenSizeNotify (pScreen=0xaee390) at rrscreen.c:151
#2  0x0000000000427146 in update_screen_size (xwl_output=0xaf3030, width=1920, height=1080)
    at xwayland-output.c:189
#3  0x00007ff9e05fcd30 in ffi_call_unix64 () at ../src/x86/unix64.S:76
#4  0x00007ff9e05fc79b in ffi_call (cif=<optimized out>, fn=<optimized out>, rvalue=0x0, avalue=<optimized out>)
    at ../src/x86/ffi64.c:525
#5  0x00007ff9e2c41af2 in wl_closure_invoke (closure=0xdef360, flags=1, target=0xaf3090, opcode=2, data=0xaf3030)
    at src/connection.c:949
#6  0x00007ff9e2c3f05b in dispatch_event (display=0xaeeb50, queue=0xaeec18) at src/wayland-client.c:1288
#7  0x00007ff9e2c3f347 in dispatch_queue (display=0xaeeb50, queue=0xaeec18) at src/wayland-client.c:1434
#8  0x00007ff9e2c3f5fa in wl_display_dispatch_queue_pending (display=0xaeeb50, queue=0xaeec18)
    at src/wayland-client.c:1676
#9  0x00007ff9e2c3f5c3 in wl_display_dispatch_queue (display=0xaeeb50, queue=0xaeec18)
    at src/wayland-client.c:1652
#10 0x00007ff9e2c3eb0c in wl_display_roundtrip_queue (display=0xaeeb50, queue=0xaeec18)
    at src/wayland-client.c:1099
#11 0x00007ff9e2c3eb60 in wl_display_roundtrip (display=0xaeeb50) at src/wayland-client.c:1128
#12 0x000000000042671c in InitInput (argc=argc@entry=9, argv=argv@entry=0x7ffcbeb60728) at xwayland-input.c:900
#13 0x0000000000553e51 in dix_main (argc=9, argv=0x7ffcbeb60728, envp=<optimized out>) at main.c:268
#14 0x00007ff9e0ca5580 in __libc_start_main (main=0x4237c0 <main>, argc=9, argv=0x7ffcbeb60728, 
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcbeb60718)
    at libc-start.c:289
#15 0x00000000004237f9 in _start ()


ConnectionInfo is NULL, as InitInput is called before the connection data is initialized, meaning that anything trying to access it before this point will trigger a crash.
Comment 1 Olivier Fourdan 2016-05-11 07:01:35 UTC
Created attachment 123617 [details] [review]
Possible fix

(In reply to Mike Blumenkrantz from comment #0)
> [...]
> ConnectionInfo is NULL, as InitInput is called before the connection data is
> initialized, meaning that anything trying to access it before this point
> will trigger a crash.

Huh...

I am not sure, why we'd need to call the wl_display_roundtrip() in InitInput()?

This and initializing xwl_screen->expecting_event to 0 will be done again later again in xwl_screen_init() where it seems more appropriate, and events can safely be queued up meanwhile.

Does this patch help? I haven't detected any ill effect in my light initial testing.
Comment 2 Jonas Ådahl 2016-05-11 07:34:46 UTC
(In reply to Olivier Fourdan from comment #1)
> Created attachment 123617 [details] [review] [review]
> Possible fix
> 
> (In reply to Mike Blumenkrantz from comment #0)
> > [...]
> > ConnectionInfo is NULL, as InitInput is called before the connection data is
> > initialized, meaning that anything trying to access it before this point
> > will trigger a crash.
> 
> Huh...
> 
> I am not sure, why we'd need to call the wl_display_roundtrip() in
> InitInput()?
> 
> This and initializing xwl_screen->expecting_event to 0 will be done again
> later again in xwl_screen_init() where it seems more appropriate, and events
> can safely be queued up meanwhile.

I think that if we don't roundtrip here, we might end up in a situation where various places expects to be able to know things about wl_seat's etc, that'd now start to crash if we loose the race against the wl_seat global being discovered.

AFAICS InitInput() is done after InitOutput() which is where xwl_screen_init and the first roundtrips are done.

> 
> Does this patch help? I haven't detected any ill effect in my light initial
> testing.

I wonder if it'd be better to move the initialization roundtrip to after CreateConnectionBlock(), while adapting everything that is in response to the global object discovery to be able to wait until that stage. I don't know what would break, but at least I think we need to either make sure Xwayland can deal with working without any seat, or ensure we have discovered and created all the seat objects before finishing initialization.
Comment 3 Pekka Paalanen 2016-05-11 08:01:54 UTC
Mind, wl_seats can come and go dynamically, and it is not guaranteed to have at least one at all times. So, working without any seat would sound like a good idea to me.

The same goes for outputs, and will likely be testable on Weston once Armin's GSoC project is near finishing.
Comment 4 Olivier Fourdan 2016-05-12 07:54:08 UTC
(In reply to Jonas Ådahl from comment #2)
> [...]
> I think that if we don't roundtrip here, we might end up in a situation
> where various places expects to be able to know things about wl_seat's etc,
> that'd now start to crash if we loose the race against the wl_seat global
> being discovered.

I don't experience any crash with this patch applied, neither with mutter which spawns Xwayland at startup as it (still) relies on X, i.e. very early, nor with weston which spawns it when needed (ie with all outputs and seats set-up).
 
> AFAICS InitInput() is done after InitOutput() which is where xwl_screen_init
> and the first roundtrips are done.

You're right, of course, but the problem is not much xwl_screen_init() but the X ConnectionInfo (as reported in comment #0) that gets initialized after InitInput is called in dix.

It goes like this:

  xwl_screen_init()
  InitInput()
  ConnectionInfo initialized

So if the roundtrip in InitInput() trigger something that needs to use the X connection (as with RR in the backtrace), it will crash.

Meanwhile, ajax picked up the patch from the ML and pushed it, so it's landed in master (I did send the patch to the devel ML in parallel as this is where patches get discussed for xorg rather than bugzilla).

But I am not convinced this patch would cause more bad that good, initializing xwl_screen->expecting_event to 0 twice at two different places doesn't sound right and doing the rountrip before X connection is likely to cause breakage as well, as seen in comment #0.

Let's keep that bug opened and see if we get some more feedback (either here or on the xorg-devel ML), meanwhile please test master with different compositors and use cases and let's see if this a real problem.

(In reply to Pekka Paalanen from comment #3)
> Mind, wl_seats can come and go dynamically, and it is not guaranteed to have
> at least one at all times. So, working without any seat would sound like a
> good idea to me.
> 
> The same goes for outputs, and will likely be testable on Weston once
> Armin's GSoC project is near finishing.

Yes, probably, but I reckon outputs are a different issue than this particular crash.
Comment 5 Olivier Fourdan 2016-05-12 07:56:40 UTC
(In reply to Olivier Fourdan from comment #4)
> You're right, of course, but the problem is not much xwl_screen_init() but
> the X ConnectionInfo (as reported in comment #0) that gets initialized after
> InitInput is called in dix.
> 
> It goes like this:
> 
>   xwl_screen_init()
>   InitInput()
>   ConnectionInfo initialized
> 
> So if the roundtrip in InitInput() trigger something that needs to use the X
> connection (as with RR in the backtrace), it will crash.

Humm, now that I re-read my own sentence, I realize there is still an obvious possibility of breakeage because the rountrips are done in xwl_screen_init(), so *before* ConnectionInfo is initialized
Comment 6 Olivier Fourdan 2016-06-14 06:53:49 UTC
Patches have landed in master, closing.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.