Bug 100040 - Skype crashes XWayland which kills GNOME session
Summary: Skype crashes XWayland which kills GNOME session
Status: RESOLVED FIXED
Alias: None
Product: Wayland
Classification: Unclassified
Component: XWayland (show other bugs)
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Wayland bug list
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
: 99149 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-03-03 01:14 UTC by Mark B
Modified: 2017-08-05 19:53 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments

Description Mark B 2017-03-03 01:14:09 UTC
Skype updated from Alpha to Beta version 5.0.0.5-1 yesterday and now I get semi-frequent XWayland crashes when opening or re-opening the skype window. This kills my entire session. I have run skype alpha for ages without any issue like this.

See crash log from journal at https://community.skype.com/t5/Linux/Skype-5-0-for-Linux-Beta-is-here/m-p/4610798/highlight/true#M17800

I know this is a Skype issue but fundamentally a bad client should not crash my session so there must be a bug in XWayland. I am on up to date Arch Linux (Wayland version 1.13.0-1).
Comment 1 Michel Dänzer 2017-03-03 01:55:52 UTC
Looks like https://patchwork.freedesktop.org/patch/141671/ might fix this.

P.S. In the future, please attach all relevant information here directly instead of referencing an external site.
Comment 2 Mark B 2017-03-03 06:04:39 UTC
I downloaded the Arch xorg-server sources (1.19.1-5), applied that patch, and am now running with the newly built package. I had 4 crashes in previous 36 hours so this will prove that fix in a couple of days. I will let you know here.
Comment 3 Mark B 2017-03-04 03:11:29 UTC
OK, I am 100% sure that the patch fixes this crash. I tested again today and found that, without the patch, I can cause the crash just by opening and closing the skype window repeatedly. It happens about one in six times. It never happens with the patch enabled. In fact, I also later added an ErrorF() message for the else ClientGone case and I do occasionally see that message in my journal when repeating this test, just to prove that it really is catching the fault condition.

I think this is a priority bug. Skype call that a "Beta" version but it is the current official skype client so this crash affects everybody who uses GNOME on Wayland with skype.
Comment 4 Olivier Fourdan 2017-03-06 08:10:56 UTC
(In reply to Mark B from comment #3)
> OK, I am 100% sure that the patch fixes this crash. I tested again today and
> found that, without the patch, I can cause the crash just by opening and
> closing the skype window repeatedly.

Brilliant, many thanks for testing the patch!

I'll add your "tested-by" if you don't mind.
Comment 5 Mark B 2017-03-06 12:15:32 UTC
Yes, certainly add me as the "tested by".
Comment 6 Olivier Fourdan 2017-03-08 09:57:23 UTC
The patch has landed in git master, but after discussing with Michel on irc, we came up with a more complete (and possibly robust) solution by monitoring client states and cancelling the sync callback when the client goes away, ie. being proactive :)

Long story short, would you mind testing this patch as well on your setup:

https://patchwork.freedesktop.org/patch/142828/

This applies on top of the previous one...
Comment 7 Mark B 2017-03-09 00:40:34 UTC
I applied the new patch and confirmed it fixes this issue. I actually switched back and forth a few times between my master and patch branches to confirm the problem was still there and found it actually occurs about 1 in 15 times with me opening and closing (alt+f4) the skype window repeatedly. I did it a hundred or so times with the new patch branch and the fault never occurs.

When I look at the new patch I see that the clientGone test has been removed from sync_callback() compared to the original patch. That's good to prove the new changes still address the issue but personally I would be more at ease if that clientGone check was still included for the final release patch. It's a one-liner check, costs nothing, and assures ultimate protection.
Comment 8 Michel Dänzer 2017-03-09 00:48:45 UTC
(In reply to Mark B from comment #7)
> When I look at the new patch I see that the clientGone test has been removed
> from sync_callback() compared to the original patch. That's good to prove
> the new changes still address the issue but personally I would be more at
> ease if that clientGone check was still included for the final release
> patch. It's a one-liner check, costs nothing, and assures ultimate
> protection.

Not really. The check could only make any difference if xwl_auth_state_client_callback fails to cancel the sync callback before the client is destroyed. In which case, the memory pointed to by state->client may already have been freed or reclaimed for a different client, so the client->clientGone value cannot be relied on anymore.
Comment 9 Mark B 2017-03-09 00:58:28 UTC
Ok, I was just worried about any small race/time window where the callback could still get called but if the client data structure is invalid by then - well I am glad you made a new patch!
Comment 10 Olivier Fourdan 2017-03-09 07:35:20 UTC
(In reply to Mark B from comment #7)
> I applied the new patch and confirmed it fixes this issue. I actually
> switched back and forth a few times between my master and patch branches to
> confirm the problem was still there and found it actually occurs about 1 in
> 15 times with me opening and closing (alt+f4) the skype window repeatedly. I
> did it a hundred or so times with the new patch branch and the fault never
> occurs.

Right, so I guess you're OK with me adding your "tested-by" on this now patch as well?
Comment 11 Mark B 2017-03-09 07:39:24 UTC
yes, of course.
Comment 12 Olivier Fourdan 2017-03-09 08:45:52 UTC
Patch has landed in git master.
Comment 13 Olivier Fourdan 2017-03-10 13:29:12 UTC
*** Bug 99149 has been marked as a duplicate of this bug. ***
Comment 14 Leho Kraav (:macmaN :lkraav) 2017-03-13 11:40:01 UTC
Thanks for working on this. I see the patch hasn't been pulled to `server-1.19-branch` yet, is it going to be?
Comment 15 Olivier Fourdan 2017-03-13 11:42:04 UTC
(In reply to Leho Kraav (:macmaN :lkraav) from comment #14)
> Thanks for working on this. I see the patch hasn't been pulled to
> `server-1.19-branch` yet, is it going to be?

We'll see, I sent a pull request earlier this morning:

https://lists.x.org/archives/xorg-devel/2017-March/053050.html
Comment 16 Leho Kraav (:macmaN :lkraav) 2017-03-13 12:16:01 UTC
(In reply to Olivier Fourdan from comment #15)
> (In reply to Leho Kraav (:macmaN :lkraav) from comment #14)
> > Thanks for working on this. I see the patch hasn't been pulled to
> > `server-1.19-branch` yet, is it going to be?
> 
> We'll see, I sent a pull request earlier this morning:

Thanks, looks good. I've just applied the following patch stack on top of 1.19.2

0001-xwayland-make-sure-client-is-not-gone-in-sync-callba.patch
0002-xwayland-clear-cursor-frame-callback.patch
0003-xwayland-Monitor-client-states-to-destroy-callbacks.patch

Let's see how it holds up in workload.

0001-xwayland-Make-sure-we-have-a-focus-window.patch also looked interesting, but this failed to apply cleanly, so I skipped it for now.
Comment 17 Olivier Fourdan 2017-03-13 12:24:17 UTC
(In reply to Leho Kraav (:macmaN :lkraav) from comment #16)
> [...]
> 
> Let's see how it holds up in workload.

afaik this is a race more than a matter or workload.

> 0001-xwayland-Make-sure-we-have-a-focus-window.patch also looked
> interesting, but this failed to apply cleanly, so I skipped it for now.

Maybe because this is already in the xserver-1.19-branch?

https://cgit.freedesktop.org/xorg/xserver/commit/?h=server-1.19-branch&id=c84f5c3


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.