Bug 99333

Summary: KDE's Plasma 5 freezes since server 1.19, reproduced widely.
Product: xorg
Reporter: Samuel Verschelde <stormi>
Component: Server/General
Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: major
Priority: medium
CC: ajax, bastian.beischer, bodqhrohro, dbaryshkov, kairo, keithp, lordheavym, renda.krell
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
- Queue attended client to saved_ready_clients if server grab blocks execution (flags: none)

Description Samuel Verschelde 2017-01-09 18:13:23 UTC
Since the update to xorg-server 1.19 in both Mageia 6 and Fedora 25, Plasma freezes for many different users with a variety of drivers, as can be seen in the downstream bug report comments. The freeze happens either right at the splash screen or during normal use of the desktop.

When the freeze occurs, switching to a virtual terminal with Ctrl+Alt+Fx and then back to X unfreezes the desktop (until the next freeze).

Downstream bug reports:
- KDE https://bugs.kde.org/show_bug.cgi?id=373427
- Mageia https://bugs.mageia.org/show_bug.cgi?id=19869
- Fedora https://bugzilla.redhat.com/show_bug.cgi?id=1399396

It's not easy to know where the bug lies, or whether it's xorg's fault at all, but it was definitely a change in xorg that triggered it, so help in finding out what's going on would be greatly appreciated.

This particular issue is a release blocker for Mageia 6, which is due soon, and might be one for Fedora too. It has also been reproduced on OpenSUSE.

I can reproduce it myself and would be glad to give any debug information or test patches.
Comment 1 Arne Spiegelhauer 2017-01-09 20:01:28 UTC
I have done some debugging of this issue and found the following scenario as the cause of the freeze:

An X_DRI2GetBuffersWithFormat request gets throttled, which puts the affected client on the sleep queue:
ProcDRI2GetBuffersWithFormat->DRI2ThrottleClient->dri2Sleep->ClientSleep

Later an attempt to wake the client fails:
dri2WakeClient->ClientWakeup->AttendClient->listen_to_client
because listen_to_client returns FALSE (GrabInProgress != 0) resulting in mark_client_ready never being called.

Consequently X_DRI2GetBuffersWithFormat never gets retried and the client (Plasma panel in this case) hangs forever waiting for the reply.

It would appear that there is a problem with the handling of GrabInProgress.
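
For readers following along, the scenario above can be pictured with a small toy model. This is only a sketch and not the xorg-server source: the function names are borrowed from the call chains above, but struct toy_client, end_grab() and wakeup_is_lost() are hypothetical, and every body is a simplified assumption. In particular, the model assumes that ending the grab does nothing more than clear the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only, NOT the xorg-server implementation. Names mirror the
 * call chains in the analysis; the bodies are simplified assumptions. */

struct toy_client {          /* hypothetical stand-in for a client record */
    bool sleeping;           /* put to sleep by the throttle path */
    bool listening;          /* server is watching this client's socket */
    bool ready;              /* queued to have its pending request retried */
};

/* Non-zero while some other client holds a server grab. */
static int GrabInProgress;

/* Refuses to listen to the client while a grab is in progress. */
static bool listen_to_client(struct toy_client *c)
{
    if (GrabInProgress)
        return false;
    c->listening = true;
    return true;
}

static void mark_client_ready(struct toy_client *c)
{
    c->ready = true;
}

/* The client is marked ready only when listen_to_client() succeeds. */
static void AttendClient(struct toy_client *c)
{
    if (listen_to_client(c))
        mark_client_ready(c);
    /* Otherwise nothing records that this client has work pending. */
}

static void ClientSleep(struct toy_client *c)
{
    c->sleeping = true;
    c->listening = false;
    c->ready = false;
}

static void ClientWakeup(struct toy_client *c)
{
    assert(c->sleeping);     /* only a sleeping client can be woken */
    c->sleeping = false;
    AttendClient(c);
}

/* Assumption of this model: ending the grab only clears the flag. */
static void end_grab(void)
{
    GrabInProgress = 0;
}

/* Replays the sequence: throttle, grab by another client, failed wake. */
bool wakeup_is_lost(void)
{
    struct toy_client panel = { .sleeping = false, .listening = true, .ready = false };

    ClientSleep(&panel);     /* request throttled, client goes to sleep */
    GrabInProgress = 1;      /* some other client grabs the server */
    ClientWakeup(&panel);    /* wake attempt happens during the grab */
    end_grab();

    return !panel.ready;     /* true when the retry was never scheduled */
}
```

In this model wakeup_is_lost() returns true: the wake attempt during the grab leaves the client neither ready nor remembered anywhere, so the throttled request is never retried.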
Comment 2 Michel Dänzer 2017-01-10 01:37:34 UTC
Promising analysis, Arne, thanks.

FWIW, the problem was bisected to commit f993091e7db8 ("os: Switch server to poll(2) [v3]") in the downstream report https://bugs.debian.org/846779 .
Comment 3 Keith Packard 2017-01-10 02:09:00 UTC
listen_to_client returns FALSE in this case because some other client has grabbed the X server. We should stick this client in the saved_ready_clients list in this case, but we don't have any function that does that available.

Patch coming in a moment.
Comment 4 Keith Packard 2017-01-10 02:16:08 UTC
Created attachment 128844
Queue attended client to saved_ready_clients if server grab blocks execution
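
The idea named in the attachment title can be illustrated with a toy model. This is a hedged sketch, not the attached patch and not the xorg-server source: saved_ready_clients, AttendClient and mark_client_ready are names taken from this thread, while struct toy_client, TOY_MAX_CLIENTS, mark_client_saved_ready(), end_grab() and wakeup_survives_grab() are hypothetical helpers whose bodies are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only, NOT the attached patch. It assumes a fixed-size
 * array for the saved list; the real server's data structures differ. */

#define TOY_MAX_CLIENTS 8    /* hypothetical capacity for this sketch */

struct toy_client {          /* hypothetical stand-in for a client record */
    bool ready;              /* queued to have its pending request retried */
};

/* Non-zero while some other client holds a server grab. */
static int GrabInProgress;

/* Clients that became ready while a grab blocked their execution. */
static struct toy_client *saved_ready_clients[TOY_MAX_CLIENTS];
static size_t n_saved;

static void mark_client_ready(struct toy_client *c)
{
    c->ready = true;
}

/* Hypothetical helper: remember the client until the grab goes away. */
static void mark_client_saved_ready(struct toy_client *c)
{
    assert(n_saved < TOY_MAX_CLIENTS);
    saved_ready_clients[n_saved++] = c;
}

/* During a grab the client is parked on the saved list, not dropped. */
static void AttendClient(struct toy_client *c)
{
    if (GrabInProgress)
        mark_client_saved_ready(c);
    else
        mark_client_ready(c);
}

/* When the grab ends, every parked client is marked ready. */
static void end_grab(void)
{
    GrabInProgress = 0;
    for (size_t i = 0; i < n_saved; i++)
        mark_client_ready(saved_ready_clients[i]);
    n_saved = 0;
}

/* Wake a client during a grab and check it runs once the grab ends. */
bool wakeup_survives_grab(void)
{
    struct toy_client panel = { .ready = false };

    GrabInProgress = 1;      /* some other client grabs the server */
    AttendClient(&panel);    /* wake attempt happens during the grab */
    bool deferred = !panel.ready && n_saved == 1;

    end_grab();              /* grab released, parked clients resume */
    return deferred && panel.ready;
}
```

In this model wakeup_survives_grab() returns true: the wake-up is deferred while the grab is active and delivered when the grab is released, so the client's request gets its retry.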
Comment 5 Arne Spiegelhauer 2017-01-10 07:05:10 UTC
(In reply to Keith Packard from comment #4)
> Created attachment 128844
> Queue attended client to saved_ready_clients if server grab blocks execution

Thanks,
I can no longer reproduce the freeze with this patch applied.
Comment 6 Michel Dänzer 2017-01-12 01:54:07 UTC
Thanks for the report and testing, fixed in Git master and the 1.19.1 release:

commit 785053d033e73d2deb0ded4b97eabfd881991978
Author: Keith Packard <keithp@keithp.com>
Date:   Mon Jan 9 18:10:21 2017 -0800

    AttendClient of grab-pervious client must queue to saved_ready_clients [v2]
Comment 7 Michel Dänzer 2017-01-17 06:26:26 UTC
*** Bug 99433 has been marked as a duplicate of this bug. ***
Comment 8 Dmitry Eremin-Solenikov 2017-01-26 03:55:03 UTC
*** Bug 99544 has been marked as a duplicate of this bug. ***
