Bug 73599

Summary: Shutdown delay in GNOME caused by inhibitor introduced in mission-control 5.15.1
Product: Telepathy Reporter: Boyan Ding <stu_dby>
Component: mission-controlAssignee: Telepathy bugs list <telepathy-bugs>
Status: RESOLVED FIXED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal    
Priority: medium CC: suraia
Version: git master   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Release inhibitor on sleep and shutdown

Description Boyan Ding 2014-01-14 09:18:52 UTC
Solution to bug 68758 (https://bugs.freedesktop.org/show_bug.cgi?id=68758) introduced an delay inhibitor for systemd, making IM clients to disconnect before shutdown or sleep. However, the solution shipped in mission-control 5.15.1 (and the following 5.16.0) seems to cause trouble with gnome-shell (see https://bugzilla.gnome.org/show_bug.cgi?id=721605), making gnome-shell take the inhibitor upon start and never release.
This issue causes gnome-shell to stay on desktop for 5 second after hitting the shutdown button, which can be spotted in systems with newest software such as Fedora 20. The present workaround is to downgrade mission-control to versions prior to 5.14.0.
I hope some correction to be taken to eliminate the problem in future versions, or  all systems with gnome will be affected if the software is updated.
Comment 1 Michael Kuhn 2014-01-21 23:02:23 UTC
Debugging this a bit further, it seems that connectivity_monitor_dispose() is never called when shutting down the computer (or at least not soon enough). Consequently, mcd_inhibit_release() is never called and the file descriptor never closed; this causes the observed timeout.

If mission-control is killed before shutting down the computer, there is no delay because the inhibit is correctly released.
Comment 2 Michael Kuhn 2014-01-22 09:18:48 UTC
Created attachment 92573 [details] [review]
Release inhibitor on sleep and shutdown

The underlying problem seems to be that the inhibitor is not released on sleep or shutdown; renewal on wake/cancel-shutdown is attempted, though.

The attached patch fixes the problem by explicitly releasing it on sleep/shutdown and works correctly for me: The different accounts increment the reference counter when the signal is emitted and decrease it when disconnecting.

The patch should also apply without a problem to the 5.16 branch.
Comment 3 Simon McVittie 2014-01-22 11:19:26 UTC
The intended functionality is:

* when we're told we're sleeping or shutting down, we start trying to disconnect all active connections
* when all active connections have returned from their async Disconnect method (or in particular, immediately if there are no active connections), we release the inhibitor

As far as I can tell, you're changing this to:

* when we're told we're sleeping or shutting down, we start trying to disconnect all active connections (e.g. send "</stream:stream>" on XMPP)
* without waiting for that, we release the inhibitor
* if the user is unlucky, the system will sleep or shut down before the "</stream:stream>" (or whatever) has reached the server, and the server still thinks they're connected

which is not what we want?

From the log on the GNOME bug,

(process:4846): mcd-DEBUG: monitor_state_changed_cb: account idle/irc/SaadM0 must disconnect
(process:4846): mcd-DEBUG: monitor_state_changed_cb: account idle/irc/SaadM1 must disconnect
(process:4846): mcd-DEBUG: monitor_state_changed_cb: account gabble/jabber/saamalik_40gmail_2ecom0 must disconnect

Unfortunately, MC doesn't currently log the beginning and end of the Disconnect() call. Perhaps you could try this with a "disposable" Gabble account, and get the corresponding Gabble log too?

One possible refinement is: here is the life-cycle of a connection:

[0]
MC -> connection manager: RequestConnection
[1]
MC <- connection manager: RequestConnection reply
[2]
MC <-> connection manager: (some setup)
[2]
MC -> connection manager: Connect
[3]
MC <- connection manager: Connect reply
[3]

If we're in state [0], MC should release the inhibitor immediately: simple.

If we're in state [3], to make a best-effort attempt to disconnect from the server, we have to call Disconnect, and wait for it to finish before releasing the inhibitor.

If we're in state [2], we can call Disconnect to discard the connection, but we don't *have* to do that before suspending/shutting down; so we could call Disconnect on it, but release the inhibitor immediately without waiting.

If we're in state [1], we can't even call Disconnect until we reach state [2] (we don't have anything to disconnect yet!), but once again, we can release the inhibitor without waiting.
Comment 4 Michael Kuhn 2014-01-22 11:35:47 UTC
(In reply to comment #3)
> The intended functionality is:
> 
> * when we're told we're sleeping or shutting down, we start trying to
> disconnect all active connections
> * when all active connections have returned from their async Disconnect
> method (or in particular, immediately if there are no active connections),
> we release the inhibitor
> 
> As far as I can tell, you're changing this to:
> 
> * when we're told we're sleeping or shutting down, we start trying to
> disconnect all active connections (e.g. send "</stream:stream>" on XMPP)
> * without waiting for that, we release the inhibitor
> * if the user is unlucky, the system will sleep or shut down before the
> "</stream:stream>" (or whatever) has reached the server, and the server
> still thinks they're connected
> 
> which is not what we want?

As far as I can tell, it still works like this. The inhibitor is ref counted, that is, the release before sleeping/shutting down only decrements the ref count. However, the connections still hold references:

Upon disconnect, the state-change signal is emitted with the inhibitor attached to it; this is processed in mcd-account.c and trickles down to mcd-connection.c, which "holds" the inhibitor (that is, increases the ref count) and calls tp_cli_connection_call_disconnect(). When disconnect is finished, the inhibitor is "released" (that is, the ref count is decremented again).

The logs show that the last connection to disconnect decreases the ref count to 0 and actually releases the inhibitor.

Sorry for not being more clear about this. :-)
Comment 5 Simon McVittie 2014-01-22 11:41:14 UTC
Ah, OK: so the connections correctly hold and release references, but the inhibit machinery incorrectly holds 1 extra reference?

It's meant to hold 1 reference *while notifying connections*, then release it, leaving only whatever references are held by the connections.
Comment 6 Simon McVittie 2014-01-22 11:58:16 UTC
Fixed in git for 5.16.1 and 5.17.0, thanks. I added a bit more explanation to the commit message.
Comment 7 Boyan Ding 2014-01-23 02:03:55 UTC
Thanks for the really quick action. It works on my system. Looking forward to the next release of MC.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.