A whack-a-mole session revealed a couple of bugs in Wocky:
• The porter's internal flag recording that it is already closing was not set until after stream errors were reported, which led to inconsistencies if the signals' callbacks called back into the porter;
• The checks to avoid the porter repeatedly telling the connection to forcibly disconnect were inverted.
I have a WIP branch at <http://git.collabora.co.uk/?p=user/wjt/wocky.git;a=shortlog;h=refs/heads/disconnection-woes>.
Further investigation indicates that the tests aren't quite as inverted as
we thought they were originally. They're _sometimes_ inverted. So it looks
like we need to track our state a little more carefully in wocky, and we
may be trying to cram a tri-state's worth of information into two bi-states and getting it wrong.
The flag-being-set-after-signals-fired was definitely a bug though.
Ok, so, this is the root cause of the undead-dbus-names bug:
• The flag in the wocky porter saying it was in shutdown was set _after_ the remote-* signals were fired.
• This meant a callback inside wocky did not get called, so the name was never cleared from the bus (even though the object was correctly unref()d).
• The tests for the force close result were not inverted. Their purpose was not to prevent a second force close, but to prevent a callback being set up when there was no result to hold its response (i.e. to avoid an _unexpected_ force close, which shouldn't happen anyway).
• The porter can only support one forced close shutdown operation in flight at a time
• It did not check for already-in-shutdown before attempting a forced shutdown
• When the async stanza_received_cb was triggered and failed to receive a stanza it tried to start a second force close shutdown
• This caused the XMPP connection to report an error in idle with the porter as its user_data
This caused two problems:
• The porter had already been unref()d when the idle error was reported, so we died instantly when we tried to cast the user data back to a porter
• The async result had already been used and cleared by the first force close shutdown op, so the callback to the second force close op had no result to use to report back to us.
So, the fixes are:
• flip the closing flag early (before the remote-* signals)
• add a flag to indicate when a forced close has been started, and cope gracefully when a force-close op is already in flight and we request a second one
• ref the porter passed as user data to the force-close callback
• unref the porter in the force close callback
• report errors (in idle if appropriate) when a second force-close is attempted
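For anyone reading along later, the fixes above can be sketched roughly as follows. This is plain C with no GLib, and it is not the actual Wocky code: the Porter struct, its fields, and every function name here are made up for illustration, and the remote-* signals are stood in for by a single function pointer. It only shows the ordering (closing flag first, then signals), the in-flight guard, and the ref held for the callback's user data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the porter; not Wocky's real structure. */
typedef struct Porter Porter;
typedef void (*RemoteErrorCb)(Porter *porter);

struct Porter {
    int refcount;
    bool closing;                  /* set before any signal is emitted */
    bool force_close_in_flight;    /* only one forced close at a time */
    RemoteErrorCb on_remote_error; /* stands in for the remote-* signals */
    bool cb_saw_closing;           /* what a re-entrant callback observed */
    int force_closes_started;
    int force_closes_rejected;
};

static Porter *porter_ref(Porter *p) { p->refcount++; return p; }

static void porter_unref(Porter *p)
{
    assert(p->refcount > 0);
    p->refcount--;                 /* freeing at zero omitted in this sketch */
}

/* Returns true if a forced close was started, false if one was already
 * in flight. Real code would report an error here (in idle if needed). */
static bool porter_force_close(Porter *p)
{
    if (p->force_close_in_flight) {
        p->force_closes_rejected++;
        return false;
    }
    p->force_close_in_flight = true;
    p->force_closes_started++;
    porter_ref(p);                 /* keep the user data alive for the callback */
    return true;
}

/* Completion callback for the forced close: drops the ref taken above. */
static void porter_force_close_done(Porter *p)
{
    p->force_close_in_flight = false;
    porter_unref(p);
}

static void porter_handle_stream_error(Porter *p)
{
    if (p->closing)
        return;                    /* already shutting down, nothing to do */
    p->closing = true;             /* flip the flag BEFORE emitting signals */
    if (p->on_remote_error != NULL)
        p->on_remote_error(p);
    porter_force_close(p);
}

/* Example callback that calls back into the porter, as signal handlers may. */
static void reentrant_cb(Porter *p)
{
    p->cb_saw_closing = p->closing;
    porter_handle_stream_error(p); /* returns at once: flag is already set */
}
```

With reentrant_cb installed as on_remote_error, one call to porter_handle_stream_error() starts exactly one forced close, the callback sees the porter as already closing, and a later porter_force_close() is turned away without taking another ref.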
Merged to master, wocky submodule synced, all tests (including new one for this problem) pass.