Bug 55726

Summary: spice-server wrongly disconnects client
Product: Spice
Reporter: Hans de Goede <jwrdegoede>
Component: server
Assignee: Spice Bug List <spice-bugs>
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All

Description Hans de Goede 2012-10-07 13:13:14 UTC
When the agent inside the guest goes away (crash, or whatever), the client gets notified of this, but the client may try to send agent data in the window between the agent going away and the client processing that notification. Currently when we hit this race, spice-server prints: "ERROR: channel refused to allocate buffer." and then proceeds to disconnect the client. This is clearly wrong behavior. Better behavior would be for the server to drop the message (possibly with a warning) and let the client stick around.
Comment 1 Alon Levy 2012-10-07 13:24:49 UTC
Yep, I've hit this too, thanks for reporting, this is exactly what I've seen. I wasn't sure how to reproduce this. So it seems we don't track the existence of the agent correctly.
Comment 2 Alon Levy 2012-10-07 13:28:29 UTC
(In reply to comment #1)
> Yep, I've hit this too, thanks for reporting, this is exactly what I've
> seen. I wasn't sure how to reproduce this. So it seems we don't track the
> existence of the agent correctly.

Thinking about it a bit more, technically the client is doing something wrong - it's violating the token scheme. So this could possibly be something to fix in the client as well. To add a bit more data that I forgot: when I've hit this I was able to connect with a new spicec client and not have a disconnection, but not with a remote-viewer (or spicy) client. Clearly the former wasn't sending any agent messages, and the latter was - I'll hazard a guess that the message being sent was the AgentMonitorsConfig one. Regardless, either the server is not sending a tokens message (bug - it should always do that, agent or not), or the client is not tracking them correctly (bug).
Comment 3 Hans de Goede 2012-10-08 07:36:28 UTC
Hi,

(In reply to comment #2)
> Thinking about it a bit more, technically the client is doing something
> wrong - it's violating the token scheme. 

No it is not, it is perfectly possible for the client to have a token for the agent channel (i.e. it has been a while since it last sent an agent message) and to then decide to send an agent message, e.g. a clipboard related message. If, while this message is "in transit", the guest agent goes away, the server will send a message to the client to let it know the agent is gone, but as the agent message is already zipping along the network, there is nothing the client can do to stop that message from arriving! But since the server is in "no agent" state, it not only rejects the message with the "ERROR: channel refused to allocate buffer." error, it also *disconnects* the client as if the client has done something wrong. As explained, there is nothing the client can do here, so disconnecting it is wrong!
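
The ordering of events in this race can be sketched as a toy model. This is not spice-server or spice-gtk code; the class names, the message strings and the single-token starting state are all assumptions made for illustration. It only shows that the client held a valid token and believed the agent was present at the moment it sent.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """Hypothetical client: tracks its tokens and its view of the agent."""
    tokens: int = 1
    agent_connected: bool = True

    def try_send_agent_data(self, wire):
        # The client sends only when it believes it is allowed to.
        if self.agent_connected and self.tokens > 0:
            self.tokens -= 1
            wire.append("AGENT_DATA")
            return True
        return False


@dataclass
class Server:
    """Hypothetical server: counts agent data that arrives with no agent."""
    agent_connected: bool = True
    late_messages: int = 0

    def agent_went_away(self, wire_to_client):
        self.agent_connected = False
        wire_to_client.append("AGENT_DISCONNECTED")

    def receive(self, msg):
        if msg == "AGENT_DATA" and not self.agent_connected:
            self.late_messages += 1


def run_race():
    to_server, to_client = [], []
    client, server = Client(), Server()

    # 1) Client sends while it still holds a token and sees an agent.
    sent_legitimately = client.try_send_agent_data(to_server)
    # 2) The guest agent goes away; the notification is now in flight.
    server.agent_went_away(to_client)
    # 3) The client's message arrives at a server in "no agent" state.
    server.receive(to_server.pop(0))
    # 4) Only now does the client process the notification.
    if to_client.pop(0) == "AGENT_DISCONNECTED":
        client.agent_connected = False

    sends_after_notice = client.try_send_agent_data(to_server)
    return sent_legitimately, server.late_messages, sends_after_notice
```

In this model the one late message is sent before step 4, so no client-side check could have prevented it.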

> So this could possibly be something
> to fix in the client as well. To add a bit more data that I forgot, when
> I've hit this I was able to connect with a new spicec client and not have a
> disconnection, but not with a remote-viewer (or spicy) client. Clearly the
> former wasn't sending any agent messages, and the later was - I'll hazard to
> conclude the message being sent was the AgentMonitorsConfig one. Regardless
> either the server is not sending a tokens message (bug - it should always do
> that, agent or not), or the client is not tracking them correctly (bug).

This is not what I'm seeing, what I'm seeing is:
1) repeated guest agent opening / closing of the virtio serial port, caused by using an agent which does not yet have this fix: http://cgit.freedesktop.org/spice/linux/vd_agent/commit/?id=9a58d8ee70c13677a1b62a2c8af694829c7afec5

2) That triggering the race I described above (it could even be the agent hello message which is triggering this).

After this has happened, simply re-connecting with remote-viewer works fine, iow not the same as what you're seeing.

Regards,

Hans
Comment 4 Alon Levy 2012-10-08 16:40:16 UTC
> This is not what I'm seeing, what I'm seeing is:
> 1) repeated guest agent opening / closing of the virtio serial port, caused
> by using an agent which does not yet have this fix:
> http://cgit.freedesktop.org/spice/linux/vd_agent/commit/?id=9a58d8ee70c13677a1b62a2c8af694829c7afec5
> 
> 2) That triggering the race I described above (it could even be the agent
> hello message which is triggering this).
> 
> After this has happened, simply re-connecting with remote-viewer works fine,
> iow not the same as what you're seeing.

OK, I agree it is not what I am seeing. The "channel refused to allocate buffer" on main channel happens only for SPICE_VDAGENT_DATA messages when there are no tokens. Maybe it's a server error like you say. What I saw is a separate case.

Regardless of whether there is an accounting error, we need to decide if we want to disconnect a client that violates the token scheme or not. I think disconnecting it makes sense but otherwise we just need to track the current message until it completes so as to not pass part of it accidentally when tokens arrive.
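
The "track the current message until it completes" option could look roughly like the sketch below. It is a hedged illustration only: the class name, the `on_chunk` signature, and the assumption that each chunk's caller knows the total length of the agent message it belongs to are hypothetical, not taken from spice-server.

```python
class AgentMessageFilter:
    """Hypothetical filter: once a message starts being dropped, drop all of it."""

    def __init__(self):
        self.remaining = 0       # bytes still expected for the current message
        self.discarding = False  # whether the current message is being dropped

    def on_chunk(self, chunk_len, msg_total_len, has_token):
        """Return True to forward the chunk, False to drop it.

        msg_total_len is only consulted on the first chunk of a message.
        """
        if self.remaining == 0:
            # First chunk of a new message: decide once, for the whole message.
            self.remaining = msg_total_len
            self.discarding = not has_token
        self.remaining = max(0, self.remaining - chunk_len)
        return not self.discarding
```

The point of deciding only on the first chunk is that a token arriving mid-message does not let the tail of a partly dropped message through.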

> 
> Regards,
> 
> Hans
Comment 5 Hans de Goede 2012-10-08 20:15:47 UTC
(In reply to comment #4)
> > This is not what I'm seeing, what I'm seeing is:
> > 1) repeated guest agent opening / closing of the virtio serial port, caused
> > by using an agent which does not yet have this fix:
> > http://cgit.freedesktop.org/spice/linux/vd_agent/commit/?id=9a58d8ee70c13677a1b62a2c8af694829c7afec5
> > 
> > 2) That triggering the race I described above (it could even be the agent
> > hello message which is triggering this).
> > 
> > After this has happened, simply re-connecting with remote-viewer works fine,
> > iow not the same as what you're seeing.
> 
> OK, I agree it is not what I am seeing. The "channel refused to allocate
> buffer" on main channel happens only for SPICE_VDAGENT_DATA messages when
> there are no tokens.

It does not only happen when there are no tokens; it also happens when there is no agent. What we need to do is differentiate between the two, and not disconnect the client when it sends an agent message while there is no agent, on the assumption that the client has hit the described race.
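
The distinction proposed here can be written down as a small decision function. This is a sketch under stated assumptions, not the actual fix: the `Action` enum, the function name and its two parameters are hypothetical, and whether a real token violation should lead to a disconnect is still the open question from comment 4.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"        # pass the data on to the agent
    DROP = "drop"              # discard, possibly with a warning
    DISCONNECT = "disconnect"  # treat as a protocol violation


def handle_agent_data(agent_connected, client_tokens):
    """Decide what to do with agent data received from the client."""
    if not agent_connected:
        # Most likely the race: the client sent before it learned
        # that the agent had gone away. Not the client's fault.
        return Action.DROP
    if client_tokens <= 0:
        # Agent is present but the client had no token left.
        return Action.DISCONNECT
    return Action.FORWARD
```

Checking for the missing agent first matters: with the checks in the other order, a client caught in the race with zero tokens left would still be disconnected.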
Comment 6 Marc-Andre Lureau 2014-11-03 23:35:22 UTC
commit 655f8c440dbb57696aa8beec087f75d6748be11a
Author: Yonit Halperin <yhalperi@redhat.com>
Date:   Fri Nov 30 11:15:01 2012 -0500

    agent: fix mishandling of agent data received from the client after agent disconnection
