Bug 95416

Summary: SIGSEGV - spice-server - on file transfer while killing the agent
Product: Spice
Reporter: Victor Toso <bugzilla>
Component: server
Assignee: Spice Bug List <spice-bugs>
Status: RESOLVED FIXED
Severity: normal
Priority: medium
CC: bugzilla
Version: unspecified
Hardware: Other
OS: All

Description Victor Toso 2016-05-16 06:08:18 UTC
Not 100% reproducible here. I had to kill the agent and start the file transfer three or four times to make it crash.

Program received signal SIGSEGV, Segmentation fault.
0x00007f7bd3e5d838 in reds_get_agent_data_buffer (reds=<optimized out>, mcc=<optimized out>, size=32) at reds.c:1121
1121        return dev->priv->recv_from_client_buf->buf + sizeof(VDIChunkHeader);
(gdb) bt
#0  0x00007f7bd3e5d838 in reds_get_agent_data_buffer (reds=<optimized out>, mcc=<optimized out>, size=32) at reds.c:1121
#1  0x00007f7bd3e37c96 in red_channel_client_receive (handler=0x55891df6a110, stream=0x55891ca9e300) at red-channel.c:269
#2  0x00007f7bd3e37c96 in red_channel_client_receive (rcc=rcc@entry=0x55891df66000) at red-channel.c:324
#3  0x00007f7bd3e3a3dc in red_channel_client_event (fd=<optimized out>, event=1, data=0x55891df66000) at red-channel.c:1584
#4  0x000055891a4c0f96 in qemu_iohandler_poll ()
#5  0x000055891a4c0bb1 in main_loop_wait ()
#6  0x000055891a24a064 in main ()
Comment 1 Victor Toso 2016-05-16 06:18:42 UTC
Interesting part of the log is

Spice-INFO: reds.c:3214:spice_server_char_device_add_interface: CHAR_DEVICE vdagent
(process:17063): Spice-DEBUG: char-device.c:694:red_char_device_reset_dev_instance: sin 0x5598a9458d58, char device 0x5598a945d150
(process:17063): Spice-DEBUG: char-device.c:810:red_char_device_start: char device 0x5598a945d150
main_channel_handle_parsed: agent start

(...)

__red_char_device_write_buffer_get: token violation: dev 0x5598a945d150 client 0x5598a949f780
red_char_device_handle_client_overflow: dev 0x5598a945d150 client 0x5598a944c3a0
Comment 2 Victor Toso 2016-05-16 06:32:13 UTC
spice is latest master, qemu is not (fedora 23)
Comment 3 Fabiano Fidêncio 2016-05-16 06:49:52 UTC
(In reply to Victor Toso from comment #2)
> spice is latest master, qemu is not (fedora 23)

As things are changing fast on spice side, let's just be a little bit more specific here:
spice: 2c3fc80e518
qemu: qemu-2.4.1-9.fc23
Comment 4 Victor Toso 2016-05-16 10:59:51 UTC
Sorry for the lack of information.

* Guest was Fedora 23
* Not 100% reproducible but quite often

Steps:
1) Connect to the Guest
2) Start a file transfer (I often use an ISO so I have time to kill the agent)
3) Kill spice-vdagentd

spice-server might crash as soon as you kill the agent.
Comment 5 Frediano Ziglio 2016-05-19 13:52:10 UTC
Calls to red_char_device_write_buffer_get can return NULL (this is confirmed by the "token violation: dev 0x5598a945d150 client 0x5598a949f780").
Not all paths in spice-server check this.
For the agent, this is due to recent changes: when the agent is shut down, the CharDevice is no longer reset (previously it was destroyed).
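
To illustrate the kind of check that is missing on some paths, here is a minimal sketch. It is not the spice-server code and not the posted patch: `SketchDevice`, `SketchWriteBuffer`, `sketch_write_buffer_get`, `sketch_get_data_buffer`, `sketch_release` and `SKETCH_HEADER_SIZE` are all hypothetical stand-ins, and the `can_allocate` flag only models the "token violation" case. The point is the shape of the caller: test the getter's result before dereferencing it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real spice-server types. */
#define SKETCH_HEADER_SIZE 8  /* assumed size, for illustration only */

typedef struct SketchWriteBuffer {
    uint8_t *buf;
} SketchWriteBuffer;

typedef struct SketchDevice {
    int can_allocate;             /* 0 models the "token violation" case */
    SketchWriteBuffer *recv_buf;  /* buffer currently being filled */
} SketchDevice;

/* Models a getter that may legitimately fail and return NULL. */
SketchWriteBuffer *sketch_write_buffer_get(SketchDevice *dev, size_t size)
{
    if (!dev->can_allocate) {
        return NULL;
    }
    SketchWriteBuffer *wb = malloc(sizeof(*wb));
    if (wb == NULL) {
        return NULL;
    }
    wb->buf = malloc(size);
    if (wb->buf == NULL) {
        free(wb);
        return NULL;
    }
    return wb;
}

/* Checks the getter's result before dereferencing it. */
uint8_t *sketch_get_data_buffer(SketchDevice *dev, size_t size)
{
    dev->recv_buf = sketch_write_buffer_get(dev, size + SKETCH_HEADER_SIZE);
    if (dev->recv_buf == NULL) {
        return NULL;  /* the caller has to drop or reject the message */
    }
    return dev->recv_buf->buf + SKETCH_HEADER_SIZE;
}

/* Frees the current buffer, if any. */
void sketch_release(SketchDevice *dev)
{
    if (dev->recv_buf != NULL) {
        free(dev->recv_buf->buf);
        free(dev->recv_buf);
        dev->recv_buf = NULL;
    }
}
```

With `can_allocate` set to 0, `sketch_get_data_buffer` returns NULL instead of computing a pointer from a NULL buffer, which is the dereference seen at reds.c:1121 in the backtrace.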

Discussed with Uri about doing a "soft reset" instead of not resetting the state at all.
Comment 6 Frediano Ziglio 2016-05-31 08:58:23 UTC
Note that red_char_device_write_buffer_get can return NULL when flow control is enabled. For the agent (see the red_char_device_client_add calls in reds.c) flow control is enabled, so this is probably not a regression but rather a code path that is now exercised more often (some change made the issue more likely to trigger).
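
As a rough model of why flow control makes a failed request possible, the sketch below tracks per-client tokens. It is a simplification under stated assumptions, not the CharDevice implementation: `SketchClient`, `sketch_take_token` and `sketch_return_tokens` are hypothetical names, and the real accounting is more involved.

```c
#include <stdbool.h>

/* Hypothetical model of per-client token accounting; not spice-server code. */
typedef struct SketchClient {
    bool flow_control;  /* when false, requests are never refused here */
    unsigned tokens;    /* buffers the client may still request */
} SketchClient;

/* Returns true if a buffer may be handed out, consuming one token. */
bool sketch_take_token(SketchClient *c)
{
    if (!c->flow_control) {
        return true;
    }
    if (c->tokens == 0) {
        return false;   /* models the case where the getter returns NULL */
    }
    c->tokens--;
    return true;
}

/* Gives tokens back once the device has consumed the data. */
void sketch_return_tokens(SketchClient *c, unsigned n)
{
    c->tokens += n;
}
```

In this model a client that keeps sending while no tokens are returned (for example because the agent is gone) eventually gets a refusal, so every caller has to handle that outcome.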
Comment 7 Frediano Ziglio 2016-05-31 10:29:47 UTC
Posted a patch at https://lists.freedesktop.org/archives/spice-devel/2016-May/029730.html
Comment 8 Frediano Ziglio 2016-05-31 15:44:17 UTC
I merged the patch and it no longer crashes.
However, I think it would be worth checking whether this problem was introduced recently (a regression) or is present even in the stable version. Steps I used:
- client fedora 24 with master (5d2fb6a89745767ad22ec60d4aa099e2301ca606)
- guest rhel7
- graphical login into guest
- start transferring
- kill spice-vdagentd (I used an ssh root console)
- attempt to copy the file again (daemon still stopped, same logged-in graphical session)
Comment 9 Frediano Ziglio 2016-05-31 16:26:28 UTC
The problem was introduced in 0.13.1, by patch 1cec1c5118b65124de6bc6f984f376ff4e297bfb.
It was possible even before when using old spice-gtk clients that do not implement the AGENT_CONNECTED_TOKENS message. The CharDevice lifespan was changed, hence the error in the current code; with the old code the char device was destroyed when the daemon was killed, so the code never reached the path that crashed.
