spice-gtk version spice-gtk-0.11 usb-redir version 0.4.3 It can be reproduce,I plug and unplug about 30 times,and this coredump occured. I guest it crash when logging below is the coredump info. if you use export SPICE_DEBUG=1, it will crash more frequently. GSpice-Message: main channel: opened going to redirect usb device 2 (noc-client:1916): GSpice-CRITICAL **: usbredirhost error: resetting device: -5 libusb:error [_open_sysfs_attr] open /sys/bus/usb/devices/1-3/descriptors failed ret=-1 errno=2 (noc-client:1916): GSpice-CRITICAL **: usbredirhost error: resetting device: -5 libusb:error [_open_sysfs_attr] open /sys/bus/usb/devices/2-3/descriptors failed ret=-1 errno=2 (noc-client:1916): GSpice-CRITICAL **: usbredirhost error: resetting device: -5 libusb:error [_open_sysfs_attr] open /sys/bus/usb/devices/2-3/descriptors failed ret=-1 errno=2 (noc-client:1916): GSpice-CRITICAL **: usbredirhost error: submitting bulk transfer on ep 02: -4 (noc-client:1916): GSpice-CRITICAL **: usbredirhost error: submitting bulk transfer on ep 02: -4 (noc-client:1916): GSpice-CRITICAL **: usbredirhost error: submitting bulk transfer on ep 81: -4 Segmentation fault (core dumped) noc@ubuntu:~$ gdb /usr/local/bin/noc-client core GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08 Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". For bug reporting instructions, please see: <http://bugs.launchpad.net/gdb-linaro/>... Reading symbols from /usr/local/bin/noc-client...done. [New LWP 1916] [New LWP 1995] warning: Can't read pathname for load map: Input/output error. [Thread debugging using libthread_db enabled] Core was generated by `spicy -h 172.16.0.7 -p 6100'. Program terminated with signal 11, Segmentation fault. #0 usbredirhost_bulk_packet_complete (libusb_transfer=0x2248880) at usbredirhost.c:1934 1934 usbredirhost.c: No such file or directory. in usbredirhost.c (gdb) bt #0 usbredirhost_bulk_packet_complete (libusb_transfer=0x2248880) at usbredirhost.c:1934 #1 0x00007fec7e32343a in usbredirparser_call_type_func (parser_pub=0x22492d0) at usbredirparser.c:673 #2 usbredirparser_do_read (parser_pub=0x22492d0) at usbredirparser.c:777 #3 0x0000000000439691 in usbredir_handle_msg (c=<optimized out>, in=0x25044a0) at channel-usbredir.c:609 #4 0x000000000042aa04 in spice_channel_recv_msg (channel=0x2248880, msg_handler=0x4390f0 <spice_usbredir_handle_msg>, data=0x0) at spice-channel.c:1835 #5 0x000000000042b451 in spice_channel_iterate_read (channel=0x2248880) at spice-channel.c:1997 #6 spice_channel_iterate_read (channel=0x2248880) at spice-channel.c:1981 #7 0x00000000004291d4 in spice_channel_iterate (channel=0x2248880) at spice-channel.c:2055 #8 spice_channel_coroutine (data=0x2248880) at spice-channel.c:2208 #9 0x00000000004bc65b in coroutine_trampoline (cc=<optimized out>) at coroutine_ucontext.c:56 #10 0x00000000004c8163 in continuation_trampoline (i0=<optimized out>, i1=<optimized out>) at continuation.c:49 #11 0x00007fec7bdaf180 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #12 0x0000000002248d00 in ?? () #13 0x0000000000000000 in ?? () (gdb) p host $1 = <optimized out> (gdb) p host->log_func value has been optimized out (gdb) p host->lock value has been optimized out (gdb) p libusb_transfer->buffer $2 = (unsigned char *) 0x0 (gdb) p libusb_transfer->actual_length $3 = 0 I dont know why the pointer libusb_transfer->buffer is zero, I saw hot->log_func is 0 at previous core dump.
I rename spicy to noc-client
core diffs from time to time (gdb) bt #0 usbredirhost_handle_disconnect (host=0x0) at usbredirhost.c:239 #1 0x00007ffff64792d5 in libusb_status_or_error_to_redir_status (host=<optimized out>, status=<optimized out>) at usbredirhost.c:284 #2 0x00007ffff647a354 in usbredirhost_bulk_packet_complete (libusb_transfer=0x7fffd80010d8) at usbredirhost.c:1950 #3 0x00007ffff606443a in usbredirparser_call_type_func (parser_pub=0x798830) at usbredirparser.c:673 #4 usbredirparser_do_read (parser_pub=0x798830) at usbredirparser.c:777 #5 0x0000000000439691 in usbredir_handle_msg (c=<optimized out>, in=0x7fffd8001160) at channel-usbredir.c:609 #6 0x000000000042aa04 in spice_channel_recv_msg (channel=0x7968d0, msg_handler=0x4390f0 <spice_usbredir_handle_msg>, data=0x0) at spice-channel.c:1835 #7 0x000000000042b451 in spice_channel_iterate_read (channel=0x7968d0) at spice-channel.c:1997 #8 spice_channel_iterate_read (channel=0x7968d0) at spice-channel.c:1981 #9 0x00000000004291d4 in spice_channel_iterate (channel=0x7968d0) at spice-channel.c:2055 #10 spice_channel_coroutine (data=0x7968d0) at spice-channel.c:2208 #11 0x00000000004bc65b in coroutine_trampoline (cc=<optimized out>) at coroutine_ucontext.c:56 #12 0x00000000004c8163 in continuation_trampoline (i0=<optimized out>, i1=<optimized out>) at continuation.c:49 #13 0x00007ffff3af0180 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #14 0x0000000000796d50 in ?? () #15 0x0000000000000000 in ?? () (gdb) p host->disconnect_lock Cannot access memory at address 0x10
Hi, Hmm, it seems as if sometimes some parts of usbredir internals get overwritten with zeros. Could be a use after free scenario, but you're the only one reporting this! Which distro are you using? And which version of spice-gtk, libusb and usbredir do you have ? Regards, Hans
Created attachment 68294 [details] [review] Potential fix I hit a similar issue as yours today, and I've created a patch for this. Hopefully this also fixes your issue, can you give the attached patch a try please?
Thanks a lot, but it still happens, I know your patch is for the latest code from the git, newer than 0.14, I update spice-gtk and patched, and I still use usb-redir-0.4.3, need I update usb-redir to 0.5.2 and qemu to master branch to fix this problem? now the coredump info is like below Core was generated by `noc-client -h 192.168.11.185 -p 6100'. Program terminated with signal 6, Aborted. #0 0x00007fa13e2b63a5 in ?? () (gdb) bt #0 0x00007fa13e2b63a5 in ?? () #1 0x00007fa13e2b9b0b in ?? () #2 0x0000000000000000 in ?? () it's appear that the stack has been overwrited.
(In reply to comment #5) > Thanks a lot, but it still happens, I know your patch is for the latest code > from the git, newer than 0.14, I update spice-gtk and patched, and I still > use usb-redir-0.4.3, need I update usb-redir to 0.5.2 and qemu to master > branch to fix this problem? You could try updating usb-redir to 0.4.4, but I don't think that will fix it (there is one possible fix in there which might help, but it is unlikely). A better bet would be to update your libusb to the latest libusbx release: http://sourceforge.net/projects/libusbx/files/releases/1.0.14/ Note that the libusbx-1.0.x releases are drop in bugfix releases for libusb-1.0, A group of people (including me) forked libusb as the original upstream was failing to do releases for a period of over 2 years. You could try building the latest qemu, in that case please use this git tree: http://cgit.freedesktop.org/~jwrdegoede/qemu/log/?h=qemu-kvm-1.2-usbredir And it that case you do need usbredir-0.5.x (where as if you stay with an older qemu you need 0.4.x) But I don't think updating qemu is useful since this clearly is a bug on the client side, not the host.
still coredump libusb-1.0.14 usbredir 0.5.2 spice-gtk, patched version can you tell me what version you are using? noc@ubuntu:~$ gdb /usr/local/bin/spicy core GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08 Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". For bug reporting instructions, please see: <http://bugs.launchpad.net/gdb-linaro/>... Reading symbols from /usr/local/bin/spicy...done. [New LWP 4006] [New LWP 3958] warning: Can't read pathname for load map: Input/output error. [Thread debugging using libthread_db enabled] Core was generated by `spicy -h 192.168.23.14 -p 6100'. Program terminated with signal 11, Segmentation fault. #0 list_del (entry=0x125d048) at libusbi.h:112 112 entry->next->prev = entry->prev; (gdb) bt #0 list_del (entry=0x125d048) at libusbi.h:112 #1 usbi_handle_transfer_completion (itransfer=<optimized out>, status=LIBUSB_TRANSFER_NO_DEVICE) at io.c:1460 #2 0x00007f86dbc2497c in usbi_handle_disconnect (handle=0x129eec0) at io.c:2447 #3 0x00007f86dbc269c5 in op_handle_events (ctx=0xf9edf0, fds=0x7f86c8002e90, nfds=3, num_ready=0) at os/linux_usbfs.c:2490 #4 0x00007f86dbc235dc in handle_events (ctx=0xf9edf0, tv=<optimized out>) at io.c:1940 #5 0x00007f86dbc241d7 in libusb_handle_events_timeout_completed (ctx=0xf9edf0, tv=<optimized out>, completed=0x0) at io.c:2020 #6 0x00007f86dbc242cf in libusb_handle_events (ctx=<optimized out>) at io.c:2096 #7 0x00007f86df30ff50 in spice_usb_device_manager_usb_ev_thread (user_data=<optimized out>) at usb-device-manager.c:867 #8 0x00007f86ddab02b6 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0 #9 0x00007f86dd5acefc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #10 0x00007f86dd2e759d in clone () from /lib/x86_64-linux-gnu/libc.so.6 #11 0x0000000000000000 in ?? () p entry->prev $1 = (struct list_head *)0x0
Hi, You've hit the jackpot with that last backtrace, you seem to be hitting a bug in libusb's disconnection handling. I need to make some time to think about how to best fix this, but in the mean time, can you please try editing libusb/os/linux_usbfs.c, around line 2490 you should have the following code block: if (pollfd->revents & POLLERR) { usbi_remove_pollfd(HANDLE_CTX(handle), hpriv->fd); usbi_handle_disconnect(handle); continue; } If I'm right simply removing that entire code block, and then doing a make; make install, should fix the issue you are seeing, that might in the end also be the best fix, but as said I need to investigate this a bit more. Regards, Hans
yes, It did work, no crash was found during plug and unplug, but cause spicy hang, I have to kill spicy.
Hi, I've not forgotten about this, I've merely been busy with other stuff. So I've spend today reading all your backtraces and I believe I know what is going now, here is a rather detailed step by step of the race I believe you're hitting: 1) app calls libusb_submit_transfer 2) libusb_submit_transfer locks itransfer->lock 3) libusb_submit_transfer adds the transfer to the flying list 4) *thread switch* 5) other thread notices POLL_ERR on device fd, calls usbi_handle_disconnect 6) usbi_handle_disconnect find the transfer which is in progress of being submitted in the flying list 7) usbi_handle_disconnect calls usbi_backend->clear_transfer_priv on the transfer, this blocks waiting on itransfer->lock (this makes the race more likely to trigger) 8) *thread switch* 9) libusb_submit_transfer actually tries to submit the transfer now, calls usbi_backend->submit_transfer, which fails with -ENODEV 10) libusb_submit_transfer *removes* the transfer from the flying list, unlocks itransfer->lock and returns an error to its caller 11) the caller frees the transfer, meaning the to_cancel pointer in usbi_handle_disconnect now points to free-ed memory, for extra mayhem 12) *thread switch* 13) usbi_handle_disconnect calls usbi_handle_transfer_completion 14) usbi_handle_transfer_completion tries to remove the transfer from the flying list *for the 2nd time* But the first call done from libusb_submit_transfer has already done this. list_del looks like this: static inline void list_del(struct list_head *entry) { entry->next->prev = entry->prev; entry->prev->next = entry->next; entry->next = entry->prev = NULL; } So the first call sets it next and prev to NULL, and then the 2nd call tries to deref next -> BOOM For an example of this, see: https://bugs.freedesktop.org/show_bug.cgi?id=55619#c7 So this clearly makes this a libusb issue. I've a plan to fix this, which I hope to finish tonight and then give you a patch to test. Regards, Hans
Created attachment 78821 [details] [review] libusbx patch: usbi_handle_disconnect: Fix race condition Hi, Here is a patch for libusbx which should fix this, can you try building a recent libusbx with this patch and see if it fixes things? Thanks, Hans
(In reply to comment #11) > Created attachment 78821 [details] [review] [review] > libusbx patch: usbi_handle_disconnect: Fix race condition > > Hi, > > Here is a patch for libusbx which should fix this, can you try building a > recent libusbx with this patch and see if it fixes things? > > Thanks, > > Hans Thanks a lot, I'll try it soon.
clydezhou, any update? Hans, is the fix still required?
(In reply to Marc-Andre Lureau from comment #13) > clydezhou, any update? > > Hans, is the fix still required? My reponsibility shifts to another department, I do not need this fix anymore, but I think it's a improvement of user experience and the quality of whole spice-project, thank you very much
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.