Bug 98617

Summary: unloading module-remap-sink while module-combine-sink is using it causes Assertion '*_head == _item' failed at modules/module-combine-sink.c:818, function output_remove_within_thread().
Product: PulseAudio Reporter: Julius Michaelis <freebugs>
Component: modulesAssignee: pulseaudio-bugs
Status: RESOLVED FIXED QA Contact: pulseaudio-bugs
Severity: normal    
Priority: medium CC: lennart
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 96750    

Description Julius Michaelis 2016-11-06 17:58:57 UTC
Reproduction:
Bug can be invoked by loading a (one) remap sink on an actual sink, and then loading a combine sink based on that, and then unloading the remap sink, e.g.:

UDJ6=$(insert some actual sink here)
remaps=$(pactl load-module module-remap-sink sink_name=udj6-3 sink_properties=device.description='UDJ6-3' remix=no master="$UDJ6" channels=2 master_channel_map=front-center,lfe channel_map=front-left,front-right)
pactl load-module module-combine-sink sink_name=udj6 sink_properties=device.description='UDJ6' slaves=udj6-3
pactl unload-module $remaps
# there may be other ways

Version: 9.0 (compiled both in release-mode as per gentoo's portage, and in debug mode as per README)

Stack trace:
#0  0x00007ff6c363c118 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ff6c363d56a in __GI_abort () at abort.c:89
#2  0x00007ff6a23b7829 in output_remove_within_thread (o=0x2078b60) at modules/module-combine-sink.c:818
#3  0x00007ff6a23b7c2a in sink_process_msg (o=0x2082630, code=26, data=0x2078b60, offset=0, chunk=0x0) at modules/module-combine-sink.c:901
#4  0x00007ff6c6d8402a in pa_asyncmsgq_dispatch (object=0x2082630, code=26, userdata=0x2078b60, offset=0, memchunk=0x7ff6a23afd40) at pulsecore/asyncmsgq.c:318
#5  0x00007ff6c6daba49 in asyncmsgq_read_work (i=0x203c510) at pulsecore/rtpoll.c:566
#6  0x00007ff6c6daa775 in pa_rtpoll_run (p=0x203d510) at pulsecore/rtpoll.c:236
#7  0x00007ff6a23b5971 in thread_func (userdata=0x20468c0) at modules/module-combine-sink.c:354
#8  0x00007ff6c644f3d8 in internal_thread_func (userdata=0x20352c0) at pulsecore/thread-posix.c:81
#9  0x00007ff6c40b0394 in start_thread (arg=0x7ff6a23b0700) at pthread_create.c:333
#10 0x00007ff6c36f0eed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Background info:
I wanted to make some remap sinks automatically disappear when unplugging a USB sink, so I added
diff -aur pulseaudio-9.0-orig/src/modules/module-remap-sink.c pulseaudio-9.0/src/modules/module-remap-sink.c
--- pulseaudio-9.0-orig/src/modules/module-remap-sink.c	2016-05-03 08:17:39.000000000 +0200
+++ pulseaudio-9.0/src/modules/module-remap-sink.c	2016-11-06 17:28:37.196938338 +0100
@@ -423,7 +423,7 @@
     pa_proplist_sets(sink_input_data.proplist, PA_PROP_MEDIA_ROLE, "filter");
     pa_sink_input_new_data_set_sample_spec(&sink_input_data, &ss);
     pa_sink_input_new_data_set_channel_map(&sink_input_data, &stream_map);
-    sink_input_data.flags = (remix ? 0 : PA_SINK_INPUT_NO_REMIX);
+    sink_input_data.flags = (remix ? 0 : PA_SINK_INPUT_NO_REMIX) | PA_SINK_INPUT_DONT_MOVE;
     sink_input_data.resample_method = resample_method;
(The bug is invokable without this.)
That seems to work nicely, but in turn the combine sink I have on top of my three remap sinks fails. I added some debug output to output_remove_within_thread:
    pa_log("outputs list: around %p", o);
    PA_LLIST_FOREACH(it, o->userdata->thread_info.active_outputs)
        pa_log("p%p <- p%p -> p%p",it->prev,it,it->next);
Result:
E: [combine] module-combine-sink.c: outputs list: around 0x2078b60
E: [combine] module-combine-sink.c: p(nil) <- p0x20711d0 -> p0x20788a0
E: [combine] module-combine-sink.c: p0x20711d0 <- p0x20788a0 -> p0x2078b60
E: [combine] module-combine-sink.c: p0x20788a0 <- p0x2078b60 -> p(nil)
E: [combine] module-combine-sink.c: outputs list: around 0x2078b60
E: [combine] module-combine-sink.c: p(nil) <- p0x20711d0 -> p0x20788a0
E: [combine] module-combine-sink.c: p0x20711d0 <- p0x20788a0 -> p(nil)
E: [combine] module-combine-sink.c: Assertion '*_head == _item' failed at modules/module-combine-sink.c:818, function output_remove_within_thread().

If I check whether the module is in the list and only remove it if so, three outputs are removed twice. When attempting to play something again, the output has significant crackling in it.
Comment 1 Tanu Kaskinen 2016-11-06 18:15:06 UTC
Since this is a crash bug, I'm marking this as a release blocker.
Comment 2 Tanu Kaskinen 2016-12-07 14:55:51 UTC
I was able to reproduce the crash with the provided script. In case someone else tries to reproduce this too, note that I first failed to reproduce when I issued the commands manually, so it's best to use the script. I would guess the manual commands failed to cause a crash, because the sinks had time to suspend before running the unload-module command.

Here are my investigation results so far:

When module-remap-sink is unloaded, it has to remove the virtual sink input that connects it to the master sink. When the "detach" callback of the sink input is called, module-remap-sink propagates the detach call to all sink inputs connected to it, so all sink inputs connected to the sink tree get detached. Then later, when module-combine-sink removes its virtual sink input, the detach callback gets called again. module-combine-sink assumes that the detach callback will be called only once, and the result of an incorrect assumption is a crash.

I think it's reasonable to assume that the detach callback won't be called for streams that are already detached. I'll try if skipping the detach callback for already-detached sink inputs works.
Comment 3 Tanu Kaskinen 2016-12-07 17:45:17 UTC
The fix works.

While thinking about the problem, however, I realized that this situation with double-detaching should ideally not happen. I think the combine sink should remove its connection to the remap sink before the remap sink gets detached. Removing the combine sink's input involves sending messages to the root sink's IO thread, and doing that after the remap sink has detached feels unsafe and wrong. I'll submit the patch anyway - fixing the message sending problem would need substantial changes to all filter sinks, and I don't want to do that now.
Comment 5 Tanu Kaskinen 2016-12-19 23:43:52 UTC
I committed the patches now.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.