Changing the card profile via "set-card-profile" causes a crash in the PulseAudio daemon when the echo cancellation module is active for a running program with an open audio stream i.e. the program is started with:
If the echo cancellation filter is not requested, then the set-card-profile works fine.
Here is the upstream bug report that has all the abrt attachments: https://bugzilla.redhat.com/show_bug.cgi?id=1198661.
Fedora retrace: https://retrace.fedoraproject.org/faf/reports/586989/
Version-Release number of selected component:
cmdline: pulseaudio -D
runlevel: N 5
Thread no. 1 (10 frames)
#0 pa_atomic_load at pulsecore/atomic.h:57
#1 pa_asyncmsgq_send at pulsecore/asyncmsgq.c:143
#2 pa_sink_input_start_move at pulsecore/sink-input.c:1618
#3 pa_sink_input_move_to at pulsecore/sink-input.c:1891
#4 sink_unlink_hook_callback at modules/module-rescue-streams.c:107
#5 pa_hook_fire at pulsecore/hook-list.c:106
#6 pa_sink_unlink at pulsecore/sink.c:693
#7 sink_input_kill_cb at modules/echo-cancel/module-echo-cancel.c:1410
#8 pa_sink_move_all_finish at pulsecore/sink.c:924
#9 card_set_profile at modules/alsa/module-alsa-card.c:262
I'm marking this as a release blocker for now. It sounds like a reproducible crash, so it shouldn't be too hard to get fixed.
Thank you. Yes it is easily reproducible.
Created attachment 115945 [details] [review]
Could you check if the attached patch resolves the problem?
In case it does not, could you start PulseAudio with log-level debug (e g pulseaudio -vvvv), reproduce the bug with the log running, then attach the resulting log here? Thanks!
Created attachment 115957 [details]
Pulseaudio debug log.
(In reply to David Henningsson from comment #3)
> Created attachment 115945 [details] [review] [review]
> Could you check if the attached patch resolves the problem?
> In case it does not, could you start PulseAudio with log-level debug (e g
> pulseaudio -vvvv), reproduce the bug with the log running, then attach the
> resulting log here? Thanks!
No, pulseaudio is still crashing:
$ pulseaudio -vvvv > pulseaudio.log 2>&1
Segmentation fault (core dumped)
$ pactl set-card-profile alsa_card.pci-0000_00_1b.0 output:analog-surround-40+input:analog-stereo
Connection failure: Connection terminated
Debug log attached.
(In reply to Raman Gupta from comment #5)
> No, pulseaudio is still crashing:
Standby, I realized I didn't patch PA properly.
(In reply to Raman Gupta from comment #6)
> (In reply to Raman Gupta from comment #5)
> > No, pulseaudio is still crashing:
> Standby, I realized I didn't patch PA properly.
Nope, segmentation fault again, with the patch. Updated log will be attached.
Created attachment 115958 [details]
Created attachment 115968 [details] [review]
Fix two crashes
New day, new patch, I think this one is more likely to actually fix something...
I wasn't able to reproduce exactly your crash, but I got a similar one that think I found something that could cause it.
The patch fixes two different crashes
- one about setting asyncmsgq to NULL (bad idea, instead we just "park it somewhere")
- one about trying to set the name when impossible to do so accurately (this is the one I got)
Could you try it? And if it still crashes, check if the backtrace is the same or if it perhaps crashed for another reason this time.
(In reply to David Henningsson from comment #9)
> Created attachment 115968 [details] [review] [review]
> Fix two crashes
> New day, new patch, I think this one is more likely to actually fix
> I wasn't able to reproduce exactly your crash, but I got a similar one that
> think I found something that could cause it.
> The patch fixes two different crashes
> - one about setting asyncmsgq to NULL (bad idea, instead we just "park it
> - one about trying to set the name when impossible to do so accurately
> (this is the one I got)
> Could you try it? And if it still crashes, check if the backtrace is the
> same or if it perhaps crashed for another reason this time.
Thanks David for trying to solve this.
I tried the new patch (both with and without the earlier patch). In both cases, PA segfaulted again. The backtrace seems to be different than before though:
I will attach the abrt data directory for the crash.
Created attachment 115976 [details]
abrt dump for SEGFAULT with two_crashes.patch
Just to summarise the problem, here's what's happening:
0. We're running a playback and capture stream, which caused module-echo-cancel to be autoloaded
1. We disable the card profile on which the echo canceller is running, making the underlying sink go away
2. At the end of the profile change, the m-e-c sink-input gets killed (since it said it could not be moved, which is what we do in the autoload case)
3. This makes us unload the m-e-c module, at which time we try to move the original client stream away
4. At this point, there is no underlying sink for m-e-c, so the asyncmsgq is NULL, and any messages that might be sent during the move will cause a crash (we don't deal with a non-existent asyncmsgq, and trying to use something fake causes sink input state to go out of whack and crash in other places).
One way to fix this problem, is to not have the echo-cancel sink-input refuse to move (which we do when the module is autoloaded). Instead, we allow the move to happen, and then when it is completed, if we were autoloaded, we unload ourselves.
This means that we'll never be in an inconsistent state with respect to the existence of an underlying sink, which solves the problem at source. On the other hand, this feels a bit ugly as a solution.
I'll hack this up, but I'd like to hear thoughts on this approach if there are any.
Pushed a fix for this now, the way I described in comment #12.
Can you tell me which version this fix will be in? Thanks!
The fix is in 6.0.