Bug 93443

Summary: [abrt] pulseaudio: pa_sink_input_set_requested_latency_within_thread(): pulseaudio killed by SIGSEGV
Product: PulseAudio
Reporter: Raman Gupta <rocketraman>
Component: daemon
Assignee: pulseaudio-bugs
Status: RESOLVED FIXED
QA Contact: pulseaudio-bugs
Severity: normal
Priority: medium
CC: lennart, rdieter
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
Bug Depends on:
Bug Blocks: 93823
Attachments: default.pa
Output of pactl list

Description Raman Gupta 2015-12-18 22:02:33 UTC
Changing the card profile via "set-card-profile" causes a crash in the PulseAudio daemon. It does not happen every time, but I can generally repeat it by rebooting and then attempting the set-card-profile again.

Maybe it is related to the changes required to fix Bug #90416?

Here is the Fedora bug report that has all the abrt attachments: https://bugzilla.redhat.com/show_bug.cgi?id=1291954.

Fedora retrace: https://retrace.fedoraproject.org/faf/reports/933887/

Version-Release number of selected component:
pulseaudio-7.1-1.fc23

Additional info:
reporter:       libreport-2.6.3
backtrace_rating: 4
cmdline:        /usr/bin/pulseaudio --start --log-target=syslog
crash_function: pa_sink_input_set_requested_latency_within_thread
executable:     /usr/bin/pulseaudio
global_pid:     30944
kernel:         4.2.6-301.fc23.x86_64
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (7 frames)
 #0 pa_sink_input_set_requested_latency_within_thread at pulsecore/sink-input.c:1172
 #1 pa_sink_invalidate_requested_latency at pulsecore/sink.c:3089
 #2 pa_sink_process_msg at pulsecore/sink.c:2621
 #3 asyncmsgq_read_work at pulsecore/rtpoll.c:566
 #4 pa_rtpoll_run at pulsecore/rtpoll.c:236
 #5 thread_func at modules/alsa/alsa-sink.c:1798
 #6 internal_thread_func at pulsecore/thread-posix.c:81
Comment 1 Tanu Kaskinen 2015-12-19 01:55:24 UTC
You don't seem to be using the standard configuration. Could you attach your default.pa? And could you also attach "pactl list" output, so I see what your hardware configuration is? I'll try to reproduce once I have that information.
Comment 2 Tanu Kaskinen 2015-12-19 02:00:57 UTC
Also, there was at least one stream playing when the crash happened - do you remember what application that might have been?
Comment 3 Raman Gupta 2015-12-19 06:11:26 UTC
Created attachment 120591
default.pa

Attached is my default.pa file. The only change from the Fedora rpm version is that I disabled "module-switch-on-port-available".
Comment 4 Raman Gupta 2015-12-19 06:12:51 UTC
Created attachment 120592
Output of pactl list

Output of "pactl list" attached.
Comment 5 Raman Gupta 2015-12-19 06:20:29 UTC
There were two output streams open:

1) An HTML5 YouTube video in Chrome (https://gopro.com/channel/video-of-the-day/frozen-kitten-lives).

2) An echo-cancelled output stream opened by Zoiper (open but not active).

One input stream open:

1) A microphone stream opened by Zoiper.

The problem does not happen all the time. It seems to only happen the first time I try it after restarting the computer. Every time after that (so far) it has been fine. Strange!
Comment 6 Tanu Kaskinen 2015-12-19 08:16:18 UTC
Oh, I assumed you had added module-device-manager to default.pa, since it's not there by default and you had it loaded, but I forgot that it's loaded by /usr/bin/start-pulseaudio-x11 when you start a KDE session. The reason why you see the crashes only after booting is that after pulseaudio has crashed and restarted, module-device-manager isn't loaded any more, and module-device-manager is somehow involved in this (I don't know yet if the root cause is in that module, though).

I'm able to reproduce a crash. It's a different assertion, however: E: [alsa-sink-CX20561 Analog] sink.h: Assertion 'pa_object_refcnt(pa_object_cast(o)) > 0' failed at ./pulsecore/sink.h:308, function pa_sink_assert_ref(). Aborting.

I don't use KDE and I use pacat/parec to simulate Zoiper. These are the steps that I use to reproduce the crash (I don't know if all of them are necessary):

1) pactl unload-module module-switch-on-port-available

2) pactl load-module module-device-manager do_routing=true

3) pactl set-card-profile alsa_card.pci-0000_00_1b.0 output:analog-surround-40+input:analog-stereo

4) PULSE_PROP="filter.want=echo-cancel media.role=phone" pacat /dev/zero (leave running)

5) PULSE_PROP="filter.want=echo-cancel media.role=phone" parec > /dev/null (leave running)

6) pactl set-card-profile alsa_card.pci-0000_00_1b.0 output:analog-stereo+input:analog-stereo
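
The six steps above can be collected into a small script. This is only a sketch in Python: the function names `build_steps` and `execute` are made up for illustration, the card name is the one from this report and will differ on other machines, and the script defaults to a dry run that just prints the commands.

```python
import os
import shlex
import subprocess

# Card name taken from the report; it differs per machine (see "pactl list cards").
CARD = "alsa_card.pci-0000_00_1b.0"
FILTER_PROPS = "filter.want=echo-cancel media.role=phone"


def build_steps(card=CARD):
    """Return the six reproduction steps as (kind, argv, extra_env) tuples.

    kind is "run" for commands that exit on their own and "background"
    for the pacat/parec streams that are left running.
    """
    env = {"PULSE_PROP": FILTER_PROPS}
    return [
        ("run", ["pactl", "unload-module", "module-switch-on-port-available"], {}),
        ("run", ["pactl", "load-module", "module-device-manager", "do_routing=true"], {}),
        ("run", ["pactl", "set-card-profile", card,
                 "output:analog-surround-40+input:analog-stereo"], {}),
        ("background", ["pacat", "/dev/zero"], env),
        ("background", ["parec"], env),
        ("run", ["pactl", "set-card-profile", card,
                 "output:analog-stereo+input:analog-stereo"], {}),
    ]


def execute(steps, dry_run=True):
    """Print each step; only talk to the sound server when dry_run is False.

    Returns the list of background processes so the caller can stop them.
    """
    children = []
    for kind, argv, extra_env in steps:
        prefix = "".join(f"{k}={shlex.quote(v)} " for k, v in extra_env.items())
        print(prefix + shlex.join(argv))
        if dry_run:
            continue
        env = {**os.environ, **extra_env}
        if kind == "background":
            # Output is discarded, mirroring "parec > /dev/null".
            children.append(
                subprocess.Popen(argv, env=env, stdout=subprocess.DEVNULL))
        else:
            subprocess.run(argv, env=env, check=False)
    return children


if __name__ == "__main__":
    execute(build_steps())  # dry run by default
```

When run for real (`dry_run=False`), a short pause before the last step may be needed so that both streams are connected before the profile switch; that timing is a guess, not something established in this report.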

I'll continue the investigation.
Comment 7 Tanu Kaskinen 2015-12-19 08:20:07 UTC
Oh, one more note if someone else wants to try to reproduce: it's probably important to have only one sink in the system, because the backtrace shows module-null-sink getting loaded when switching the alsa card profile (the profile switch causes the surround sink to get unloaded, at which point module-always-sink loads the null sink).
Comment 8 Tanu Kaskinen 2015-12-19 11:06:22 UTC
Ok, the problem is that once the old alsa sink is gone and the new alsa sink isn't yet created, the echo-cancel sink isn't connected anywhere. module-device-manager changes the routing of the pacat stream (which is connected to the echo-cancel sink) during the profile change. I guess it tries to move pacat from the echo-cancel sink to the null sink. That doesn't seem very smart, but I thought that in principle we should support that. However, it turned out that making it work is pretty much impossible with the current architecture.

So, our architecture doesn't support rerouting streams on top of virtual sinks that themselves are being moved. There doesn't seem to be any easy way to change that. I'll try a different approach next: rerouting pacat in this case is not useful, so there should be some way to prevent that from happening.
Comment 9 Raman Gupta 2015-12-19 16:44:41 UTC
Awesome, thank you for the investigation and the findings! I find PulseAudio's ability to switch card profiles at runtime awesome -- I use it to switch audio between speakers and headphones all the time. I can see why it is difficult to get right, though. Let me know if I can help in any way.
Comment 10 Tanu Kaskinen 2015-12-20 10:28:07 UTC
I sent a patch to the mailing list. You can download it here if you want to try it out: https://patchwork.freedesktop.org/patch/68741/mbox/

The patch gets rid of the crash, but the profile change has the side effect of unloading the echo-cancel module and killing the Zoiper playback stream. If that's a problem for you, please file a new bug. I don't promise to work on that bug, though, due to other tasks that I consider higher priority.
Comment 11 Raman Gupta 2015-12-21 15:41:43 UTC
The unloading is problematic because neither Zoiper, Skype, nor Chrome deals well with a stream being killed out from under it. When Pulse crashes, both Zoiper and Skype have to be "kill -9"ed and restarted. Chrome deals with it slightly better and does not have to be killed, but it still loses access to the mic until it is restarted.

I suspect the behavior of those applications will be the same if pulse does not crash but their streams are killed. This effectively means the filter is unusable.

From my perspective, a crash is actually preferable because then at least everything works from that point forward.
Comment 12 Tanu Kaskinen 2015-12-21 21:46:09 UTC
I believe it's pretty easy to make the streams not die, and I plan to write a patch for that, but avoiding the loss of echo cancelling is probably more tricky.
Comment 13 Tanu Kaskinen 2016-01-14 07:33:37 UTC
My patch has a side effect of making all streams connected to any filter sink die when the sink is unloaded, so the patch is not suitable for applying in its current form. It looks like a better patch won't appear before 8.0, so I'm removing the release blocker status from this bug.

I'm still working on a different fix (and I'll also try to improve my previous patch, because the idea is fine in principle). AFAIK Arun is also working on a different fix.
Comment 14 Tanu Kaskinen 2016-01-22 15:11:28 UTC
Marking as 9.0 release blocker.
Comment 15 Tanu Kaskinen 2016-03-22 13:49:52 UTC
I submitted a new fix:

1/7: https://patchwork.freedesktop.org/patch/77861/
2/7: https://patchwork.freedesktop.org/patch/77862/
3/7: https://patchwork.freedesktop.org/patch/77863/
4/7: https://patchwork.freedesktop.org/patch/77864/
5/7: https://patchwork.freedesktop.org/patch/77865/
6/7: https://patchwork.freedesktop.org/patch/77866/
7/7: https://patchwork.freedesktop.org/patch/77867/

If someone wants to test the patches, it's sufficient to take just the first three patches. The last four patches don't have any significant effect on behaviour.

The earlier patch caused the Zoiper stream to get killed; these patches shouldn't do that.
Comment 16 Raman Gupta 2016-04-09 17:34:39 UTC
(In reply to Tanu Kaskinen from comment #15)
> I submitted a new fix:
> 
> 1/7: https://patchwork.freedesktop.org/patch/77861/
> 2/7: https://patchwork.freedesktop.org/patch/77862/
> 3/7: https://patchwork.freedesktop.org/patch/77863/
> 4/7: https://patchwork.freedesktop.org/patch/77864/
> 5/7: https://patchwork.freedesktop.org/patch/77865/
> 6/7: https://patchwork.freedesktop.org/patch/77866/
> 7/7: https://patchwork.freedesktop.org/patch/77867/
> 
> If someone wants to test the patches, it's sufficient to take just the first
> three patches. The last four patches don't have any significant effect on
> behaviour.
> 
> The earlier patch caused the Zoiper stream to get killed; these patches
> shouldn't do that.

I tested the first three patches on PulseAudio 7.1 (Fedora 23), and they work! I configured Zoiper to use the echo canceling module via PULSE_PROP, and with the first three patches above, the Zoiper audio stream continues perfectly when changing the card profile. I don't notice any negative effects of applying the patch.
Comment 17 Tanu Kaskinen 2016-04-25 13:02:03 UTC
The patches were applied today, but soon after Arun realized that the patches prevent module-device-manager from doing legitimate stream moves. For example, a voip call might be ongoing with integrated speakers and mic, and then the user plugs in a USB headset. If the headset has higher priority in module-device-manager's priority list, my patches prevent m-d-m from moving the voip streams.

I'll try to refine the logic in m-d-m to fix that problem sometime soonish. After discussing with Arun, I think the logic should be amended so that in addition to checking whether the module-filter-apply.filter_device property is set on a stream, m-d-m should also check if the move target is the master device of the filter device. If it's not, the move should be allowed.
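
The amended check described above could look roughly like the following. This is a toy model in Python, not module-device-manager code: the `Sink` and `Stream` classes and the `move_allowed` function are hypothetical, and the sketch assumes that a stream carrying the module-filter-apply.filter_device property is currently connected to that filter device.

```python
from dataclasses import dataclass, field
from typing import Optional

# Property name as given in the comment above.
FILTER_DEVICE_PROP = "module-filter-apply.filter_device"


@dataclass
class Sink:
    name: str
    # Set only for filter (virtual) sinks: the device the filter sits on.
    master: Optional["Sink"] = None


@dataclass
class Stream:
    sink: Sink                      # sink the stream is currently connected to
    props: dict = field(default_factory=dict)


def move_allowed(stream: Stream, target: Sink) -> bool:
    """Decide whether the routing logic may move `stream` to `target`.

    Streams without the filter property are always movable. For filtered
    streams, only a move onto the filter's own master device is refused.
    """
    if FILTER_DEVICE_PROP not in stream.props:
        return True
    # Assumption: the stream's current sink is the filter device.
    return target is not stream.sink.master


# The USB headset scenario: the voip stream sits on an echo-cancel sink
# whose master is the integrated sound card.
integrated = Sink("integrated")
headset = Sink("usb-headset")
echo_cancel = Sink("echo-cancel", master=integrated)
voip = Stream(echo_cancel, {FILTER_DEVICE_PROP: echo_cancel.name})
```

With these objects, `move_allowed(voip, headset)` is true (the legitimate move to the newly plugged headset) and `move_allowed(voip, integrated)` is false (the move that led to the crash).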

I'm currently working on bug 93259, though. If someone else wants to take this bug in the meantime, please leave a note here.
Comment 18 Arun Raghavan 2016-04-25 15:00:37 UTC
Is there any case (autoloaded filters or otherwise) where m-d-m should be trying to move a stream from a filter to a parent of that filter?
Comment 19 Tanu Kaskinen 2016-04-25 15:06:49 UTC
If the filter is manually loaded, the user may configure the filter sink as the highest-priority sink. If the user then later configures the master sink as the highest-priority sink, module-device-manager should move the application stream from the filter sink to the master sink.
Comment 20 Arun Raghavan 2016-05-06 05:53:28 UTC
Okay, so trying to specify exactly the behaviour we want: for autoloaded filters, we want module-device-manager to not try to move streams within the same filter hierarchy.

Does that sound comprehensive? If it does, I'll write this logic.
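
One way to read the rule proposed above, including the manually loaded case from comment 19, is sketched below in Python. Everything here is an illustrative assumption rather than PulseAudio code: the `Sink` class, its `autoloaded` flag, and the helper names are invented, and "same filter hierarchy" is interpreted as "the chain of master devices ends at the same sink".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Sink:
    name: str
    master: Optional["Sink"] = None   # None for hardware (non-filter) sinks
    autoloaded: bool = False          # hypothetical: filter loaded on a stream's behalf


def hierarchy_root(sink: Sink) -> Sink:
    """Follow the master chain down to the sink at the bottom of the stack."""
    while sink.master is not None:
        sink = sink.master
    return sink


def move_allowed(current: Sink, target: Sink) -> bool:
    """May a stream on `current` be moved to `target` by the routing logic?

    Streams on manually loaded filters (or plain sinks) follow the user's
    priority list without restriction. Streams on autoloaded filters are
    never moved to another sink in the same filter hierarchy.
    """
    if not current.autoloaded:
        return True
    return hierarchy_root(current) is not hierarchy_root(target)


alsa = Sink("alsa")
usb = Sink("usb-headset")
auto_ec = Sink("echo-cancel (autoloaded)", master=alsa, autoloaded=True)
manual_ec = Sink("echo-cancel (manual)", master=alsa)
```

Here `move_allowed(auto_ec, alsa)` is false, `move_allowed(auto_ec, usb)` is true, and `move_allowed(manual_ec, alsa)` is true, which matches the three cases discussed in comments 17 to 20.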
Comment 21 Tanu Kaskinen 2016-05-06 08:08:44 UTC
Yes, that sounds good.
Comment 22 Arun Raghavan 2016-05-06 08:46:51 UTC
I've sent a patch to fix the last of the problems mentioned here at:

  https://patchwork.freedesktop.org/series/6811/

It would be great if someone who uses KDE and/or module-device-manager could test things to make sure everything continues to work as expected.
Comment 23 Arun Raghavan 2016-05-07 08:00:52 UTC
I've pushed these changes out now. Request for testing still holds.
Comment 24 Raman Gupta 2017-10-17 05:36:39 UTC
I think this was working fine for quite some time, but it now seems to be broken again -- I believe it has been broken for a while (perhaps since PA 9?) but I haven't had time to report it as an issue. I am currently on PA 10.0 on Fedora 26.

Basically, PA no longer crashes when changing the card profile, however many applications are not able to deal with the profile change. Some examples:

1) Chrome can play audio after the profile change, but can no longer record anything until it is restarted.

2) If any audio-related work is done, Zoiper freezes and must be killed before it will work again.
Comment 25 Georg Chini 2017-10-18 06:27:11 UTC
Have you tried PA 11.1? The bug should be fixed in that version.
Comment 26 Raman Gupta 2017-10-23 05:49:33 UTC
(In reply to Georg Chini from comment #25)
> Have you tried PA 11.1? The bug should be fixed in that version.

I tried it with PA 11.1 on Fedora 26 (pulseaudio-11.1-2.fc26.src.rpm). It does seem to work better, but interestingly, the first time I switched from headset to speakers, PA segfaulted:

[33530.878079] pulseaudio[3313]: segfault at 80 ip 00007fab81b96073 sp 00007ffed422e600 error 4 in module-echo-cancel.so[7fab81b91000+12000]
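
The kernel log line above can be decoded a little further. The sketch below, in Python with a made-up helper name `parse_segfault`, extracts the fields and decodes the x86 page-fault error code bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The computed offset is relative to the start of the mapping named in the log; treating it as an offset into module-echo-cancel.so itself assumes that mapping starts at file offset 0, which is not confirmed here.

```python
import re

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<error>\d+) "
    r"in (?P<object>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract the interesting fields from a kernel 'segfault at' message."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    error = int(m["error"])
    return {
        "object": m["object"],
        "fault_address": int(m["addr"], 16),
        # Instruction pointer relative to the start of the mapping.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(error & 1),
        "write": bool(error & 2),
        "user_mode": bool(error & 4),
    }


LINE = ("[33530.878079] pulseaudio[3313]: segfault at 80 ip 00007fab81b96073 "
        "sp 00007ffed422e600 error 4 in module-echo-cancel.so[7fab81b91000+12000]")

info = parse_segfault(LINE)
print(info["object"], hex(info["offset"]), hex(info["fault_address"]))
```

For this line the fault is a user-mode read of an unmapped page at address 0x80, which is the pattern typically produced by reading a struct member through a NULL pointer. A backtrace with debug symbols is still needed to identify the actual cause.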

However, for some reason, this only happened the first time -- subsequent times the switch happened correctly.

I have an abrt crash capture if it could help.
Comment 27 Raman Gupta 2017-10-23 07:27:11 UTC
(In reply to Raman Gupta from comment #26)
> However, for some reason, this only happened the first time -- subsequent
> times the switch happened correctly.

I can reproduce the crash reliably by restarting Chrome (Stable 62, 64-bit), playing a YouTube video, and then switching profiles from headset to speakers with this command:

pactl set-card-profile alsa_card.pci-0000_00_1b.0 output:analog-surround-40+input:analog-stereo

At this point Chrome continues to output audio correctly, but loses access to the mic. Subsequent switches between headset and speakers work fine, but Chrome's access to the mic remains broken.

Once Chrome is restarted mic access is restored, but the crash can easily be reproduced again.

Another interesting thing is that a switch in the opposite direction -- from speakers to headset -- does not cause a crash.
Comment 28 Tanu Kaskinen 2017-10-24 13:16:13 UTC
Can you file a new bug for the new crash?

Information that I'd like to see attached: backtrace (maybe abrt provides that) and "pactl list" output just before the set-card-profile command.
Comment 29 Raman Gupta 2017-10-31 17:19:34 UTC
New bug filed here: https://bugs.freedesktop.org/show_bug.cgi?id=103528

Reclosing this one.
