Bug 54779 - Live lock on daemon startup when module combine is used and real time scheduling is enabled
Summary: Live lock on daemon startup when module combine is used and real time schedul...
Status: RESOLVED FIXED
Alias: None
Product: PulseAudio
Classification: Unclassified
Component: modules (show other bugs)
Version: unspecified
Hardware: Other All
: medium major
Assignee: pulseaudio-bugs
QA Contact: pulseaudio-bugs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-11 15:06 UTC by Tvrtko Ursulin
Modified: 2012-10-30 14:42 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
strace of the whole PA server when live locking (24.62 KB, application/x-bzip2)
2012-09-12 09:42 UTC, Tvrtko Ursulin
Details
Workaround for avoiding the live lock (573 bytes, patch)
2012-09-12 10:41 UTC, Tvrtko Ursulin
Details | Splinter Review
Fix candidate (2.21 KB, patch)
2012-09-20 06:45 UTC, Tanu Kaskinen
Details | Splinter Review

Description Tvrtko Ursulin 2012-09-11 15:06:32 UTC
Reproduced with version 0.9.23 and 1.1.

I think it is easier to trigger this on slow machines. I get it 100% of time on a single core Celeron G440, while with a i3 CPU on the same motherboard it happens only occasionally.

Setting "realtime-scheduling" to "no" in /etc/pulse/daemon.conf makes it work.

daemon.conf:

allow-module-loading = no
allow-exit = no
high-priority = yes
nice-level = -11
realtime-scheduling = no
rlimit-rttime = 10000000
default-sample-format = s16le
default-sample-rate = 48000
default-sample-channels = 2
default-channel-map = front-left,front-right
default-fragment-size-msec = 10
resample-method = speex-float-0

default.pa:

load-module module-alsa-sink device=hw:0,7
load-module module-alsa-sink device=hw:0,3
load-module module-alsa-sink device=hw:0,1
load-module module-alsa-sink device=hw:0,0
load-module module-combine slaves=alsa_output.hw_0_7,alsa_output.hw_0_3,alsa_output.hw_0_1,alsa_output.hw_0_0

set-default-sink combined

load-module module-native-protocol-unix
load-module module-rescue-streams
load-module module-always-sink
load-module module-intended-roles
load-module module-suspend-on-idle
load-module module-console-kit

Happens also with only one slave.

Core dump while in live lock state attached.

Threads call traces look like this:

gdb) where
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007ffff74f4e48 in pa_semaphore_wait (s=0x625e10) at pulsecore/semaphore-posix.c:63
#2  0x00007ffff7b7216a in pa_asyncmsgq_send (a=0x623ad0, object=<optimized out>, code=<optimized out>, userdata=<optimized out>, offset=<optimized out>, chunk=<optimized out>) at pulsecore/asyncmsgq.c:169
#3  0x00007ffff7ba6ea7 in pa_sink_get_latency_range (s=0x637630, min_latency=0x7fffffffd830, max_latency=0x7fffffffd838) at pulsecore/sink.c:2399
#4  0x00007ffff7b9c2cd in pa_sink_input_set_requested_latency (i=0x65cab0, usec=200000) at pulsecore/sink-input.c:932
#5  0x00007fffe5cdf3ee in output_create_sink_input (o=0x6524f0) at modules/module-combine.c:883
#6  0x00007fffe5ce04f9 in output_enable (o=0x6524f0) at modules/module-combine.c:961
#7  output_enable (o=0x6524f0) at modules/module-combine.c:949
#8  0x00007fffe5ce28a0 in module_combine_LTX_pa__init (m=0x64b500) at modules/module-combine.c:1347
#9  0x00007ffff7b87068 in pa_module_load (c=0x6192f0, name=0x61a0d0 "module-combine", argument=0x64b4a0 "slaves=alsa_output.hw_0_7,alsa_output.hw_0_3,alsa_output.hw_0_1,alsa_output.hw_0_0") at pulsecore/module.c:112
#10 0x00007ffff7b7971d in pa_cli_command_load (c=0x6192f0, t=0x615f50, buf=0x6156e0, fail=<optimized out>) at pulsecore/cli-command.c:426
#11 0x00007ffff7b7af26 in pa_cli_command_execute_line_stateful (c=0x6192f0, s=<optimized out>, buf=0x6156e0, fail=0x613065, ifstate=<optimized out>) at pulsecore/cli-command.c:1744
#12 0x00007ffff7b7b269 in pa_cli_command_execute_file_stream (c=0x6192f0, f=0x620700, buf=0x6156e0, fail=0x613065) at pulsecore/cli-command.c:1784
#13 0x0000000000405b97 in main (argc=<optimized out>, argv=<optimized out>) at daemon/main.c:930
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fffe7ee9700 (LWP 30992))]
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
58                                     _NSIG / 8);
(gdb) where
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
#1  0x00007ffff7b8f749 in pa_rtpoll_run (p=0x6160a0, wait_op=true) at pulsecore/rtpoll.c:304
#2  0x00007fffe8408971 in thread_func (userdata=0x623640) at modules/alsa/alsa-sink.c:1470
#3  0x00007ffff74f4be8 in internal_thread_func (userdata=0x6373a0) at pulsecore/thread-posix.c:83
#4  0x00007ffff54e3d90 in start_thread (arg=0x7fffe7ee9700) at pthread_create.c:309
#5  0x00007ffff4b84f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) thread 3
[Switching to thread 3 (Thread 0x7fffe76e8700 (LWP 30993))]
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
58                                     _NSIG / 8);
(gdb) where
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
#1  0x00007ffff7b8f749 in pa_rtpoll_run (p=0x625b00, wait_op=true) at pulsecore/rtpoll.c:304
#2  0x00007fffe8408971 in thread_func (userdata=0x636bd0) at modules/alsa/alsa-sink.c:1470
#3  0x00007ffff74f4be8 in internal_thread_func (userdata=0x637e20) at pulsecore/thread-posix.c:83
#4  0x00007ffff54e3d90 in start_thread (arg=0x7fffe76e8700) at pthread_create.c:309
#5  0x00007ffff4b84f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) thread 4
[Switching to thread 4 (Thread 0x7fffe6ee7700 (LWP 30994))]
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
58                                     _NSIG / 8);
(gdb) where
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
#1  0x00007ffff7b8f749 in pa_rtpoll_run (p=0x63afb0, wait_op=true) at pulsecore/rtpoll.c:304
#2  0x00007fffe8408971 in thread_func (userdata=0x63e430) at modules/alsa/alsa-sink.c:1470
#3  0x00007ffff74f4be8 in internal_thread_func (userdata=0x63e9c0) at pulsecore/thread-posix.c:83
#4  0x00007ffff54e3d90 in start_thread (arg=0x7fffe6ee7700) at pthread_create.c:309
#5  0x00007ffff4b84f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) thread 5
[Switching to thread 5 (Thread 0x7fffe66e6700 (LWP 30995))]
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
58                                     _NSIG / 8);
(gdb) where
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
#1  0x00007ffff7b8f749 in pa_rtpoll_run (p=0x63e800, wait_op=true) at pulsecore/rtpoll.c:304
#2  0x00007fffe8408971 in thread_func (userdata=0x644c30) at modules/alsa/alsa-sink.c:1470
#3  0x00007ffff74f4be8 in internal_thread_func (userdata=0x64b870) at pulsecore/thread-posix.c:83
#4  0x00007ffff54e3d90 in start_thread (arg=0x7fffe66e6700) at pthread_create.c:309
#5  0x00007ffff4b84f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) thread 6
[Switching to thread 6 (Thread 0x7fffe5cdb700 (LWP 30996))]
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
58                                     _NSIG / 8);
(gdb) where
#0  0x00007ffff4b7c693 in ppoll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/ppoll.c:58
#1  0x00007ffff7b8f749 in pa_rtpoll_run (p=0x6445a0, wait_op=true) at pulsecore/rtpoll.c:304
#2  0x00007fffe5ce070e in thread_func (userdata=0x64b790) at modules/module-combine.c:333
#3  0x00007ffff74f4be8 in internal_thread_func (userdata=0x659f00) at pulsecore/thread-posix.c:83
#4  0x00007ffff54e3d90 in start_thread (arg=0x7fffe5cdb700) at pthread_create.c:309
#5  0x00007ffff4b84f5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Comment 1 Tvrtko Ursulin 2012-09-11 15:08:12 UTC
Coredump (compressed) at 27Mb is to big for Bugzilla limits.
Comment 2 Tvrtko Ursulin 2012-09-11 15:35:06 UTC
Forgot about debug log:

-bash-4.2$ pulseaudio -vvvvv --log-target=stderr
I: [pulseaudio] main.c: setrlimit(RLIMIT_NICE, (31, 31)) failed: Operation not permitted
D: [pulseaudio] core-rtclock.c: Timer slack is set to 50 us.
I: [pulseaudio] core-util.c: Failed to acquire high-priority scheduling: Permission denied
I: [pulseaudio] main.c: This is PulseAudio 1.1
D: [pulseaudio] main.c: Compilation host: x86_64-unknown-linux-gnu
D: [pulseaudio] main.c: Compilation CFLAGS: -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4  -m64 -mtune=generic -W -Wextra -Wno-long-long -Wvla -Wno-overlength-strings -Wunsafe-loop-optimizations -Wundef -Wformat=2 -Wlogical-op -Wsign-compare -Wformat-security -Wmissing-include-dirs -Wformat-nonliteral -Wpointer-arith -Winit-self -Wdeclaration-after-statement -Wfloat-equal -Wmissing-prototypes -Wredundant-decls -Wmissing-declarations -Wmissing-noreturn -Wshadow -Wendif-labels -Wcast-align -Wstrict-aliasing -Wwrite-strings -Wno-unused-parameter -ffast-math -fno-common -fdiagnostics-show-option
D: [pulseaudio] main.c: Running on host: Linux x86_64 3.5.0-92.fc16.x86_64 #1 SMP Wed Aug 22 11:54:04 BST 2012
D: [pulseaudio] main.c: Found 1 CPUs.
I: [pulseaudio] main.c: Page size is 4096 bytes
D: [pulseaudio] main.c: Compiled with Valgrind support: no
D: [pulseaudio] main.c: Running in valgrind mode: no
D: [pulseaudio] main.c: Running in VM: no
D: [pulseaudio] main.c: Optimized build: yes
D: [pulseaudio] main.c: All asserts enabled.
I: [pulseaudio] main.c: Machine ID is 5dcad513e4ad20b1145fc21c0000000b.
I: [pulseaudio] main.c: Using runtime directory /data/data/app_home/.pulse/5dcad513e4ad20b1145fc21c0000000b-runtime.
I: [pulseaudio] main.c: Using state directory /data/data/app_home/.pulse.
I: [pulseaudio] main.c: Using modules directory /usr/lib64/pulse-1.1/modules.
I: [pulseaudio] main.c: Running in system mode: no
I: [pulseaudio] main.c: Fresh high-resolution timers available! Bon appetit!
D: [pulseaudio] memblock.c: Using shared memory pool with 1024 slots of size 64.0 KiB each, total size is 64.0 MiB, maximum usable slot size is 65472
I: [pulseaudio] cpu-x86.c: CPU flags: CMOV MMX SSE SSE2 SSE3 SSSE3 SSE4_1 SSE4_2 
I: [pulseaudio] svolume_mmx.c: Initialising MMX optimized volume functions.
I: [pulseaudio] remap_mmx.c: Initialising MMX optimized remappers.
I: [pulseaudio] svolume_sse.c: Initialising SSE2 optimized volume functions.
I: [pulseaudio] remap_sse.c: Initialising SSE2 optimized remappers.
I: [pulseaudio] sconv_sse.c: Initialising SSE2 optimized conversions.
D: [pulseaudio] alsa-util.c: Trying hw:0,7 with SND_PCM_NO_AUTO_FORMAT ...
D: [pulseaudio] alsa-util.c: Managed to open hw:0,7
I: [pulseaudio] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
D: [pulseaudio] alsa-util.c: Maximum hw buffer size is 21845 ms
D: [pulseaudio] alsa-util.c: Set buffer size first (to 96000 samples), period size second (to 96000 samples).
I: [pulseaudio] alsa-util.c: ALSA period wakeups disabled
I: [pulseaudio] alsa-sink.c: Successfully opened device hw:0,7.
I: [pulseaudio] alsa-sink.c: Successfully enabled mmap() mode.
I: [pulseaudio] alsa-sink.c: Successfully enabled timer-based scheduling mode.
I: [pulseaudio] sink.c: Created sink 0 "alsa_output.hw_0_7" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] sink.c:     alsa.resolution_bits = "16"
I: [pulseaudio] sink.c:     device.api = "alsa"
I: [pulseaudio] sink.c:     device.class = "sound"
I: [pulseaudio] sink.c:     alsa.class = "generic"
I: [pulseaudio] sink.c:     alsa.subclass = "generic-mix"
I: [pulseaudio] sink.c:     alsa.name = "HDMI 1"
I: [pulseaudio] sink.c:     alsa.id = "HDMI 1"
I: [pulseaudio] sink.c:     alsa.subdevice = "0"
I: [pulseaudio] sink.c:     alsa.subdevice_name = "subdevice #0"
I: [pulseaudio] sink.c:     alsa.device = "7"
I: [pulseaudio] sink.c:     alsa.card = "0"
I: [pulseaudio] sink.c:     alsa.card_name = "HDA Intel PCH"
I: [pulseaudio] sink.c:     alsa.long_card_name = "HDA Intel PCH at 0xfe520000 irq 47"
I: [pulseaudio] sink.c:     alsa.driver_name = "snd_hda_intel"
I: [pulseaudio] sink.c:     device.bus_path = "pci-0000:00:1b.0"
I: [pulseaudio] sink.c:     sysfs.path = "/devices/pci0000:00/0000:00:1b.0/sound/card0"
I: [pulseaudio] sink.c:     device.bus = "pci"
I: [pulseaudio] sink.c:     device.vendor.id = "8086"
I: [pulseaudio] sink.c:     device.vendor.name = "Intel Corporation"
I: [pulseaudio] sink.c:     device.product.id = "1c20"
I: [pulseaudio] sink.c:     device.product.name = "6 Series/C200 Series Chipset Family High Definition Audio Controller"
I: [pulseaudio] sink.c:     device.form_factor = "internal"
I: [pulseaudio] sink.c:     device.string = "hw:0,7"
I: [pulseaudio] sink.c:     device.buffering.buffer_size = "384000"
I: [pulseaudio] sink.c:     device.buffering.fragment_size = "192000"
I: [pulseaudio] sink.c:     device.access_mode = "mmap+timer"
I: [pulseaudio] sink.c:     device.description = "Internal Audio"
I: [pulseaudio] sink.c:     device.icon_name = "audio-card-pci"
I: [pulseaudio] source.c: Created source 0 "alsa_output.hw_0_7.monitor" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] source.c:     device.description = "Monitor of Internal Audio"
I: [pulseaudio] source.c:     device.class = "monitor"
I: [pulseaudio] source.c:     device.icon_name = "audio-input-microphone"
I: [pulseaudio] alsa-sink.c: Using 2.0 fragments of size 192000 bytes (1000.00ms), buffer size is 384000 bytes (2000.00ms)
I: [pulseaudio] alsa-sink.c: Time scheduling watermark is 20.00ms
D: [pulseaudio] alsa-sink.c: hwbuf_unused=0
D: [pulseaudio] alsa-sink.c: setting avail_min=95041
D: [pulseaudio] alsa-util.c: snd_pcm_dump():
D: [pulseaudio] alsa-util.c: Hardware PCM card 0 'HDA Intel PCH' device 7 subdevice 0
D: [pulseaudio] alsa-util.c: Its setup is:
D: [pulseaudio] alsa-util.c:   stream       : PLAYBACK
D: [pulseaudio] alsa-util.c:   access       : MMAP_INTERLEAVED
D: [pulseaudio] alsa-util.c:   format       : S16_LE
D: [pulseaudio] alsa-util.c:   subformat    : STD
D: [pulseaudio] alsa-util.c:   channels     : 2
D: [pulseaudio] alsa-util.c:   rate         : 48000
D: [pulseaudio] alsa-util.c:   exact rate   : 48000 (48000/1)
D: [pulseaudio] alsa-util.c:   msbits       : 16
D: [pulseaudio] alsa-util.c:   buffer_size  : 96000
D: [pulseaudio] alsa-util.c:   period_size  : 48000
D: [pulseaudio] alsa-util.c:   period_time  : 1000000
D: [pulseaudio] alsa-util.c:   tstamp_mode  : ENABLE
D: [pulseaudio] alsa-util.c:   period_step  : 1
D: [pulseaudio] alsa-util.c:   avail_min    : 95041
D: [pulseaudio] alsa-util.c:   period_event : 0
D: [pulseaudio] alsa-util.c:   start_threshold  : -1
D: [pulseaudio] alsa-util.c:   stop_threshold   : 6755399441055744000
D: [pulseaudio] alsa-util.c:   silence_threshold: 0
D: [pulseaudio] alsa-util.c:   silence_size : 0
D: [pulseaudio] alsa-util.c:   boundary     : 6755399441055744000
D: [pulseaudio] alsa-util.c:   appl_ptr     : 0
D: [pulseaudio] alsa-util.c:   hw_ptr       : 0
D: [alsa-sink] alsa-sink.c: Thread starting up
D: [alsa-sink] core-util.c: SCHED_RR|SCHED_RESET_ON_FORK worked.
I: [alsa-sink] core-util.c: Successfully enabled SCHED_RR scheduling for thread, with priority 5.
I: [alsa-sink] alsa-sink.c: Starting playback.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
I: [pulseaudio] module.c: Loaded "module-alsa-sink" (index: #0; argument: "device=hw:0,7").
D: [pulseaudio] alsa-util.c: Trying hw:0,3 with SND_PCM_NO_AUTO_FORMAT ...
D: [pulseaudio] alsa-util.c: Managed to open hw:0,3
I: [pulseaudio] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
D: [pulseaudio] alsa-util.c: Maximum hw buffer size is 21845 ms
D: [pulseaudio] alsa-util.c: Set buffer size first (to 96000 samples), period size second (to 96000 samples).
I: [pulseaudio] alsa-util.c: ALSA period wakeups disabled
I: [pulseaudio] alsa-sink.c: Successfully opened device hw:0,3.
I: [pulseaudio] alsa-sink.c: Successfully enabled mmap() mode.
I: [pulseaudio] alsa-sink.c: Successfully enabled timer-based scheduling mode.
I: [pulseaudio] sink.c: Created sink 1 "alsa_output.hw_0_3" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] sink.c:     alsa.resolution_bits = "16"
I: [pulseaudio] sink.c:     device.api = "alsa"
I: [pulseaudio] sink.c:     device.class = "sound"
I: [pulseaudio] sink.c:     alsa.class = "generic"
I: [pulseaudio] sink.c:     alsa.subclass = "generic-mix"
I: [pulseaudio] sink.c:     alsa.name = "HDMI 0"
I: [pulseaudio] sink.c:     alsa.id = "HDMI 0"
I: [pulseaudio] sink.c:     alsa.subdevice = "0"
I: [pulseaudio] sink.c:     alsa.subdevice_name = "subdevice #0"
I: [pulseaudio] sink.c:     alsa.device = "3"
I: [pulseaudio] sink.c:     alsa.card = "0"
I: [pulseaudio] sink.c:     alsa.card_name = "HDA Intel PCH"
I: [pulseaudio] sink.c:     alsa.long_card_name = "HDA Intel PCH at 0xfe520000 irq 47"
I: [pulseaudio] sink.c:     alsa.driver_name = "snd_hda_intel"
I: [pulseaudio] sink.c:     device.bus_path = "pci-0000:00:1b.0"
I: [pulseaudio] sink.c:     sysfs.path = "/devices/pci0000:00/0000:00:1b.0/sound/card0"
I: [pulseaudio] sink.c:     device.bus = "pci"
I: [pulseaudio] sink.c:     device.vendor.id = "8086"
I: [pulseaudio] sink.c:     device.vendor.name = "Intel Corporation"
I: [pulseaudio] sink.c:     device.product.id = "1c20"
I: [pulseaudio] sink.c:     device.product.name = "6 Series/C200 Series Chipset Family High Definition Audio Controller"
I: [pulseaudio] sink.c:     device.form_factor = "internal"
I: [pulseaudio] sink.c:     device.string = "hw:0,3"
I: [pulseaudio] sink.c:     device.buffering.buffer_size = "384000"
I: [pulseaudio] sink.c:     device.buffering.fragment_size = "192000"
I: [pulseaudio] sink.c:     device.access_mode = "mmap+timer"
I: [pulseaudio] sink.c:     device.description = "Internal Audio"
I: [pulseaudio] sink.c:     device.icon_name = "audio-card-pci"
I: [pulseaudio] source.c: Created source 1 "alsa_output.hw_0_3.monitor" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] source.c:     device.description = "Monitor of Internal Audio"
I: [pulseaudio] source.c:     device.class = "monitor"
I: [pulseaudio] source.c:     device.icon_name = "audio-input-microphone"
I: [pulseaudio] alsa-sink.c: Using 2.0 fragments of size 192000 bytes (1000.00ms), buffer size is 384000 bytes (2000.00ms)
I: [pulseaudio] alsa-sink.c: Time scheduling watermark is 20.00ms
D: [pulseaudio] alsa-sink.c: hwbuf_unused=0
D: [pulseaudio] alsa-sink.c: setting avail_min=95041
D: [pulseaudio] alsa-util.c: snd_pcm_dump():
D: [pulseaudio] alsa-util.c: Hardware PCM card 0 'HDA Intel PCH' device 3 subdevice 0
D: [pulseaudio] alsa-util.c: Its setup is:
D: [pulseaudio] alsa-util.c:   stream       : PLAYBACK
D: [pulseaudio] alsa-util.c:   access       : MMAP_INTERLEAVED
D: [pulseaudio] alsa-util.c:   format       : S16_LE
D: [pulseaudio] alsa-util.c:   subformat    : STD
D: [pulseaudio] alsa-util.c:   channels     : 2
D: [pulseaudio] alsa-util.c:   rate         : 48000
D: [pulseaudio] alsa-util.c:   exact rate   : 48000 (48000/1)
D: [pulseaudio] alsa-util.c:   msbits       : 16
D: [pulseaudio] alsa-util.c:   buffer_size  : 96000
D: [pulseaudio] alsa-util.c:   period_size  : 48000
D: [pulseaudio] alsa-util.c:   period_time  : 1000000
D: [pulseaudio] alsa-util.c:   tstamp_mode  : ENABLE
D: [pulseaudio] alsa-util.c:   period_step  : 1
D: [pulseaudio] alsa-util.c:   avail_min    : 95041
D: [pulseaudio] alsa-util.c:   period_event : 0
D: [pulseaudio] alsa-util.c:   start_threshold  : -1
D: [pulseaudio] alsa-util.c:   stop_threshold   : 6755399441055744000
D: [pulseaudio] alsa-util.c:   silence_threshold: 0
D: [pulseaudio] alsa-util.c:   silence_size : 0
D: [pulseaudio] alsa-util.c:   boundary     : 6755399441055744000
D: [pulseaudio] alsa-util.c:   appl_ptr     : 0
D: [pulseaudio] alsa-util.c:   hw_ptr       : 0
D: [alsa-sink] alsa-sink.c: Thread starting up
D: [alsa-sink] core-util.c: SCHED_RR|SCHED_RESET_ON_FORK worked.
I: [alsa-sink] core-util.c: Successfully enabled SCHED_RR scheduling for thread, with priority 5.
I: [alsa-sink] alsa-sink.c: Starting playback.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
I: [pulseaudio] module.c: Loaded "module-alsa-sink" (index: #1; argument: "device=hw:0,3").
D: [pulseaudio] alsa-util.c: Trying hw:0,1 with SND_PCM_NO_AUTO_FORMAT ...
D: [pulseaudio] alsa-util.c: Managed to open hw:0,1
I: [pulseaudio] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
D: [pulseaudio] alsa-util.c: Maximum hw buffer size is 21845 ms
D: [pulseaudio] alsa-util.c: Set buffer size first (to 96000 samples), period size second (to 96000 samples).
I: [pulseaudio] alsa-util.c: ALSA period wakeups disabled
I: [pulseaudio] alsa-sink.c: Successfully opened device hw:0,1.
I: [pulseaudio] alsa-sink.c: Successfully enabled mmap() mode.
I: [pulseaudio] alsa-sink.c: Successfully enabled timer-based scheduling mode.
I: [pulseaudio] sink.c: Created sink 2 "alsa_output.hw_0_1" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] sink.c:     alsa.resolution_bits = "16"
I: [pulseaudio] sink.c:     device.api = "alsa"
I: [pulseaudio] sink.c:     device.class = "sound"
I: [pulseaudio] sink.c:     alsa.class = "generic"
I: [pulseaudio] sink.c:     alsa.subclass = "generic-mix"
I: [pulseaudio] sink.c:     alsa.name = "ALC892 Digital"
I: [pulseaudio] sink.c:     alsa.id = "ALC892 Digital"
I: [pulseaudio] sink.c:     alsa.subdevice = "0"
I: [pulseaudio] sink.c:     alsa.subdevice_name = "subdevice #0"
I: [pulseaudio] sink.c:     alsa.device = "1"
I: [pulseaudio] sink.c:     alsa.card = "0"
I: [pulseaudio] sink.c:     alsa.card_name = "HDA Intel PCH"
I: [pulseaudio] sink.c:     alsa.long_card_name = "HDA Intel PCH at 0xfe520000 irq 47"
I: [pulseaudio] sink.c:     alsa.driver_name = "snd_hda_intel"
I: [pulseaudio] sink.c:     device.bus_path = "pci-0000:00:1b.0"
I: [pulseaudio] sink.c:     sysfs.path = "/devices/pci0000:00/0000:00:1b.0/sound/card0"
I: [pulseaudio] sink.c:     device.bus = "pci"
I: [pulseaudio] sink.c:     device.vendor.id = "8086"
I: [pulseaudio] sink.c:     device.vendor.name = "Intel Corporation"
I: [pulseaudio] sink.c:     device.product.id = "1c20"
I: [pulseaudio] sink.c:     device.product.name = "6 Series/C200 Series Chipset Family High Definition Audio Controller"
I: [pulseaudio] sink.c:     device.form_factor = "internal"
I: [pulseaudio] sink.c:     device.string = "hw:0,1"
I: [pulseaudio] sink.c:     device.buffering.buffer_size = "384000"
I: [pulseaudio] sink.c:     device.buffering.fragment_size = "192000"
I: [pulseaudio] sink.c:     device.access_mode = "mmap+timer"
I: [pulseaudio] sink.c:     device.description = "Internal Audio"
I: [pulseaudio] sink.c:     device.icon_name = "audio-card-pci"
I: [pulseaudio] source.c: Created source 2 "alsa_output.hw_0_1.monitor" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] source.c:     device.description = "Monitor of Internal Audio"
I: [pulseaudio] source.c:     device.class = "monitor"
I: [pulseaudio] source.c:     device.icon_name = "audio-input-microphone"
I: [pulseaudio] alsa-sink.c: Using 2.0 fragments of size 192000 bytes (1000.00ms), buffer size is 384000 bytes (2000.00ms)
I: [pulseaudio] alsa-sink.c: Time scheduling watermark is 20.00ms
D: [pulseaudio] alsa-sink.c: hwbuf_unused=0
D: [pulseaudio] alsa-sink.c: setting avail_min=95041
D: [pulseaudio] alsa-util.c: snd_pcm_dump():
D: [pulseaudio] alsa-util.c: Hardware PCM card 0 'HDA Intel PCH' device 1 subdevice 0
D: [pulseaudio] alsa-util.c: Its setup is:
D: [pulseaudio] alsa-util.c:   stream       : PLAYBACK
D: [pulseaudio] alsa-util.c:   access       : MMAP_INTERLEAVED
D: [pulseaudio] alsa-util.c:   format       : S16_LE
D: [pulseaudio] alsa-util.c:   subformat    : STD
D: [pulseaudio] alsa-util.c:   channels     : 2
D: [pulseaudio] alsa-util.c:   rate         : 48000
D: [pulseaudio] alsa-util.c:   exact rate   : 48000 (48000/1)
D: [pulseaudio] alsa-util.c:   msbits       : 16
D: [pulseaudio] alsa-util.c:   buffer_size  : 96000
D: [pulseaudio] alsa-util.c:   period_size  : 48000
D: [pulseaudio] alsa-util.c:   period_time  : 1000000
D: [pulseaudio] alsa-util.c:   tstamp_mode  : ENABLE
D: [pulseaudio] alsa-util.c:   period_step  : 1
D: [pulseaudio] alsa-util.c:   avail_min    : 95041
D: [pulseaudio] alsa-util.c:   period_event : 0
D: [pulseaudio] alsa-util.c:   start_threshold  : -1
D: [pulseaudio] alsa-util.c:   stop_threshold   : 6755399441055744000
D: [pulseaudio] alsa-util.c:   silence_threshold: 0
D: [pulseaudio] alsa-util.c:   silence_size : 0
D: [pulseaudio] alsa-util.c:   boundary     : 6755399441055744000
D: [pulseaudio] alsa-util.c:   appl_ptr     : 0
D: [pulseaudio] alsa-util.c:   hw_ptr       : 0
D: [alsa-sink] alsa-sink.c: Thread starting up
D: [alsa-sink] core-util.c: SCHED_RR|SCHED_RESET_ON_FORK worked.
I: [alsa-sink] core-util.c: Successfully enabled SCHED_RR scheduling for thread, with priority 5.
I: [alsa-sink] alsa-sink.c: Starting playback.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
I: [pulseaudio] module.c: Loaded "module-alsa-sink" (index: #2; argument: "device=hw:0,1").
D: [pulseaudio] alsa-util.c: Trying hw:0,0 with SND_PCM_NO_AUTO_FORMAT ...
D: [pulseaudio] alsa-util.c: Managed to open hw:0,0
I: [pulseaudio] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
D: [pulseaudio] alsa-util.c: Maximum hw buffer size is 21845 ms
D: [pulseaudio] alsa-util.c: Set buffer size first (to 96000 samples), period size second (to 96000 samples).
I: [pulseaudio] alsa-util.c: ALSA period wakeups disabled
I: [pulseaudio] alsa-sink.c: Successfully opened device hw:0,0.
I: [pulseaudio] alsa-sink.c: Successfully enabled mmap() mode.
I: [pulseaudio] alsa-sink.c: Successfully enabled timer-based scheduling mode.
I: [pulseaudio] sink.c: Created sink 3 "alsa_output.hw_0_0" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] sink.c:     alsa.resolution_bits = "16"
I: [pulseaudio] sink.c:     device.api = "alsa"
I: [pulseaudio] sink.c:     device.class = "sound"
I: [pulseaudio] sink.c:     alsa.class = "generic"
I: [pulseaudio] sink.c:     alsa.subclass = "generic-mix"
I: [pulseaudio] sink.c:     alsa.name = "ALC892 Analog"
I: [pulseaudio] sink.c:     alsa.id = "ALC892 Analog"
I: [pulseaudio] sink.c:     alsa.subdevice = "0"
I: [pulseaudio] sink.c:     alsa.subdevice_name = "subdevice #0"
I: [pulseaudio] sink.c:     alsa.device = "0"
I: [pulseaudio] sink.c:     alsa.card = "0"
I: [pulseaudio] sink.c:     alsa.card_name = "HDA Intel PCH"
I: [pulseaudio] sink.c:     alsa.long_card_name = "HDA Intel PCH at 0xfe520000 irq 47"
I: [pulseaudio] sink.c:     alsa.driver_name = "snd_hda_intel"
I: [pulseaudio] sink.c:     device.bus_path = "pci-0000:00:1b.0"
I: [pulseaudio] sink.c:     sysfs.path = "/devices/pci0000:00/0000:00:1b.0/sound/card0"
I: [pulseaudio] sink.c:     device.bus = "pci"
I: [pulseaudio] sink.c:     device.vendor.id = "8086"
I: [pulseaudio] sink.c:     device.vendor.name = "Intel Corporation"
I: [pulseaudio] sink.c:     device.product.id = "1c20"
I: [pulseaudio] sink.c:     device.product.name = "6 Series/C200 Series Chipset Family High Definition Audio Controller"
I: [pulseaudio] sink.c:     device.form_factor = "internal"
I: [pulseaudio] sink.c:     device.string = "hw:0,0"
I: [pulseaudio] sink.c:     device.buffering.buffer_size = "384000"
I: [pulseaudio] sink.c:     device.buffering.fragment_size = "192000"
I: [pulseaudio] sink.c:     device.access_mode = "mmap+timer"
I: [pulseaudio] sink.c:     device.description = "Internal Audio"
I: [pulseaudio] sink.c:     device.icon_name = "audio-card-pci"
I: [pulseaudio] source.c: Created source 3 "alsa_output.hw_0_0.monitor" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] source.c:     device.description = "Monitor of Internal Audio"
I: [pulseaudio] source.c:     device.class = "monitor"
I: [pulseaudio] source.c:     device.icon_name = "audio-input-microphone"
I: [pulseaudio] alsa-sink.c: Using 2.0 fragments of size 192000 bytes (1000.00ms), buffer size is 384000 bytes (2000.00ms)
I: [pulseaudio] alsa-sink.c: Time scheduling watermark is 20.00ms
D: [pulseaudio] alsa-sink.c: hwbuf_unused=0
D: [pulseaudio] alsa-sink.c: setting avail_min=95041
D: [pulseaudio] alsa-util.c: snd_pcm_dump():
D: [pulseaudio] alsa-util.c: Hardware PCM card 0 'HDA Intel PCH' device 0 subdevice 0
D: [pulseaudio] alsa-util.c: Its setup is:
D: [pulseaudio] alsa-util.c:   stream       : PLAYBACK
D: [pulseaudio] alsa-util.c:   access       : MMAP_INTERLEAVED
D: [pulseaudio] alsa-util.c:   format       : S16_LE
D: [pulseaudio] alsa-util.c:   subformat    : STD
D: [pulseaudio] alsa-util.c:   channels     : 2
D: [pulseaudio] alsa-util.c:   rate         : 48000
D: [pulseaudio] alsa-util.c:   exact rate   : 48000 (48000/1)
D: [pulseaudio] alsa-util.c:   msbits       : 16
D: [pulseaudio] alsa-util.c:   buffer_size  : 96000
D: [pulseaudio] alsa-util.c:   period_size  : 48000
D: [pulseaudio] alsa-util.c:   period_time  : 1000000
D: [pulseaudio] alsa-util.c:   tstamp_mode  : ENABLE
D: [pulseaudio] alsa-util.c:   period_step  : 1
D: [pulseaudio] alsa-util.c:   avail_min    : 95041
D: [pulseaudio] alsa-util.c:   period_event : 0
D: [pulseaudio] alsa-util.c:   start_threshold  : -1
D: [pulseaudio] alsa-util.c:   stop_threshold   : 6755399441055744000
D: [pulseaudio] alsa-util.c:   silence_threshold: 0
D: [pulseaudio] alsa-util.c:   silence_size : 0
D: [pulseaudio] alsa-util.c:   boundary     : 6755399441055744000
D: [pulseaudio] alsa-util.c:   appl_ptr     : 0
D: [pulseaudio] alsa-util.c:   hw_ptr       : 0
D: [alsa-sink] alsa-sink.c: Thread starting up
D: [alsa-sink] core-util.c: SCHED_RR|SCHED_RESET_ON_FORK worked.
I: [alsa-sink] core-util.c: Successfully enabled SCHED_RR scheduling for thread, with priority 5.
I: [alsa-sink] alsa-sink.c: Starting playback.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
D: [alsa-sink] alsa-sink.c: Cutting sleep time for the initial iterations by half.
I: [pulseaudio] module.c: Loaded "module-alsa-sink" (index: #3; argument: "device=hw:0,0").
W: [pulseaudio] module.c: module-combine is deprecated: Please use module-combine-sink instead of module-combine!
W: [pulseaudio] module-combine.c: We will now load module-combine-sink. Please make sure to remove module-combine from your configuration.
I: [pulseaudio] sink.c: Created sink 4 "combined" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] sink.c:     device.class = "filter"
I: [pulseaudio] sink.c:     combine.slaves = "alsa_output.hw_0_7,alsa_output.hw_0_3,alsa_output.hw_0_1,alsa_output.hw_0_0"
I: [pulseaudio] sink.c:     device.description = "Simultaneous Output"
I: [pulseaudio] sink.c:     device.icon_name = "audio-card"
I: [pulseaudio] source.c: Created source 4 "combined.monitor" with sample spec s16le 2ch 48000Hz and channel map front-left,front-right
I: [pulseaudio] source.c:     device.description = "Monitor of Simultaneous Output"
I: [pulseaudio] source.c:     device.class = "monitor"
I: [pulseaudio] source.c:     device.icon_name = "audio-input-microphone"
D: [pulseaudio] memblockq.c: memblockq requested: maxlength=16777216, tlength=16777216, base=4, prebuf=1, minreq=0 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq sanitized: maxlength=16777216, tlength=16777216, base=4, prebuf=4, minreq=4 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq requested: maxlength=16777216, tlength=16777216, base=4, prebuf=1, minreq=0 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq sanitized: maxlength=16777216, tlength=16777216, base=4, prebuf=4, minreq=4 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq requested: maxlength=16777216, tlength=16777216, base=4, prebuf=1, minreq=0 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq sanitized: maxlength=16777216, tlength=16777216, base=4, prebuf=4, minreq=4 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq requested: maxlength=16777216, tlength=16777216, base=4, prebuf=1, minreq=0 maxrewind=0
D: [pulseaudio] memblockq.c: memblockq sanitized: maxlength=16777216, tlength=16777216, base=4, prebuf=4, minreq=4 maxrewind=0
D: [combine] module-combine-sink.c: Thread starting up
D: [combine] core-util.c: SCHED_RR|SCHED_RESET_ON_FORK worked.
I: [combine] core-util.c: Successfully enabled SCHED_RR scheduling for thread, with priority 6.
Killed
Comment 3 Tvrtko Ursulin 2012-09-11 16:03:32 UTC
It is the "combine" thread which is busy looping. If I attach to it with strace it unwedges itself. Both on startup and also on daemon shutdown.

Startup, combine using 100% CPU:

Process 31916 attached - interrupt to quit
17:00:58.164278 ppoll([{fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 2, {0, 0}, NULL, 8) = 1 ([{fd=30, revents=POLLIN}], left {0, 0})
17:00:58.164649 read(30, "\1\0\0\0\0\0\0\0", 8) = 8
17:00:58.164891 futex(0x2302f90, FUTEX_WAKE_PRIVATE, 1) = 1
17:00:58.165628 futex(0x2302f90, FUTEX_WAKE_PRIVATE, 1) = 1
17:00:58.166285 ppoll([{fd=40, events=POLLIN}, {fd=36, events=POLLIN}, {fd=37, events=POLLIN}, {fd=33, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, NULL, NULL, 8) = 1 ([{fd=30, revents=POLLIN}])
17:00:58.166451 read(30, "\1\0\0\0\0\0\0\0", 8) = 8
17:00:58.166575 futex(0x2302f90, FUTEX_WAKE_PRIVATE, 1) = 1
17:00:58.167215 ppoll([{fd=44, events=POLLIN}, {fd=40, events=POLLIN}, {fd=36, events=POLLIN}, {fd=41, events=POLLIN}, {fd=37, events=POLLIN}, {fd=33, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 8, NULL, NULL, 8) = 1 ([{fd=30, revents=POLLIN}])
17:00:58.167378 read(30, "\1\0\0\0\0\0\0\0", 8) = 8
17:00:58.167500 ppoll([{fd=48, events=POLLIN}, {fd=44, events=POLLIN}, {fd=40, events=POLLIN}, {fd=36, events=POLLIN}, {fd=45, events=POLLIN}, {fd=41, events=POLLIN}, {fd=37, events=POLLIN}, {fd=33, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 10, NULL, NULL, 8) = 1 ([{fd=48, revents=POLLIN}])
17:00:58.168926 read(48, "\1\0\0\0\0\0\0\0", 8) = 8
17:00:58.169034 ppoll([{fd=48, events=POLLIN}, {fd=44, events=POLLIN}, {fd=40, events=POLLIN}, {fd=36, events=POLLIN}, {fd=45, events=POLLIN}, {fd=41, events=POLLIN}, {fd=37, events=POLLIN}, {fd=33, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 10, NULL, NULL, 8^C <unfinished ...>
Process 31916 detached

Daemon is now unwedged and works fine.

Now on daemon shutdown combine threads starts busy looping again. Attaching strace to it:

Process 31916 attached - interrupt to quit
17:01:31.163684 ppoll([{fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
17:01:31.164670 read(30, "\1\0\0\0\0\0\0\0", 8) = 8
17:01:31.164905 futex(0x2302f90, FUTEX_WAKE_PRIVATE, 1) = 1
17:01:31.165019 ppoll([{fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 2, NULL, NULL, 8) = 1 ([{fd=30, revents=POLLIN}])
17:01:31.165205 read(30, "\1\0\0\0\0\0\0\0", 8) = 8
17:01:31.165332 ppoll([{fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 2, NULL, NULL, 8) = 1 ([{fd=30, revents=POLLIN}])
17:01:31.165443 read(30, "\1\0\0\0\0\0\0\0", 8) = 8
17:01:31.165569 ioctl(2, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
17:01:31.165712 write(2, "D: [combine] module-combine-sink"..., 57) = 57
17:01:31.165861 madvise(0x7ffbf1af5000, 8368128, MADV_DONTNEED) = 0
17:01:31.165973 _exit(0)                = ?

Unwedged.
Comment 4 Tvrtko Ursulin 2012-09-12 09:42:34 UTC
Created attachment 67029 [details]
strace of the whole PA server when live locking

Looking at the code, it looks like this is going towards pa_fdsem implementation or usage. Trying to debug further...

People who know this code feel free to jump in at any moment, don't be shy. :)
Comment 5 Tvrtko Ursulin 2012-09-12 10:41:19 UTC
Created attachment 67041 [details] [review]
Workaround for avoiding the live lock

So in pa_rtpoll_run timeout for ppoll is calculated with micro-second granularity and upscaled to nano-seconds for the actual ppoll call. 

Going deeper, time accounting is done via pa_rtclock_get and pa_timeval_cmp which internally downscale nano-seconds to micro-seconds.

This means if a sleep time is less than one micro-second, it will come out as zero in pa_rtpoll_run. This will make the RT thread busy loop and thread live lock on startup/shutdown.

This workaround patch works for me by ensuring no busy loop happens if time calculation did run and said zero sleep.

Real patch would I assume be to change the pa_rtclock API to work with timespec (nano-seconds) instead of timeval (micro-seconds). That way this down-scaling error would not happen.
Comment 6 Tanu Kaskinen 2012-09-16 07:40:30 UTC
(In reply to comment #5)
> Created attachment 67041 [details] [review] [review]
> Workaround for avoiding the live lock
> 
> So in pa_rtpoll_run timeout for ppoll is calculated with micro-second
> granularity and upscaled to nano-seconds for the actual ppoll call. 
> 
> Going deeper, time accounting is done via pa_rtclock_get and pa_timeval_cmp
> which internally downscale nano-seconds to micro-seconds.
> 
> This means if a sleep time is less than one micro-second, it will come out as
> zero in pa_rtpoll_run. This will make the RT thread busy loop and thread live
> lock on startup/shutdown.
> 
> This workaround patch works for me by ensuring no busy loop happens if time
> calculation did run and said zero sleep.
> 
> Real patch would I assume be to change the pa_rtclock API to work with timespec
> (nano-seconds) instead of timeval (micro-seconds). That way this down-scaling
> error would not happen.

Thanks for the patch, and especially for the debugging! Is it clear to you why the rounding error causes the busy loop in the combine thread? I don't see why that would happen.
Comment 7 Tvrtko Ursulin 2012-09-17 09:26:48 UTC
Are you asking why the timeout smaller than 1us is constantly being calculated, and as such being zero because of the rounding error? I don't know that, I haven't dug into the why the thread calculates this sleep times in a way it does.

Or are you asking why a busy loop in a combine thread hangs the process? That is because it is a SCHED_RT thread hence other threads do not get to run.
Comment 8 Tanu Kaskinen 2012-09-17 10:44:26 UTC
I'm asking why there is a busy loop in the first place. I have looked at the module-combine-sink IO thread code in detail, and I don't understand why your fix helps the situation.

Regarding the fix in general, I don't think sub-microsecond rounding errors in pa_rtpoll_run() should cause problems in the calling code, so I think module-combine-sink should be fixed instead of fixing pa_rtpoll_run(). The fix is still very interesting, because it works even though it looks to me like it shouldn't be needed.

My main suspicion is that this code is the problem:

    if (!u->thread_info.in_null_mode || u->thread_info.timestamp <= now)
        process_render_null(u, now);

If the "now" variable stays below the timestamp for a prolonged time, then there can be a busy loop. If your fix helps the situation, I think it means that the difference of now and the timestamp is only one microsecond. The "now" variable is updated in every loop iteration with pa_rtclock_now(), so the busy loop shouldn't last for longer than one microsecond (which I think is not enough to get the process killed). Maybe pa_rtclock_now() may actually be less granular than one microsecond, so the busy loop can last longer because pa_rtclock_now() keeps returning the same value for a longer time?

I think this sort of issues could be avoided by not having a strict "timestamp <= now" check here, and instead fill the imaginary buffer even when it's not completely empty. I will make a patch doing that soon, so you can test it, but I'd really like to understand why exactly the busy loop happens, instead of having just vague guesses.
Comment 9 Tvrtko Ursulin 2012-09-17 12:48:41 UTC
I don't know the codebase and have no time to study it in this detail but am happy to test your patch once you have it.

With regards to the duration of a busy loop, my observation is that it lasts tens of seconds or a couple of minutes. I have never witnessed it "unlock" iteself from that state unless I killed it or LimitRTTIME was exceeded which I had set to tens of seconds or a couple of minutes to help me with debugging.

I don't know how I can help you understand why the busy loop happens if you say it is not only because of the rounding error. By all means fix that other cause, but I still rounding error should still be fixed. More so because it looks to me all platforms PA compiles on support <1usec granularity and downscaling and then upscaling the time values because rtclock has a micro-second API seems wrong.
Comment 10 Tanu Kaskinen 2012-09-20 06:44:00 UTC
Ok, it turned out that I wasted time trying to debug this only by looking at the code. I was able to reproduce the bug, and with some debug prints the problem became apparent relatively quickly. Maybe I can learn something from this...

So, the problem: when module-combine-sink is in the null mode (i.e. no outputs, can happen in daemon startup and shutdown) and it's in the IDLE state, the timestamp used in the sleep timer doesn't get updated, but the timer is still kept active. I'll attach a fix that disables the timer in the IDLE state.
Comment 11 Tanu Kaskinen 2012-09-20 06:45:58 UTC
Created attachment 67423 [details] [review]
Fix candidate
Comment 12 Tvrtko Ursulin 2012-09-20 11:21:56 UTC
Hm, your patch does not fix the bug here. I still see it look in poll with a zero timeout.
Comment 13 Tvrtko Ursulin 2012-09-20 11:24:43 UTC
Ups, my bad.. typo in the RPM spec file used to build the package. Give me a moment...
Comment 14 Tvrtko Ursulin 2012-09-20 11:34:09 UTC
Yes, fix looks good after all. I wasn't actually applying it in my first test due to a copy and paste error.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.