Bug 99667 - data race in pa_queue_pop/pa_queue_push reported by TSan
Summary: data race in pa_queue_pop/pa_queue_push reported by TSan
Status: RESOLVED NOTOURBUG
Alias: None
Product: PulseAudio
Classification: Unclassified
Component: core (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: pulseaudio-bugs
QA Contact: pulseaudio-bugs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-02-03 14:29 UTC by Fabrice Bellet
Modified: 2017-04-14 08:56 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
ThreadSanitizer: data race pulsecore/queue.c:98 in pa_queue_pop (29.82 KB, text/plain)
2017-02-03 14:29 UTC, Fabrice Bellet
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Fabrice Bellet 2017-02-03 14:29:45 UTC
Created attachment 129321 [details]
ThreadSanitizer: data race pulsecore/queue.c:98 in pa_queue_pop

This output from ThreadSanitizer suggests a data race with the glib mainloop.

The application is empathy. A pa_glib_mainloop is created in the main thread to monitor the status of the microphone in the user interface, and a gstreamer pipeline is constructed and running. This pipeline contains a pulsesrc element, with its own pa_threaded_mainloop.

The problem, as I understand the output of the sanitizer, is that some events concerning the pa_glib_mainloop are pushed by the pulsesrc thread, and pulled by the main thread, without common locking, causing this possible race.

The first mutex owned by the pulsesrc thread concerns a GstPad, and the second mutex is related to its pa_threaded_mainloop.

Can this race really happen, or should it be considered as a false positive ? Where is locking missing ?
Comment 1 Tanu Kaskinen 2017-04-03 09:40:47 UTC
Sorry for the slow reply...

So empathy has its own pa_context (connection to the pulseaudio daemon) that runs in the main thread, and pulsesrc has another pa_context that runs in a different thread. It looks like empathy mixes these two connections. What does empathy_audio_src_source_output_index_notify() do? It looks like that function is a callback that runs in the pulsesrc thread, and it accesses the main thread's pa_context.
Comment 2 Tanu Kaskinen 2017-04-03 09:54:35 UTC
Actually, I think the other thread is not the thread where pulsesrc's pa_context runs, it's just some internal thread of gstreamer. Empathy has some callback running in that thread, and the callback incorrectly accesses the pa_context object that is supposed to be used only from the main thread.

As a solution, I'd look into sending a message from the gstreamer thread to the main thread (I'm not very familiar with glib, so I don't know exactly how that is done), and do the pa_context access from the main thread. I suppose this could also be solved by adding more locking, but I personally prefer message passing over locks.
Comment 3 Arun Raghavan 2017-04-07 14:58:49 UTC
Looks like Empathy has connected a notify:: signal handler to some property on pulsesrc, and is working on its pa_context in the thread that the signal is emitted from. As Tanu points out, what it should be doing is sending a bus message from the callback and then performing the actual PA calls in the bus handler (which will be in the glib mainloop thread).
Comment 4 Fabrice Bellet 2017-04-11 12:58:43 UTC
Thank you for your time of this case. With your suggestion, I confirm that delaying the work from the notify callback of the gstreamer thread, so that it happens in the glib mainloop thread instead, works for me and makes tsan happy:

> @@ -551,7 +554,13 @@ empathy_mic_monitor_get_current_mic_async (EmpathyMicMonitor *self,
>    operation = operation_new (operation_get_current_mic, simple);
>    g_queue_push_tail (priv->operations, operation);
> 
> -  operations_run (self);
> +  /* we make sure this function containing pulseaudio operations is
> +   * called from the default mainloop, by the same thread than the
> +   * one running the pa_glib_mainloop. This is needed because PA
> +   * is not thread-safe. Other functions calling operations_run()
> +   * are called from the main thread, and don't need this indirection.
> +   */
> +  g_idle_add (operations_run, self);
>  }
> 
>  guint
Comment 5 Arun Raghavan 2017-04-12 07:12:01 UTC
Nice. Are you planning on pushing this upstream to Empathy?
Comment 6 Fabrice Bellet 2017-04-14 08:56:02 UTC
Sure, it's in progress:

https://bugzilla.gnome.org/show_bug.cgi?id=781180


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.