Bug 81075 - pa_simple_get_latency() reports latency inaccurately
Summary: pa_simple_get_latency() reports latency inaccurately
Status: RESOLVED FIXED
Alias: None
Product: PulseAudio
Classification: Unclassified
Component: misc (show other bugs)
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: pulseaudio-bugs
QA Contact: pulseaudio-bugs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-09 03:19 UTC by michaelni
Modified: 2017-07-20 14:01 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments
fix (1.68 KB, patch)
2014-07-09 09:04 UTC, Tanu Kaskinen
Details | Splinter Review
Test program (1.76 KB, text/plain)
2014-07-13 14:29 UTC, Tanu Kaskinen
Details
Output from the test program, before patching (9.22 KB, text/plain)
2014-07-13 14:30 UTC, Tanu Kaskinen
Details
Output from the test program, after patching (9.64 KB, text/plain)
2014-07-13 14:31 UTC, Tanu Kaskinen
Details
Graph for before.csv (9.21 KB, image/png)
2014-07-13 14:35 UTC, Tanu Kaskinen
Details
Graph for after.csv (9.58 KB, image/png)
2014-07-13 14:36 UTC, Tanu Kaskinen
Details
Change latency estimation to account for already-read data in pa_simple_get_latency(). (957 bytes, patch)
2017-06-01 22:30 UTC, Ted
Details | Splinter Review
Test program output after patch 102473 and before patch 131661 (8.69 KB, text/csv)
2017-06-01 22:32 UTC, Ted
Details
Test program output after both patch 102473 and patch 131661 (8.60 KB, text/csv)
2017-06-01 22:33 UTC, Ted
Details
Output graph after patch 102473 and before patch 131661 (26.66 KB, image/png)
2017-06-01 22:34 UTC, Ted
Details
Output graph after both patch 102473 and patch 131661 (18.19 KB, image/png)
2017-06-01 22:35 UTC, Ted
Details
"Fix" integer overflow and negative latency issues and add comments (seems to actually make things worse) (2.47 KB, patch)
2017-06-17 16:42 UTC, Tanu Kaskinen
Details | Splinter Review

Description michaelni 2014-07-09 03:19:20 UTC
When trying to assign wallclock timestamps to recorded audio from the simple API, I noticed that latencies from pa_simple_get_latency() seemed behaving more oddly than expected. The issue was solved/worked around by using the "not simple" API.
From looking at simple.c it seems that the p->read_data / p->read_length buffer used in pa_simple_read() isnt considered in pa_simple_get_latency(). Which would explain the issue ive seen with latencies
Comment 1 Tanu Kaskinen 2014-07-09 09:04:42 UTC
Created attachment 102473 [details] [review]
fix

Your analysis seems correct. I attached a fix. It's untested - could you test it?
Comment 2 michaelni 2014-07-10 03:00:56 UTC
Ive done a quick test by patching the old pulseaudio (1:1.1-0ubuntu15.4) thats used on that ubuntu box. And it does not seem to fix it completely but i wouldnt be surprised if that old version had other issues. I know i should test the patch with latest pulse but for that i have too many other things to do ATM.
But its really easy to test this, you just need to run pa_simple_read(), pa_simple_get_latency() and something like gettimeofday() in a loop and check if the values appear matching up, that is big jumps in the wallclock should matchup with big jumps in the latency values, and the other way around too
Comment 3 Tanu Kaskinen 2014-07-13 14:29:35 UTC
Created attachment 102705 [details]
Test program

I wrote a test program, it's attached.
Comment 4 Tanu Kaskinen 2014-07-13 14:30:25 UTC
Created attachment 102706 [details]
Output from the test program, before patching
Comment 5 Tanu Kaskinen 2014-07-13 14:31:01 UTC
Created attachment 102707 [details]
Output from the test program, after patching
Comment 6 Tanu Kaskinen 2014-07-13 14:35:48 UTC
Created attachment 102708 [details]
Graph for before.csv
Comment 7 Tanu Kaskinen 2014-07-13 14:36:38 UTC
Created attachment 102709 [details]
Graph for after.csv
Comment 8 Tanu Kaskinen 2014-07-13 14:46:22 UTC
I attached before/after output from the test program (both were about one minute runs), as well as graphs generated from those outputs.

What does the test program do, and how to interpret the output files? The program calls pa_simple_read() in an infinite loop, reading 10000 bytes at a time. After each read, the program prints "x,y" to stdout, where x is usecs since the first read (wall clock time) and y is the difference between x and the cumulative amount of audio that has been read plus the current latency as reported by pa_simple_get_latency(). I believe that if the latency reporting is perfect, y should be constant (or change slowly and steadily if there's drift between the wall clock and the sound card clock).

The patch seems to improve things, but the graph after patching is still far from a straight line. I don't have time to dig in deeper at this time, so I think it's best if I apply the patch to master and leave this bug open.
Comment 9 Ted 2017-06-01 22:30:16 UTC
Created attachment 131661 [details] [review]
Change latency estimation to account for already-read data in pa_simple_get_latency().

Small change to the previous patch to account for the fact that read_index + read_length may change over time, to make the latency estimate more consistent.
Comment 10 Ted 2017-06-01 22:32:15 UTC
Created attachment 131662 [details]
Test program output after patch 102473 and before patch 131661
Comment 11 Ted 2017-06-01 22:33:03 UTC
Created attachment 131663 [details]
Test program output after both patch 102473 and patch 131661
Comment 12 Ted 2017-06-01 22:34:27 UTC
Created attachment 131664 [details]
Output graph after patch 102473 and before patch 131661

Ran for 60 seconds on e3b64d8fd394fdd0b70dfd14a4a9b90c27369a99
Comment 13 Ted 2017-06-01 22:35:22 UTC
Created attachment 131665 [details]
Output graph after both patch 102473 and patch 131661

Ran for 60 seconds
Comment 14 Ted 2017-06-01 22:46:22 UTC
Hi,
I noticed that pa_simple_get_latency() was reporting that the audio was being recorded out of order. I've attached a patch that makes it report (approximately) monotonically increasing recording time.
Comment 15 Tanu Kaskinen 2017-06-17 16:37:51 UTC
Awesome, thanks for the fix! I applied it to the "next" branch, so it won't be in the next release (we're in the freeze phase), but it will be in PA 12.0.

There's a problem with integer overflowing, though. pa_usec_t is an unsigned type, so I don't think it's a good idea to store negative values in it (even if it seems to work pretty well). I tried to fix it, but when I tested my fix, the results were worse. My fix included changing the code so that if the latency is negative, pa_simple_get_latency() returns zero (because it doesn't have a mechanism for reporting negative latencies), and it looks like pa_stream_get_latency() is reporting negative latencies even when it should have no reason to do so, and I'm trying to figure what's going on.
Comment 16 Tanu Kaskinen 2017-06-17 16:42:55 UTC
Created attachment 132024 [details] [review]
"Fix" integer overflow and negative latency issues and add comments (seems to actually make things worse)

Here's the fixup patch I currently have.
Comment 17 Ted 2017-06-17 17:25:22 UTC
Thanks. I suppose the right way to handle negative latency would be to use the full API instead of the simple API?
Comment 18 Tanu Kaskinen 2017-06-17 18:04:53 UTC
With the full API you can get the negative latencies in your application, yes, but I'm not sure if that's any more "right", unless you're recording from monitor sources that really can have negative latencies (negative latency means "seeing in the future", and with monitor sources that's possible, because you may sometimes get the audio to your application before the sound goes to the speakers).

If you're not using a monitor source, a negative latency means that there's a bug in PulseAudio. I think my patch is ok, even though changing negative latencies to zero makes the latency calculations jumpy. The root cause seems to be in pa_stream_get_latency(), more specifically in the smoother that estimates how much audio has been played in total based on the current time and the past timing information updates from the server. The smoother is consistently estimating the played amount to be too low, and that makes the calculated latency negative.
Comment 19 Tanu Kaskinen 2017-07-11 18:23:48 UTC
I found a bug at the server end of the native protocol implementation. Now the latency reports are rather stable, at least in my experiments.

I submitted the native protocol fix here:
https://patchwork.freedesktop.org/patch/166232/

I also submitted the patch regarding negative latency reporting in pa_simple_get_latency() that I already attached to this bug. Here's the submission:
https://patchwork.freedesktop.org/patch/166236/
Comment 20 Tanu Kaskinen 2017-07-20 14:01:34 UTC
All patches are now in master, so I'm closing this bug.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.