Bug 66962 - early request mode breaks with high latency sink
Summary: early request mode breaks with high latency sink
Status: RESOLVED FIXED
Alias: None
Product: PulseAudio
Classification: Unclassified
Component: clients (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: pulseaudio-bugs
QA Contact: pulseaudio-bugs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-16 13:54 UTC by Pierre Ossman
Modified: 2015-01-28 08:51 UTC (History)
4 users (show)

See Also:
i915 platform:
i915 features:


Attachments
early-request-bandaid.patch (2.75 KB, text/plain)
2013-07-16 13:54 UTC, Pierre Ossman
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Pierre Ossman 2013-07-16 13:54:03 UTC
Created attachment 82483 [details]
early-request-bandaid.patch

As it is implemented, the early request mode can in some cases be counter-productive. The mode is designed to give the client a steady request/report rate of small-ish chunks[1].

Unfortunately PulseAudio does not have any mechanism for telling a sink/source how often it should request/report data. So a more blunt hack was applied where the entire latency is restricted to the fragment size.

So far so good, but where the current code breaks down is when the sink cannot satisfy this tiny latency request. We then "report" to the client what we can guarantee by setting the fragment size to the sink's/source's full buffer size/latency.

This severely changes the resulting buffer attributes from what the client requested, and in practice breaks applications. The most prominent user of this feature is the ALSA plugin, and it doesn't even have a mechanism of adapting to the server giving back something different than what was requested.


So long term, the whole early request mode needs to be implemented in a better way. Either the sink's/source's need to grow the ability to control request/report rate. Or we put some form of timer based emulation in front of them on behalf of these clients.


Short term, we should change the behaviour of what happens when we cannot guarantee a fragment rate. Instead of giving the client really shitty buffering parameters as a result, we should just keep the requested attributes and do things on a best-effort basic. Basically how things would behave if the client didn't have the early request bit at all.

The attached patch does just that, as well as expand on the comment about how the early request thing is implemented.


[1] A somewhat silly client requirement but at least Flash and Firefox break horribly when you break this.
Comment 1 David Henningsson 2013-11-01 13:39:35 UTC
Hi and sorry for the late review.

Changing client-facing stuff is always scary, because you often end up fixing one application and breaking another.

That said, I think this band-aid could very well improve things in some contexts, so I'm not totally against adding it. Do you have a good test case for it?
Comment 2 Pierre Ossman 2013-11-01 14:35:13 UTC
Set up a pulse server with module_tunnel as the sink and you should be able to reproduce this using video playback in flash. That's how we discovered it.
Comment 3 Raymond 2013-11-04 03:20:16 UTC
if you already know usb audio period time is more than 1ms 

why do alsa-plugin still announce a min period bytes of 128 bytes ?
Comment 4 Pierre Ossman 2013-11-04 07:26:15 UTC
(In reply to comment #3)
> if you already know usb audio period time is more than 1ms 
> 
> why do alsa-plugin still announce a min period bytes of 128 bytes ?

It doesn't know until it is too late to change it. Some clever redesign would be needed to fix this.
Comment 5 David Henningsson 2013-11-12 14:07:41 UTC
So, I think there are two things that's currently keeping me from applying this patch.

1) Understanding. I'm not sure I understand what it is within Flash's way of handling its audio flow that causes the existing implementation to fail. 

2) Fear of regressions. My bad experience tells me that when you fix one scenario you break another. This patch, AFAIK, has had very limited testing across both applications and hardware.
Comment 6 Pierre Ossman 2013-11-12 18:31:57 UTC
(In reply to comment #5)
> So, I think there are two things that's currently keeping me from applying
> this patch.
> 
> 1) Understanding. I'm not sure I understand what it is within Flash's way of
> handling its audio flow that causes the existing implementation to fail. 
> 

The specifics aren't known. It being proprietary isn't exactly helping in analysing the issue.

My guess is that it is either using some kind of timer to handle audio and/or has its own internal buffer. It configures alsa with e.g. 50 ms fragment size. It then expects that every 50 ms it can provide more data and everything is fine. But PulseAudio only requests data e.g. every 400 ms. So after those 50 ms are up, Flash stops buffering and waits for ALSA to wake up. The end result being underruns.

> 2) Fear of regressions. My bad experience tells me that when you fix one
> scenario you break another. This patch, AFAIK, has had very limited testing
> across both applications and hardware.

Always a risk. On the plus side this will only affect a very small number of cases. You have to a) have a client that requests the early request mode, and b) have a sink that cannot configure itself for low latency.

So given that it's mostly corner cases, I don't see how to get any reasonable testing of this without exposing it to the world.
Comment 7 David Henningsson 2013-11-13 08:31:53 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > So, I think there are two things that's currently keeping me from applying
> > this patch.
> > 
> > 1) Understanding. I'm not sure I understand what it is within Flash's way of
> > handling its audio flow that causes the existing implementation to fail. 
> > 
> 
> The specifics aren't known. It being proprietary isn't exactly helping in
> analysing the issue.

Do you also experience this in other open source applications, or is Flash the only one hit by this problem?

Changing buffering behaviour due to one closed source application feels like hunting a moving target. What if Flash decides to something differently next time?

> My guess is that it is either using some kind of timer to handle audio
> and/or has its own internal buffer. It configures alsa with e.g. 50 ms
> fragment size. It then expects that every 50 ms it can provide more data and
> everything is fine. But PulseAudio only requests data e.g. every 400 ms. So
> after those 50 ms are up, Flash stops buffering and waits for ALSA to wake
> up. The end result being underruns.

If those are realistic numbers, it sounds like we more need to do the timing thing in the ALSA plugin, rather than the bandaid you provided.

> > 2) Fear of regressions. My bad experience tells me that when you fix one
> > scenario you break another. This patch, AFAIK, has had very limited testing
> > across both applications and hardware.
> 
> Always a risk. On the plus side this will only affect a very small number of
> cases. You have to a) have a client that requests the early request mode,
> and b) have a sink that cannot configure itself for low latency.

For a), as you know ALSA-plugin is using the early request mode, and there are *a lot* of ALSA application.
For b), I think bluetooth a2dp would be a reasonably common high latency sink scenario. 

> So given that it's mostly corner cases, I don't see how to get any
> reasonable testing of this without exposing it to the world.

See above.
Comment 8 Pierre Ossman 2013-11-13 16:32:37 UTC
(In reply to comment #7)
> 
> Do you also experience this in other open source applications, or is Flash
> the only one hit by this problem?
> 

I think that Firefox' built in video player also suffered from the issue.

Unfortunately this was not the only bug we were struggling with, so we don't have a full picture of which applications were affected by which issues. Given the nature of the bug though, it should be limited to programs relying on the ALSA plugin.

> Changing buffering behaviour due to one closed source application feels like
> hunting a moving target. What if Flash decides to something differently next
> time?

Always a risk. But this change is not just about satisfying the whims of Flash. It does make PulseAudio better in a general sense, at least for the ALSA emulation. The current ALSA plugin does not fully satisfy the ALSA API from the applications point of view. Flash just turned out to be more sensitive to violations of the ALSA API than most.

IOW, I don't believe this to be a workaround for a bug in Flash. This is a workaround for a limited architecture of the ALSA PulseAudio plugin. The point is to minimise its broken behaviour.

A better fix is most likely possible. But not without a lot of work. So this bandaid is low hanging fruit IMO. And proprietary or not, Flash is a very common application. I don't find it acceptable to not have it working.

> 
> > My guess is that it is either using some kind of timer to handle audio
> > and/or has its own internal buffer. It configures alsa with e.g. 50 ms
> > fragment size. It then expects that every 50 ms it can provide more data and
> > everything is fine. But PulseAudio only requests data e.g. every 400 ms. So
> > after those 50 ms are up, Flash stops buffering and waits for ALSA to wake
> > up. The end result being underruns.
> 
> If those are realistic numbers, it sounds like we more need to do the timing
> thing in the ALSA plugin, rather than the bandaid you provided.
> 

Indeed. But it's a lot more work and not as easily solved.

> > Always a risk. On the plus side this will only affect a very small number of
> > cases. You have to a) have a client that requests the early request mode,
> > and b) have a sink that cannot configure itself for low latency.
> 
> For a), as you know ALSA-plugin is using the early request mode, and there
> are *a lot* of ALSA application.
> For b), I think bluetooth a2dp would be a reasonably common high latency
> sink scenario. 
> 
> > So given that it's mostly corner cases, I don't see how to get any
> > reasonable testing of this without exposing it to the world.
> 
> See above.

Fair enough. Bluetooth users might be in the risk of regressions. Should be simple enough to test if you have a A2DP device available?
Comment 9 David Henningsson 2013-11-15 13:06:32 UTC
As a side note, there is also bug https://bugs.freedesktop.org/show_bug.cgi?id=71204 which is related to the ALSA plugin layer and flash. I wonder if these two bugs are related?
Comment 10 Maarten Lankhorst 2014-03-25 08:53:51 UTC
This problem has become a lot worse now that tsched is disabled on USB devices because of commit 826c8f69d34ef49 "alsa: Disable timer-scheduling for PCMs with the BATCH flag". Winepulse works similarly to alsa and also needs early request mode to function properly. Most applications cannot handle a pointer granularity of > 50 ms, especially when they use winepulse indirectly through dsound.
Comment 11 Raymond 2014-03-25 11:40:24 UTC
> 
> My guess is that it is either using some kind of timer to handle audio
> and/or has its own internal buffer. It configures alsa with e.g. 50 ms
> fragment size. It then expects that every 50 ms it can provide more data and
> everything is fine. But PulseAudio only requests data e.g. every 400 ms. So
> after those 50 ms are up, 

50ms is still too long if you are playing video with 24 frames per second,
Comment 12 David Henningsson 2014-03-28 08:06:55 UTC
(In reply to comment #10)
> This problem has become a lot worse now that tsched is disabled on USB
> devices because of commit 826c8f69d34ef49 "alsa: Disable timer-scheduling
> for PCMs with the BATCH flag". Winepulse works similarly to alsa and also
> needs early request mode to function properly. Most applications cannot
> handle a pointer granularity of > 50 ms, especially when they use winepulse
> indirectly through dsound.

Could you elaborate on why disabling tsched makes this problem worse? Without tsched we would default to 4 periods of 25 ms each, so you should get reports every 25 ms...right?

And btw, in Ubuntu it was changed (before my time) to 8 periods of 10 ms each.
Comment 13 David Henningsson 2014-05-23 12:48:06 UTC
While I would have liked to see more testing of this patch, that does not seem to happen and it seems to solve at least Pierre's problem.

So I applied it in the hope that it will lead to more testing of the patch. Hopefully it won't regress other stuff, but who knows...


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.