[pulseaudio-discuss] [PATCH 07/13] loopback: Refactor latency initialization

Thu Nov 26 09:47:34 PST 2015

On Thu, 2015-11-26 at 08:41 +0100, Georg Chini wrote:
> On 26.11.2015 01:49, Tanu Kaskinen wrote:
> > On Wed, 2015-11-25 at 22:58 +0100, Georg Chini wrote:
> > > On 25.11.2015 19:49, Tanu Kaskinen wrote:
> > > > On Wed, 2015-11-25 at 16:05 +0100, Georg Chini wrote:
> > > > > On 25.11.2015 09:00, Georg Chini wrote:
> > > > > > OK, understood. Strange that you are talking of 75% and 25%
> > > > > > average buffer fills. Doesn't that give a hint towards the connection
> > > > > > between sink latency and buffer_latency?
> > > > > > I believe I found something in the sink or alsa code back in February
> > > > > > which at least supported my choice of the 0.75, but I have to admit
> > > > > > that I can't find it anymore.
> > > > > Let's take the case I mentioned in my last mail. I have requested
> > > > > 20 ms for the sink/source latency and 5 ms for the memblockq.
> > > > What does it mean that you request 20 ms "sink/source latency"? There
> > > > is the sink latency and the source latency. Does 20 ms "sink/source
> > > > latency" mean that you want to give 10 ms to the sink and 10 ms to the
> > > > source? Or 20 ms to both?
> > > I try to configure source and sink to the same latency, so when I
> > > say source/sink latency = 20 ms I mean that I configure both to
> > > 20 ms.
> > > In the end it may be possible that they are configured to different
> > > latencies (for example HDA -> USB).
> > > The minimum necessary buffer_latency is determined by the larger
> > > of the two.
> > > For simplicity in this thread I always assume they are both equal.
> > > 
> > > > > The
> > > > > 20 ms cannot be satisfied, I get 25 ms as sink/source latency when
> > > > > I try to configure it (USB device).
> > > > I don't understand how you get 25 ms. default_fragment_size was 5 ms
> > > > and default_fragments was 4, multiply those and you get 20 ms.
> > > You are right. The configured latency is 20 ms but in fact I am seeing
> > > up to 25 ms.
> > 25 ms reported as the sink latency? If the buffer size is 20 ms, then
> > that would mean that there's 5 ms buffered later in the audio path.
> > That sounds a bit high to me, but not impossible. My understanding is
> > that USB transfers audio in 1 ms packets, so there has to be at least 1
> > ms extra buffer after the basic alsa ringbuffer, maybe the extra buffer
> > contains several packets.
> 
> I did not check the exact value, maybe it is not 25 but 24 ms, anyway
> significantly larger than the configured value.
> 
> > 
> > > > > For the loopback code it means that the target latency is not what
> > > > > I specified on the command line but the average sum of source and
> > > > > sink latency + buffer_latency.
> > > > The target latency should be "configured source latency +
> > > > buffer_latency + configured sink latency". The average latency of the
> > > > sink and source don't matter, because you need to be prepared for the
> > > > worst case scenario, in which the source buffer is full and the sink
> > > > wants to refill its buffer before the source pushes its buffered audio
> > > > to the memblockq.
> > > Using your suggestion would again drastically reduce the possible
> > > lower limit. Obviously it is not necessary to go to the full range.
> > How is that obviously not necessary? For an interrupt-driven alsa
> > source I see how that is not necessary, hence the suggestion for
> > optimization, but other than that, I don't see the obvious reason.
> 
> Obviously in the sense that it is working not only for interrupt-driven
> alsa sources but also for bluetooth devices and timer-based alsa devices.
> I really spent a lot of time with stability tests, so I know it is working
> reliably for the devices I could test.
> 
> > 
> > > That special case is also difficult to explain. There are two situations,
> > > where I use the average sum of source and sink latency.
> > > 1) The latency specified cannot be satisfied
> > > 2) sink/source latency and buffer_latency are both specified
> > > 
> > > In case 1) the sink/source latency will be set as low as possible
> > > and buffer_latency will be derived from the sink/source latency
> > > using my safeguards.
> > > In case 2) sink/source latency will be set to the nearest possible
> > > value (but may be higher than specified), and buffer_latency is
> > > set to the commandline value.
> > > 
> > > Now in both cases you have sink/source latency + buffer_latency
> > > as the target value for the controller - at least if you want to handle
> > > it similar to the normal operation.
> > > The problem now is that the configured sink/source latency is
> > > possibly different from what you get on average. So I replaced
> > > sink/source latency with the average sum of the measured
> > > latencies.
> > Of course the average measured latency of a sink or source is lower
> > than the configured latency. The configured latency represents the
> > situation where the sink or source buffer is full, and the buffers
> > won't be full most of the time. That doesn't mean that the total
> > latency doesn't need to be big enough to contain both of the configured
> > latencies, because you need to handle the case where both buffers
> > happen to be full at the same time.
> 
> I am not using sink or source latency alone, I am using the
> average sum of source and sink latency, which is normally
> slightly higher than a single configured latency.
> 
> How can it be possible that both buffers are full at the same
> time? This could only happen if there is some congestion and
> then there is a problem with the audio anyway. In a steady
> state, when one buffer is mostly empty, the other one must be
> mostly full. Otherwise the latency would jump around wildly.

There are three buffers, not two. The sum of the buffer fill levels of
the sink and source will jump around wildly, but the total latency will
stay constant, because the memblockq always holds the amount of audio
that corresponds to the empty space of the sink and source buffers.

Note that the measured latency of the sink and source doesn't jump
around that wildly, because the measurement often causes a reset in the
buffer fill levels. But every time the sink buffer is refilled or the
source pushes out data, the latency of the sink or source jumps. The
sink refills and source emptying don't (generally) happen in a
synchronized manner, so the latency sum of the sink and source does
jump around.

If the sink and source were synchronized, the combined latency wouldn't
jump around, and you could reduce the total latency. But that's not the
case.
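
To illustrate the three-buffer point, here is a toy model in Python. It
is only a sketch, not PulseAudio code: the 1 ms tick, the function name
and all parameter values are assumptions made up for the illustration.
The source pushes its data to the queue once per fragment, and the sink
refills from the queue when its fill level drops to a threshold.

```python
def simulate(ticks, source_period=5, sink_size=20, sink_refill_at=15,
             queue_start=10, sink_phase=2):
    """Toy model of source -> memblockq -> sink, all values in ms (hypothetical)."""
    source, queue, sink = 0, queue_start, sink_size - sink_phase
    totals, edge_sums = [], []
    for _ in range(ticks):
        source += 1  # the source captures 1 ms of audio
        sink -= 1    # the sink plays 1 ms of audio
        if source >= source_period:
            # The source pushes a full fragment into the queue.
            queue += source
            source = 0
        if sink <= sink_refill_at:
            # The sink refills from the queue, as far as the queue allows.
            take = min(sink_size - sink, queue)
            queue -= take
            sink += take
        totals.append(source + queue + sink)
        edge_sums.append(source + sink)
    return totals, edge_sums


totals, edge_sums = simulate(50)
print(sorted(set(totals)))     # total latency: a single value
print(sorted(set(edge_sums)))  # source + sink fill: several values
```

In this model the total stays at one value while the source + sink sum
moves between two levels, because the pushes and the refills are not
aligned with each other.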

> > > The average is also used to compare the "real"
> > > source/sink latency + buffer_latency
> > > against the configured overall latency and the larger of the two
> > > values is the controller target. This is the mechanism used
> > > to increase the overall latency in case of underruns.
> > I don't understand this paragraph. I thought the reason why the
> > measured total latency is compared against the configured total latency
> > is that you then know whether you should increase or decrease the sink
> > input rate. I don't see how averaging the measurements helps here.
> 
> Normally, the configured overall latency is used as a target for
> the controller. Now there must be some way to detect during
> runtime if this target is something that can be achieved at all.

You know whether the target is achievable when you know the maximum
latency of the sink and source. If the sum of those is larger than the
target, the target is not achievable. Measuring the average latency
doesn't bring any new information.
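
As a minimal sketch of that check (the function and parameter names are
hypothetical, nothing like this exists in module-loopback under these
names):

```python
def target_is_achievable(target_ms, max_source_latency_ms, max_sink_latency_ms):
    """Return False when the maximum latencies alone already exceed the target.

    Hypothetical helper: only the maximum latencies of the source and the
    sink are needed, no averaged measurements.
    """
    return max_source_latency_ms + max_sink_latency_ms <= target_ms


print(target_is_achievable(40, 20, 20))
print(target_is_achievable(30, 20, 20))
```

The first call passes because 20 + 20 does not exceed 40; the second one
fails because the two buffers alone already need more than 30 ms.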

By the way, the fact that the real sink latency can be higher than the
configured latency is a problem when deciding whether the target
latency can be achieved. The extra latency needs to be compensated for
by decreasing buffer_latency, and if buffer_latency has no such margin,
then the target latency is not achievable.

It's not currently possible to separate the alsa ringbuffer latency
from the total sink latency, so you don't know how much such extra
latency there is. You could look at the sink latency reports, and if they
go beyond the configured latency, then you know that there's *at least*
that much extra latency, but it would be nice if the alsa sink could
report the ringbuffer and total latencies separately. The alsa API
supports this, but PulseAudio's own APIs don't. The source has the same
problem, but it also has the additional problem that the latency
measurements cause the buffer to be emptied first, so the measurements
never show larger latencies than configured. I propose that for now we
ignore such extra latencies. We could add some safety margin to
buffer_latency to cover these latencies, but it's not nice to force
that for cases where such extra latencies don't exist.

> So I compare the target value against buffer_latency +
> average_sum_of_source_and_sink_latency and set the controller
> target to the larger of the two.
> This is the way the underrun protection works. In normal operation,
> the configured overall latency is larger than the sum above and
> buffer_latency is not used at all. When underruns occur, buffer_latency
> is increased until the sum gets larger than the configured latency
> and the controller switches the target.
> 
> > 
> > And what does this have to do with increasing the latency on underruns?
> > If you get an underrun, then you know buffer_latency is too low, so you
> > bump it up by 5 ms (if I recall your earlier email correctly), causing
> > the configured total latency to go up by 5 ms as well. As far as I can
> > see, the measured latency is not needed for anything in this operation.
> > 
> > ----
> > 
> > Using your example (usb sound card with 4 * 5 ms sink and source
> > buffers), my algorithm combined with the alsa source optimization
> > yields the following results:
> > 
> > configured sink latency = 20 ms
> > configured source latency = 20 ms
> > maximum source buffer fill level = 5 ms
> > buffer_latency = 0 ms
> > target latency = 25 ms
> > 
> > So you see that the results aren't necessarily overly conservative.
> 
> That's different from what you proposed above, but sounds
> like a reasonable approach. The calculation would be slightly
> different because I defined buffer_latency = 5 ms on the
> command line. So the result would be 30 ms, which is more
> sensible. First we already know that the 25 ms won't work.
> Second, the goal of the calculation was to find a working
> target latency using the configured buffer_latency, so you
> can't ignore it.
> My calculation leads to around 27.5 ms instead of your 30 ms,
> so the two values are near enough to each other and your
> proposal has the advantage of being constant.
> 
> I will replace the average sum by
> 0.25 * configured_source_latency + configured_sink_latency.
> in the next version if my tests with that value are successful.

Do you mean that you're going to use 0.25 as the multiplier regardless
of the number of fragments?

Previously I've been saying that in the general case the target latency
should be "configured source latency + buffer_latency + configured sink
latency". To generalize the alsa source exception, I'll use the
following definition instead from now on: "target latency = maximum
source buffer fill level + buffer_latency + maximum sink buffer fill
level". Usually the maximum fill levels have to be assumed to be the
same as the configured latencies, but in the interrupt-driven alsa
source case the maximum fill level is known to be "configured source
latency / fragments".
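
Written out as a small Python sketch, the definition looks roughly like
this. The function and parameter names are made up for illustration, and
the numbers are the ones from the usb example in this thread (4 * 5 ms
buffers).

```python
def max_source_fill_ms(configured_latency_ms, fragments=None,
                       interrupt_driven_alsa=False):
    """Maximum source buffer fill level in ms (hypothetical helper)."""
    if interrupt_driven_alsa and fragments:
        # Interrupt-driven alsa source: at most one fragment is buffered.
        return configured_latency_ms / fragments
    # General case: assume the buffer can be completely full.
    return configured_latency_ms


def target_latency_ms(source_latency_ms, sink_latency_ms, buffer_latency_ms,
                      fragments=None, interrupt_driven_alsa=False):
    """Maximum source fill + buffer_latency + maximum sink fill."""
    source_fill = max_source_fill_ms(source_latency_ms, fragments,
                                     interrupt_driven_alsa)
    # The maximum sink fill level is assumed to equal the configured latency.
    return source_fill + buffer_latency_ms + sink_latency_ms


# Interrupt-driven alsa source, buffer_latency = 0 ms: 5 + 0 + 20
print(target_latency_ms(20, 20, 0, fragments=4, interrupt_driven_alsa=True))
# Same setup with buffer_latency = 5 ms: 5 + 5 + 20
print(target_latency_ms(20, 20, 5, fragments=4, interrupt_driven_alsa=True))
# General case, no alsa source exception: 20 + 5 + 20
print(target_latency_ms(20, 20, 5))
```

The multiplier for the source is 1 / fragments only in the
interrupt-driven alsa case; everywhere else the full configured latency
is used.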

> I'll keep track of that sum anyway, just to ensure that it is not
> larger than the value above.
> Using your value also solves another problem which always
> worried me a bit: The average sum is re-calculated on each
> adjust_time, so the controller target is moving in that case.
> 
> > 
> > buffer_latency shouldn't be zero, of course, if you want to protect
> > against rate errors, scheduling delays and jitter[1], but my point is
> > that buffer_latency shouldn't be proportional to the fragment size
> > (unless you can show how the rate errors, scheduling delays or jitter
> > are proportional to the fragment size).
> 
> Maybe my explanations above clarify the role of buffer_latency a bit.
> What you are saying now is that buffer_latency only needs to be
> large enough to account for the jitter.

Well, I'm saying (and have meant that all the time) that buffer_latency
only needs to be large enough to account for the latency measurement
jitter and other things that can cause the memblockq to run empty: rate
errors, scheduling delays or whatever else. Or do you define "jitter"
to mean all of these issues?

> This is a contradiction to
> what you said earlier. I am not really sure where this discussion
> is leading to. We are also mixing up different topics at the moment.
> The first one is a matter of the safeguards. As already said in a
> previous mail, in my opinion those safeguards only have to cover
> the most common cases and do not need to be perfect because
> the controller will take care at runtime.
> The second topic is the usage of the average source/sink latency
> in the controller, which is a runtime and not a startup topic. But
> if you can agree to my calculation above, I consider this settled.

I don't agree that 0.25 should be used to figure out the maximum buffer
fill level of a source, unless it's an interrupt-driven alsa source
with 4 fragments.

-- 
Tanu