[pulseaudio-discuss] [PATCH v4] Make module loopback honor requested latency

Thu Feb 5 23:17:41 PST 2015

First of all, thanks for a quick and detailed answer.

06.02.2015 02:02, Georg Chini wrote:
> On 05.02.2015 16:59, Alexander E. Patrakov wrote:
>> 01.02.2015 03:43, Georg Chini wrote:
>>> This is the final version of my patch for module-loopback. It is on
>>> top of the
>>> patch I sent about an hour ago and contains a lot more changes than
>>> the previous
>>> versions:
>>>
>>> - Honor specified latency if possible, if not adjust to the lowest
>>> possible value
>>> - Smooth switching from fixed latency to dynamic latency source or
>>> sink and vice versa
>>> - good rate and latency stability, no rate oscillation
>>> - adjusts latency as good as your setup allows
>>> - fast regulation of latency offsets, adjusts 100 ms offset within 22
>>> seconds (adjust
>>>    time=1) to 60 seconds (adjust_time=10)
>>> - usable latency range 4 - 30000 ms
>>> - Avoid rewinds and "cannot peek into queue" messages during startup
>>> and switching
>>> - works with rates between 200 and 190000 Hz
>>> - maximum latency offset after source/sink switch or at startup
>>> around is 200 ms
>>>
>>> I also introduced a new parameter, buffer_latency_msec which can be
>>> used together
>>> with latency_msec. If buffer_latency_msec is specified, the resulting
>>> latency
>>> will be latency_msec + buffer_latency_msec. Latency_msec then refers
>>> only to
>>> the source/sink latency while buffer_latency_msec specifies the
>>> buffer part.
>>> This can be used to save a lot of CPU at low latencies, running 10 ms
>>> latency
>>> with latency_msec=6 buffer_latency_msec=4 gives 8% CPU on my system
>>> compared to
>>> 12% when I only specify latency_msec=10.
>>
>> Is there any more detailed profiling data for it? E.g., have you tried
>> running perf record / perf report?
>>
>
> No, I haven't - I just compared CPU for the current module and my
> patched version. I did not
> change the way audio is processed so I would not expect large
> differences to the current version
> as long as you only use latency_msec.
> CPU usage mainly depends on the latency set for sink and source and
> splitting the latency into
> buffer and source/sink latency aims at that.
> I can provide figures if you would like to see them, but I do not know
> how to use the tools you
> mention above. Is there a HowTo somewhere?

In the kernel source, there is a tools/perf directory. That's where you 
build perf from. It has a Documentation subdirectory, where you can read 
perf-record.txt and perf-report.txt.

You can also read 
https://www.mail-archive.com/pulseaudio-discuss@lists.freedesktop.org/msg11469.html 
, that's how I ran perf on PulseAudio, and some replies in the thread 
contain improvements to my methodology.

>
>> I am not quoting the patch, but my opinion is that it needs either
>> comments or some text in the commit message on the following topics:
>>
>
> Mh, I thought I already put a lot of comments in ...
>
>> Why does it use hand-crafted heuristics instead of a PID-controller?
>> Or, is this "Low pass filtered difference between expectation value
>> and observed latency" a PI controller in disguise?
>>
>
> What do you mean by "handcrafted heuristics"? It's basic math - how many
> samples do I need
> for the requested latency. The equation new_rate = base_rate * (1 +
> latency_difference / adjust_time)
> is a simple P-regulator without I or D part, rate difference is
> proportional to latency difference.

Yes, it is a P regulator (and you address the "heuristics" below in your 
email as "additional restrictions", sorry for not noticing the regulator 
behind them), and your explanation for choosing the coefficient is 
quoted below.

> There were two options - either to get the latency right as quickly as
> possible - that would be
> base_rate * (1 + latency_difference / (adjust_time + latency)) - or to
> get the fastest convergence
> to the latency at the expected rate. I choose the second option.

With latency << adjust_time, I do not treat these two possibilities as 
significantly different options.

The module makes the decision about the new rate once in adjust_time 
seconds. The equation that you quoted (either one) means that you aim 
for the latency difference to be fully corrected just at the next time 
your module makes a decision. I am not 100% sure that it is the optimal 
strategy, especially since in the text below you talk about 
too-sensitive controller and even instability/oscillation.

> I experimented with adding I- or D-parts but did not see any significant
> improvement (maybe
> because of incorrect parameter values).

I will try to look into this on Sunday, then. But this first means 
getting the P-part right.

Here is what I think here, but not yet 100% sure of.

If you use this (obviously wrong) equation:

new_rate = base_rate * (1 + 2 * latency_difference / adjust_time)

then it will overcorrect the latency difference, so that it becomes 
exactly the negatine of what it was before. And so on and so on, i.e. 
there will be oscillations. With values of the factor before 
latency_difference between 1 and 2, overcorrection will also happen, but 
its absolute value will be reduced each time. So, 2 seems to be the 
critical value that leads to instability, and the period of oscillation 
is 2 * adjust_time.

So, according to the Ziegler-Nichols method, for a P-only controller, 
you have indeed chosen the correct value of the P term (i.e. half of 
what would cause self-oscillation). But it is not correct for a general 
PID controller. See 
https://en.wikipedia.org/wiki/Ziegler%E2%80%93Nichols_method - for a PID 
controller with no overshoot, the P term would be:

base_rate * (1 + 0.4 * latency_difference / adjust_time)

>
> There are however two additional restrictions:
> 1) Do not deviate too much (I choose 0.75%) from the base rate
> 2) Do the adjustment in small amounts at a time
> The first restriction was solved by introducing min_cycles, which makes
> sure the rate cannot
> deviate more than 0.75% from the base rate. It also produces some
> dampening of the P-regulator
> and smooths out the steps produced by the second restriction (at least
> when you are approaching
> the base rate).
> For 2) I just copied what was already in the current code.

OK, understood.

>
> And then I was looking for the right criterion when  to stop regulation.
> When is the latency
> "good enough"? Using some percentage of the latency will not work well
> when you have a
> latency range of 4 - 30000 ms to cover (at varying rates). Just cutting
> off when you are a few
> Hz away from the base rate like the current code does is also not a good
> idea for the same
> reason (100 - 190000 Hz).
> The best you can get is an error of adjust_time / (2 * base_rate). But
> this does not work either
> because the latency measurement itself has some error and there is some
> jitter in the latency.
> This easily leads to instable rates or oscillation because the
> controller follows every tiny change.

Looks like a complaint about a badly tuned controller, or with the 
additional restrictions interfering with the stability. My own opinion 
is that, with a properly tuned controller, and with restriction (2) 
removed or relaxed, a stop criterion should not be needed at all. I'll 
test the theory on Sunday.

-- 
Alexander E. Patrakov