[pulseaudio-discuss] [PATCH 04/13] loopback: Adjust rates based on latency difference

Wed Nov 11 11:30:22 PST 2015

11.11.2015 23:36, Tanu Kaskinen wrote:
> Sorry for being obtuse, but I don't follow what this simple bit of code
> is doing. You mentioned "P-controller" and the "Ziegler-Nichols
> method". I followed the Wikipedia link, and found that a P-controller
> is a very simple thing:

Actually, it was me who mentioned those, when splitting the patch :)

>
> u(t) = Kp * e(t)
>
> where
>
> u(t): the new control variable value (the new sink input rate)

No. The control variable is new_rate - base_rate.

>
> Kp: a tunable parameter (a magic number)
>
> e(t): the error value, i.e. the difference between the current process
> variable value and the target value (current latency minus configured
> latency)

Correct.

> The Ziegler-Nichols method can be used to choose Kp. For a P-controller
> Kp is defined as
>
> Kp = 0.5 * Ku
>
> where
>
> Ku: a number that, when used in place of Kp, makes u(t) oscillate in a
> stable manner

See below, I'll comment on that.

>
> (A sidenote: I probably have understood something wrong, because Kp is
> a plain number, and u(t) and e(t) have different units, so there
> appears to be a unit mismatch. u(t) is a frequency and e(t) is a time
> amount.)

Kp is not a plain number. It has the unit necessary to convert from the 
unit of error value to the unit of the control variable.

>
> Figuring out Ku seems to require having an initial calibration phase
> where various Ku values are tried and the oscillation of u(t) is
> measured. The code doesn't seem to do this. Could you explain how you
> have derived the formula in rate_controller()?
>

The formula is indeed not the most obvious one. Georg and I have
exchanged some emails about it. If he permits, I can forward his email
with the derivation of the non-linear part, written on paper and
scanned. But to answer the question about the "optimal tuning" in the
sense of the Ziegler-Nichols method, we only need to talk about the
approximation that is linear in latency_difference_usec, that is, set
min_cycles to 1.0.

So:

new_rate = base_rate * (1.0 + latency_difference_usec / adjust_time)

I.e. here Kp = 1 / adjust_time (taking the relative rate deviation,
(new_rate - base_rate) / base_rate, as the controller output), that's
all.
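
To make the linearised formula concrete, here is a minimal sketch in
Python. It is not the actual rate_controller() code from the patch: the
function name linear_new_rate and the parameter name adjust_time_usec
are made up for illustration, and adjust_time is assumed to be given in
the same unit (microseconds) as the latency difference, so that their
ratio is dimensionless.

```python
def linear_new_rate(base_rate, latency_difference_usec, adjust_time_usec):
    """Linear (P-controller) approximation, i.e. min_cycles set to 1.0.

    Hypothetical helper for illustration only; assumes adjust_time is
    expressed in microseconds, like the latency difference.
    """
    kp = 1.0 / adjust_time_usec          # proportional gain, unit 1/usec
    relative_correction = kp * latency_difference_usec
    return base_rate * (1.0 + relative_correction)


if __name__ == "__main__":
    # 10 ms too much latency, 10 s adjust time, 48 kHz nominal rate:
    # the rate is raised by roughly 0.1 percent.
    print(linear_new_rate(48000.0, 10_000.0, 10_000_000.0))
```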

Assume that the correct rate is the nominal one (i.e. base_rate). This
is a crude approximation, but good enough for evaluating stability.
Then the latency difference changes at a rate of exactly
(base_rate - new_rate) / base_rate. Indeed, in one second of input
time, base_rate samples are pushed into the queue, but only new_rate
samples are pulled from it. So, each second, the queue grows by
base_rate - new_rate samples. Measured at base_rate, that is
(base_rate - new_rate) / base_rate seconds per second.

Now note that the latency difference will be evaluated again after
adjust_time. With Kp = 1 / adjust_time, the correction accumulated over
that interval equals the latency difference itself. So, if we used
Kp = 2 / adjust_time instead, see what happens: by the time we look
again, the latency difference will be overcorrected by a factor of 2,
i.e. it changes sign while keeping its magnitude. Then the rate
controller will try to correct that again, and will again overshoot by
a factor of 2, i.e. the difference returns to its original value. In
other words, it will exhibit oscillations with constant amplitude -
exactly what the Ziegler-Nichols method calls for when calibrating. We
actually use Kp = 1 / adjust_time, i.e. half of the critical value,
which is exactly what the Ziegler-Nichols method prescribes.

Note: I did not say that following this method is good for our purposes.
The PID controller recommended in these papers (and used in Jack) is not
optimal in the sense of the Ziegler-Nichols method:

http://kokkinizita.linuxaudio.org/papers/usingdll.pdf
http://kokkinizita.linuxaudio.org/papers/adapt-resamp.pdf

-- 
Alexander E. Patrakov