[pulseaudio-discuss] [PATCH 0/4] Add support for libsoxr resampler

Sun Nov 16 07:42:02 PST 2014

On Friday 14 November 2014 11:37:07 you wrote:
> On Friday 14 November 2014 08:26:18 David Henningsson wrote:
> > On 2014-11-13 23:49, Andrey Semashev wrote:
> > > I do not have an explanation for such diverse range of the delay value,
> > > and
> > > its dependency on the frame size. It doesn't look like the filter is
> > > "learning" from the input in some way since the delay doesn't depend on
> > > the
> > > content. Perhaps there is some extensive buffering in the
> > > implementation.
> > 
> > Well, the delay must be constant given the parameters. If the delay was
> > varying during playback, that would probably cause very interesting
> > sound effects, such as music being slightly out of tempo or so...
> 
> I've been using soxr-vhq with PA 4.0 on my working machine for about a month
> now, and never heard any sound artefacts.

I ran the test again, printing the delay for each audio chunk. With a constant
frame size the delay remains pretty much constant. In ~3.5 minutes of 44.1 kHz
input fed in 20 ms frames there was one output chunk (the 65th, 1300 ms from
the start) of 953 samples instead of the usual 960, which increased the delay
from 20.000 ms to 20.146 ms. The delay then stayed constant until the very end,
when the resampler was flushed. The target sample rate was 48 kHz.
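
The 20.146 ms figure can be checked with simple arithmetic. The sketch below is
only an illustration of that calculation, not part of the test program; the
function name and constants are made up for this example.

```python
# Illustration only: reproduces the delay arithmetic quoted above.
IN_RATE = 44_100    # input sample rate, Hz
OUT_RATE = 48_000   # target sample rate, Hz
FRAME_MS = 20       # constant input frame duration, ms

in_samples = IN_RATE * FRAME_MS // 1000     # samples per input frame
nominal_out = OUT_RATE * FRAME_MS // 1000   # usual samples per output chunk


def delay_after_short_chunk(prev_delay_ms, actual_out):
    """Delay after one output chunk comes out shorter than nominal.

    Every output sample that is withheld stays inside the resampler,
    so the delay grows by missing / OUT_RATE seconds.
    """
    missing = nominal_out - actual_out
    return prev_delay_ms + missing * 1000 / OUT_RATE


chunk_index = 65
chunk_end_ms = chunk_index * FRAME_MS       # position of the short chunk
new_delay = delay_after_short_chunk(20.0, 953)

print(in_samples, nominal_out, chunk_end_ms)
print(f"{new_delay:.3f} ms")
```

The 7 missing samples at 48 kHz account for the extra ~0.146 ms.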

> > What does vary during playback, however, is how big chunks we pass into
> > the resampler in every go. Which begs the question if it is the first
> > chunk that determines the delay, or...?
> 
> So PA uses variable frame size? I can try to modify the test for that. Are
> there any reasonable limits of the frame size?

With a variable frame size (I tried random values between 20 and 100 ms) the
delay also varies a lot and can be anywhere between 3 and 20 ms. The resampler
fills or flushes its buffers as it sees fit in response to changes in the input
frame size. There doesn't seem to be any particular pattern in the delay
variation (i.e. it looks as random as the input frame size).
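
For illustration, randomly sized input frames of this kind could be generated
roughly as follows. This is a hypothetical sketch and not the actual test code;
the function and parameter names (random_frames, min_ms, max_ms) are
assumptions made for this example.

```python
import random


def random_frames(total_samples, rate, min_ms, max_ms, seed=0):
    """Yield frame sizes (in samples) that split a stream of total_samples.

    Each frame lasts between min_ms and max_ms (inclusive), except the
    final frame, which may be shorter because it takes whatever is left.
    """
    rng = random.Random(seed)
    min_n = rate * min_ms // 1000
    max_n = rate * max_ms // 1000
    remaining = total_samples
    while remaining > 0:
        n = min(rng.randint(min_n, max_n), remaining)
        yield n
        remaining -= n


RATE = 44_100
frames = list(random_frames(RATE * 10, RATE, 20, 100))

# The frame sizes are random, but the total amount of audio is unchanged.
print(len(frames), sum(frames) / RATE)
```

However the stream is split, the amount of audio fed to the resampler stays
the same; only the packetization changes.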

The resulting audio did not have any artefacts or audible quality degradation.
I believe this is expected in a real-time application as well, since you
would expect the audio data rate to be more or less constant even if it is
packetized in randomly sized frames. The frames at the resampler output are
as randomly sized as those at its input, so if PA currently handles variable
frame sizes without artefacts, it should handle soxr as well.

The artefacts you described could probably appear in case of large jitter
that exceeds the buffering capacity of PA or the sound card. Soxr itself does
not increase jitter if the input signal jitter is higher than the resampler
delay jitter. Conversely, if PA uses variable frame sizes and the input jitter
is less than ~20 ms, the jitter of the resampled signal can increase. How large
the increase would be is difficult to say, but I think the upper limit is the
range of the resampler delay variations.

I ran a few more tests to see how much the delay changes depending on the 
frame size variations:

 - frame sizes 20-25 ms -> delay 7-20 ms
 - frame sizes 20-30 ms -> delay 6-20 ms
 - frame sizes 20-40 ms -> delay 3-20 ms
 - frame sizes 20-50 ms -> delay 3-20 ms

So it seems that variable frame sizes can increase jitter to ~17 ms (if the
input signal has lower jitter). That, of course, applies to the parameters I
used in my test (44.1 -> 48 kHz).
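
The ~17 ms figure is simply the widest spread between the minimum and maximum
delay in the measurements listed above. A small sketch of that calculation
follows (the variable names are made up for this example; the numbers are the
ones from my test runs):

```python
# Frame size range (ms) -> observed delay range (ms), from the list above.
observed = {
    (20, 25): (7, 20),
    (20, 30): (6, 20),
    (20, 40): (3, 20),
    (20, 50): (3, 20),
}


def delay_spread(delay_range):
    """Width of the delay range, i.e. the most the delay was seen to vary."""
    low, high = delay_range
    return high - low


spreads = {frames: delay_spread(delays) for frames, delays in observed.items()}
worst_case = max(spreads.values())

for (fmin, fmax), spread in spreads.items():
    print(f"frames {fmin}-{fmax} ms: delay varies by {spread} ms")
print(f"worst case: {worst_case} ms")
```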

As before, the test code is in git:

https://github.com/Lastique/src_test

You will have to modify the code (the min_frame_duration and
max_frame_duration constants) and recompile to see the results for a different
range of frame sizes.

PS: All of the above tests were performed with soxr-vhq, since it has the
largest delay values.