[pulseaudio-discuss] [PATCH 0/4] Add support for libsoxr resampler

David Henningsson david.henningsson at canonical.com
Thu Nov 13 23:26:18 PST 2014



On 2014-11-13 23:49, Andrey Semashev wrote:
> On Thursday 13 November 2014 11:32:09 you wrote:
>> On Thu, Nov 13, 2014 at 8:33 AM, David Henningsson
>>
>> <david.henningsson at canonical.com> wrote:
>>> On 2014-11-11 22:39, Andrey Semashev wrote:
>>>> In short, libsoxr is almost always faster than speex, and introduces much
>>>> less distortion. Its passband frequency is slightly lower than speex's,
>>>> though, and it can add a delay of up to 20 ms in some cases.
>>>
>>> I'm interested in knowing more about the delay. What are "some cases"?
>>
>> "Some cases" means some sample rate combinations. In my tests I
>> measured the delay of the resampler, and it was ~20 ms max. I don't
>> have the results accessible now, I'll add them tonight to the results
>> page.
>
> Ok, here are some interesting results.

Cool, thanks for the testing!

> 1. The delay does not depend on the input format (int16 vs float) or content.
> I tried two different pieces of input content.
>
> 2. The delay _does_ depend on the input frame size (i.e. the amount of input
> samples you pass to the resampler in one chunk). I tested for frame sizes of
> 20 and 100 samples per channel. There isn't a particularly obvious relation
> between the frame size and the delay.
>
> 3. The delay is typically lower for low quality presets (-lq, -mq), but that's
> not always the case.
>
> 4. The delay is typically lower for 2-fold sample rate conversions (i.e. 48kHz
> <-> 96kHz).
>
> 5. The delay varies over a wide range between different sample rate
> combinations. The quality presets, on the other hand, differ less from one
> another: all three presets (-mq, -hq and -vhq) have cases of 20 ms delay
> as well as cases of <5 ms.
>
> 6. With frame size 20, the min/max delay values are:
>
>     -mq: 1.917/20.604 ms
>     -hq: 2.708/20.000 ms
>     -vhq: 4.208/20.000 ms
>
>    In the case of 44.1 kHz, 16 bit, int -> 48 kHz, it is 20.604/20.000/20.000 ms
> for the three presets.
>
> 7. With frame size 100, the min/max delay values are:
>
>     -mq: 2.771/12.336 ms
>     -hq: 7.104/16.531 ms
>     -vhq: 5.250/27.256 ms
>
>    In the case of 44.1 kHz, 16 bit, int -> 48 kHz, it is 2.771/7.104/15.292 ms
> for the three presets.
>
>    Note that 27.256 for -vhq in this case is larger than I stated in my initial
> announcement and the docs patch. I had not tested frame size 100 at that time.
> I will update the docs patch accordingly (probably, by describing the delay
> range more loosely).
>
> I do not have an explanation for such a wide range of delay values, or for
> their dependence on the frame size. It doesn't look like the filter is
> "learning" from the input in some way, since the delay doesn't depend on the
> content. Perhaps there is some extensive buffering in the implementation.

Well, the delay must be constant given the parameters. If the delay were
varying during playback, that would probably cause very interesting
sound effects, such as music being slightly out of tempo or so...

What does vary during playback, however, is the size of the chunks we pass
into the resampler on each call. Which raises the question of whether it is
the first chunk that determines the delay, or...?
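
For a sense of scale, here is a rough sketch (plain Python, not PulseAudio
code) that converts between sample counts and milliseconds. It assumes the
frame sizes of 20 and 100 samples were measured at the 44.1 kHz input rate
and that a 20 ms delay is expressed at the 48 kHz output rate; the results
quoted above do not state either, so treat both rates as assumptions.

```python
def samples_to_ms(samples, rate_hz):
    """Duration in milliseconds of `samples` frames at `rate_hz`."""
    return 1000.0 * samples / rate_hz


def ms_to_samples(ms, rate_hz):
    """Number of frames (possibly fractional) covering `ms` at `rate_hz`."""
    return ms * rate_hz / 1000.0


# Assumed rates (not stated in the thread): 44.1 kHz input, 48 kHz output.
frame_20_ms = samples_to_ms(20, 44100)      # duration of a 20-sample chunk
frame_100_ms = samples_to_ms(100, 44100)    # duration of a 100-sample chunk
delay_samples = ms_to_samples(20.0, 48000)  # a 20 ms delay in output samples

print(f"20 samples  @ 44.1 kHz = {frame_20_ms:.3f} ms")
print(f"100 samples @ 44.1 kHz = {frame_100_ms:.3f} ms")
print(f"20 ms       @ 48 kHz   = {delay_samples:.0f} samples")
```

Under those assumptions both chunk sizes are much shorter than the largest
reported delays, so the delay spans many chunks either way.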

> I can try to perform more tests with different frame sizes in an attempt to
> determine the approximate maximum delay. I suspect that even after such
> testing I won't be absolutely sure that the discovered upper value won't
> ever be exceeded in some other case I did not cover. I can, however, test with
> the frame size that is used in PulseAudio, if such a fixed or typical value
> exists (does it?).
>
> For now, the bottom line is that the exact delay of the resampler is difficult
> to predict, although it usually does not exceed 20 ms, except in some rare
> cases and with -vhq. When delay is critical, it is better to use another
> resampler, such as speex-5, which consistently stays below 1 ms across the
> board. But I think such cases are quite specialized, and soxr is still very
> well suited to general use.

Well, what is "quite specialized" and "general use"? If you use your 
computer primarily for gaming and VOIP, then that's what you consider 
"general use", and perhaps "listening to music so carefully that you 
hear the difference between different resamplers" is what you consider 
"quite specialized"...

So if it was up to me, I'd say let's keep speex-float-1 as the default, 
as it seems to give the best balance between quality, CPU power, and low 
latency.
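
For reference, the resampler is selected with the resample-method option in
daemon.conf. A minimal fragment that pins the method discussed here might
look like the following; the file location in the comment is the usual
per-user path and is an assumption about your setup.

```
; ~/.config/pulse/daemon.conf (or /etc/pulse/daemon.conf system-wide)
resample-method = speex-float-1
```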

With my upstream hat on, I don't mind adding soxr as an option, and with 
my distro hat on, I'm always worried about adding new dependencies...

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic

