[pulseaudio-discuss] Resampler quality evaluation results
David Henningsson
david.henningsson at canonical.com
Tue Sep 2 01:16:40 PDT 2014
On 2014-08-24 20:53, Alexander E. Patrakov wrote:
> I have finished the first stage of my work on resampler quality evaluation.
>
> The scripts are here: https://gitorious.org/psy-eval/psy-eval/
> The results are here: https://imgur.com/a/jtIEj
>
> Note: they are valid only for 44100 -> 48000 Hz resampling. But that's
> the common case.
>
> TL;DR summary: it makes sense to change the default resampler quality
> from the current "speex-float-1" value to "speex-float-3" or even
> "speex-float-5" on capable machines, otherwise the distortion is
> sometimes noticeable. And, speex-float-{3,5} are similar to what
> proprietary OSes offer.
Hi,
Indeed interesting work, but I have a few concerns about that conclusion...
> The work is based on the question: does a human listener notice the
> distortion introduced by a resampler? To answer that, I used a
> psychoacoustical model publicly available at the following URL:
>
> http://www.mp3-tech.org/programmer/docs/6_Heusdens.pdf
(cut)
> Under that definition, the plots that say "Limited bandwidth counts as
> distortion" below them were made. They display audibility of all
> distortions, as defined above, as a function of the input sine wave
> frequency, for a selection of resamplers. The sine wave is assumed to be
> at the full amplitude, which corresponds (as it is a common convention
> in psychoacoustical models) to 92 dB SPL. Note: do not listen at this
> volume. It is harmful. But it is also the worst case for the
> psychoacoustical model.
I'm trying to understand the diagrams here. They are based on a sine wave
being played at 92 dB SPL, which is too loud for the human ear. At that
level, we get distortion of 15 dB (on average) for the trivial
resampler, i.e., the distortion sits around -77 dB relative to the
signal (an S/N of about 77 dB). Is this correct?
Now consider this:
1) The theoretical limit for the human ear is 0 dB. In practice, it is
more around 10 - 20 dB.
2) As you say, 92 dB is too high for normal listening. Say 80 dB, which
is still louder than one would typically listen to music for longer
periods of time.
3) Now add to that the distortion of normal laptop speakers, headphones
etc. It would be interesting to have that too in the diagram as a reference.
I.e., the usable hearing range becomes 80 - 15 = 65 dB, and the trivial
resampler's distortion, at -77 dB, falls below it.
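To make the arithmetic above explicit, here is a minimal Python sketch.
The constant and function names are mine and purely illustrative. The
15 dB hearing floor is just the middle of the 10 - 20 dB range, and the
idea that the distortion stays 77 dB below the signal when the volume is
lowered is a simplifying assumption, not something the plots show.

```python
# Rough figures taken from this mail; none of them are measurements.
FULL_SCALE_SPL_DB = 92.0     # level assumed for a full-amplitude sine in the plots
DISTORTION_SPL_DB = 15.0     # approximate average reading for the trivial resampler
LISTENING_SPL_DB = 80.0      # loud, but more realistic than 92 dB SPL
HEARING_FLOOR_SPL_DB = 15.0  # assumed practical threshold (middle of 10 - 20 dB)


def relative_level_db(distortion_spl_db, signal_spl_db):
    """Distortion level relative to the signal, in dB (negative = below it)."""
    return distortion_spl_db - signal_spl_db


def usable_range_db(listening_spl_db, floor_spl_db):
    """Span between the listening level and the hearing floor, in dB."""
    return listening_spl_db - floor_spl_db


distortion_rel = relative_level_db(DISTORTION_SPL_DB, FULL_SCALE_SPL_DB)
usable = usable_range_db(LISTENING_SPL_DB, HEARING_FLOOR_SPL_DB)

# Assumption: the distortion keeps the same distance to the signal
# when the playback level is reduced from 92 to 80 dB SPL.
distortion_at_listening = LISTENING_SPL_DB + distortion_rel
audible = distortion_at_listening > HEARING_FLOOR_SPL_DB

print("distortion relative to signal:", distortion_rel, "dB")
print("usable hearing range:", usable, "dB")
print("distortion at listening level:", distortion_at_listening, "dB SPL")
print("above the assumed hearing floor:", audible)
```

With these numbers the distortion ends up at 3 dB SPL, i.e. below the
assumed 15 dB floor, which is the whole point of the estimate.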
So given your diagrams, you could just as well argue that one could
switch to the trivial resampler, because you can't hear the distortion
from it anyway. Now I'm not actually saying we should do that, just
saying that maybe we shouldn't jump so quickly to the conclusion that we
need to switch to something with higher quality.
(Btw, maybe a log scale for frequency would have been fairer, given
how we perceive sounds?)
> Also, audibility of the distortions inherent in a TPDF-dithered 16-bit
> input is shown as "quantization noise" on the same plots. As you see,
> 16-bit input and TPDF dithering do not result in audible distortions.
I also see that speex-float-1 manages to have lower distortion than the
16-bit dithering noise at some frequencies. Is this an error in the
diagram?
> It's quite sad that the current default in PulseAudio was influenced by
> the needs of low-power embedded devices at the measurable expense of the
> sound quality on the typical desktop. Now, with plots, figures and
> knowledge in hand, we can fix it.
Well, I'm not sure the typical desktop is that typical anymore. Laptops
are more common than desktops, and phones are more common than laptops.
The average user might be more concerned about laptop battery life than
about having resampling without artifacts, especially if those artifacts
cannot be heard anyway due to low quality laptop speakers.
So, your conclusion to switch to a higher quality resampler seems to
rest on a few assumptions about the environment in terms of perfect ears,
equipment, space, power supply and so on. The other extreme is a lo-fi
laptop speaker on battery, listened to by an ear with tinnitus, in a
noisy room.
We'll need to end up with a compromise between these two extremes, maybe
somewhere around our current default of speex-float-1, which nobody or
very few people have complaints about (and those who have, are those who
are interested in tuning their system to the highest quality, which
could include switching our default resampler).
--
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic