[pulseaudio-discuss] Resampler quality evaluation: now on music files
Alexander E. Patrakov
patrakov at gmail.com
Sat Oct 4 22:48:20 PDT 2014
[tl;dr: speex-float-1 is adequate for 44100 -> 48000 Hz resampling,
ffmpeg also is, speex-float-0 isn't]
Previously, I have posted some quality-evaluation results for resamplers
that can be used by PulseAudio:
http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-August/021362.html
http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-September/021811.html
The main objections were:
1. Unbearably loud (92 dB SPL) sound from speakers or headphones. People
don't listen at such levels. At lower levels, the distortions also have
lower sound pressure, and may become unnoticeable.
2. Absolutely quiet room (except for this sound and resampler
distortions). In a noisy room, noise can mask ("outvoice") the distortion.
3. Perfect speakers or headphones that don't distort sounds at all by
themselves. Maybe headphone distortions can mask resampler distortions?
4. Sine wave (and not music or speech) as a test sound to be distorted
by a resampler. Maybe other frequency components can mask resampler
distortions?
As it turns out, (4) is a very valid point, the most valid of all four.
In fact, in the vast majority of music files, the extra components
of the signal are strong enough to mask the distortions of speex-float-1
even without taking other points into account. Still, I have a script
that takes (1), (2) and (4) into account, and you can run it on your own
music files. As I already explained in the previous email, there is no
plan to account for (3).
git clone git://gitorious.org/psy-eval/psy-eval.git
You will need python2.7, numpy, scipy, matplotlib, and also ffmpeg (or
possibly libav).
You also need a wav file with the resampler response, and, optionally, a
recording of room noise, also as a 16-bit uncompressed wav file. See
http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-September/021811.html
for how to obtain these wav files, or use pre-generated ones:
https://yadi.sk/d/RzV7JGAxbfUve (the same archive as used in the
previous email)
So, the new script is ./music_distortions.py, and it takes the
following arguments:
--resampler-response: the wav file with resampler response to a linear
frequency sweep. You can use "speex-float-1.wav".
--rate-from: the sample rate that the sine sweep was resampled from. For
files in my archive, that's 44100.
--skip: if the resampler response contains junk in the beginning, use
this to skip a specified number of samples.
--fftsize: the FFT size, at the target sample rate. Useful values are
1024 - 8192.
--noise-file: wav file with room noise. Optional.
--noise-full-scale: if you recorded room noise with a calibrated
microphone and sound card, then you know the dB SPL value corresponding
to a full-scale sine wave. Put it here. The default is 92, but you need
84 in order to use the noise file from the archive.
--noise-dba: if you have a noise meter instead, put its reading (with
the "A" setting) here. If you have neither a calibrated microphone nor a
noise meter, but want to use your own noise file, put 35 here.
--signal-full-scale: If you know the sound pressure level corresponding
to the full-scale sine wave at your soundcard output, put it here, in
dB. The default is 92.
--use-eq: Use this switch to ignore the fact that the resampler
attenuates high frequencies (a distortion that a human can notice if
he/she knows that these frequencies should be there).
--save: if you want to save the plot, put a prefix of its name here.
_audibility_vs_time.png will be appended. If you don't specify this, the
plot will be shown instead.
--report-only: don't plot anything, just report the average distortion,
the maximum distortion, and where it happens.
Finally, specify the music file name. That file should be in any format
supported by ffmpeg, and should have the same sample rate as --rate-from
says. Only the front left channel will be taken into account.
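To make the meaning of --signal-full-scale more concrete, here is a minimal sketch of the arithmetic behind it. It is not taken from music_distortions.py; the function names are hypothetical, and it only assumes that a full-scale sine wave (peak amplitude 1.0) corresponds to the given dB SPL value.

```python
import math


def amplitude_to_dbfs(peak_amplitude):
    """Level of a sine wave relative to full scale (peak amplitude 1.0), in dB."""
    return 20.0 * math.log10(peak_amplitude)


def dbfs_to_db_spl(level_dbfs, full_scale_db_spl=92.0):
    """Map a level in dBFS to dB SPL, given the SPL of a full-scale sine.

    full_scale_db_spl plays the role of --signal-full-scale (default 92).
    """
    return full_scale_db_spl + level_dbfs


if __name__ == "__main__":
    level = amplitude_to_dbfs(0.1)  # a sine at one tenth of full scale
    print("level: %.1f dBFS" % level)
    print("at full scale 92: %.1f dB SPL" % dbfs_to_db_spl(level, 92.0))
    print("at full scale 72: %.1f dB SPL" % dbfs_to_db_spl(level, 72.0))
```

In other words, going from 92 to 72 shifts the signal and all resampler distortions down by 20 dB, while the room noise and the threshold of hearing stay where they are.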
E.g.:
./music_distortions.py --signal-full-scale 72 --fftsize 1024 \
    --resampler-response speex-float-0.wav --rate-from 44100 --skip 65536 \
    --save Prelude Prelude.wav
produces (together with some warnings):
"Prelude.wav", average distortion = -8.8 dB, maximum = -2.2 dB, at 4:33
and the attached plot. If the curve is below 0 dB, an average human
cannot notice the distortions. If it is above, then the distortion can
be noticed, provided that the subject knows how the file should sound
with the ideal resampler.
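If you run the script with --report-only over many files, the report lines can be checked automatically. The sketch below is a hypothetical helper, not part of psy-eval; it assumes that every report line has exactly the same layout as the single example quoted above, and it applies the 0 dB rule to the maximum figure.

```python
import re

# Pattern derived from the one example line in this email (an assumption:
# the real script may format other cases differently).
REPORT_RE = re.compile(
    r'^"(?P<name>.*)", average distortion = (?P<avg>-?\d+(?:\.\d+)?) dB, '
    r'maximum = (?P<max>-?\d+(?:\.\d+)?) dB, at (?P<min>\d+):(?P<sec>\d+)$'
)


def parse_report(line):
    """Parse one report line into a dict, or return None if it doesn't match."""
    m = REPORT_RE.match(line.strip())
    if m is None:
        return None
    maximum = float(m.group("max"))
    return {
        "name": m.group("name"),
        "average_db": float(m.group("avg")),
        "maximum_db": maximum,
        "position_s": int(m.group("min")) * 60 + int(m.group("sec")),
        # Above 0 dB the distortion can be noticed by an average human.
        "noticeable": maximum > 0.0,
    }


if __name__ == "__main__":
    example = ('"Prelude.wav", average distortion = -8.8 dB, '
               'maximum = -2.2 dB, at 4:33')
    print(parse_report(example))
```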
I do have some music files where the script at its default settings
finds speex-float-1 marginally adequate (i.e. maximum audibility of
distortions is close to 0 dB), or even not adequate with non-default FFT
size (2048 or 4096) [*]. In all such (rare) cases, --signal-full-scale
72 removes the complaint. Probably that's because the complaint is
really about some nearly-ultrasonic frequency component that got
rejected by the resampler in the first case and sank below the absolute
threshold of hearing when the volume was reduced in the second case.
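To illustrate this guess, here is a rough sketch that uses Terhardt's well-known approximation of the absolute threshold of hearing. This is an assumption made for illustration: the script may model the threshold differently, and the 16 kHz, -20 dBFS component is a made-up example, not a measurement from the albums below.

```python
import math


def threshold_in_quiet_db_spl(freq_hz):
    """Terhardt's approximation of the absolute threshold of hearing (dB SPL)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_audible_in_quiet(freq_hz, level_dbfs, full_scale_db_spl):
    """True if a tone at level_dbfs is above the threshold in quiet.

    full_scale_db_spl plays the role of --signal-full-scale.
    """
    return full_scale_db_spl + level_dbfs > threshold_in_quiet_db_spl(freq_hz)


if __name__ == "__main__":
    freq, level = 16000.0, -20.0  # hypothetical nearly-ultrasonic component
    print("threshold at 16 kHz: %.1f dB SPL" % threshold_in_quiet_db_spl(freq))
    for full_scale in (92.0, 72.0):
        print("full scale %.0f dB SPL: audible = %s"
              % (full_scale, is_audible_in_quiet(freq, level, full_scale)))
```

With these numbers the component sits above the threshold at the default level and below it after the 20 dB reduction, so its removal by the resampler stops counting as a noticeable distortion.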
For those who want to test, here are the affected New Age albums:
Ryan Farish - Everlasting
Australis - The Gates of Reality
Daveed - Songs From Beyond
Interestingly, the "average" figure is worse on speech material (such as
foreign language courses) than on music.
[*] The FFT size dependency is, strictly speaking, a bug. This is
probably related to the use of a narrow (low-noise) window without
sufficient overlap, so the bad fragment just slips through the gap
between the two neighboring positions of the 1024-sample window. Still,
the average figure is stable when changing the FFT size.
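The "slips through the gap" effect can be shown with a small sketch. It assumes a Hann window and compares frames without overlap to frames with 50% overlap; the window and hop actually used by the script are not stated here, so treat both as assumptions.

```python
import math


def hann(n, size):
    """Periodic Hann window value at index n (0 <= n < size)."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * n / size)


def best_window_weight(sample_index, size, hop, total_len):
    """Largest window weight that any analysis frame gives to one sample."""
    best = 0.0
    for start in range(0, total_len - size + 1, hop):
        if start <= sample_index < start + size:
            best = max(best, hann(sample_index - start, size))
    return best


if __name__ == "__main__":
    size, total_len = 1024, 4096
    boundary = 1024  # a short event exactly between two adjacent frames
    print("no overlap :", best_window_weight(boundary, size, size, total_len))
    print("50% overlap:", best_window_weight(boundary, size, size // 2, total_len))
```

Without overlap, an event at the frame boundary is weighted by (almost) zero in every frame, so it never shows up in the analysis. With 50% overlap, some frame always sees it at full weight.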
P.S. Tomorrow I have a flight to France (due to XDC 2014), so I won't be
able to answer your questions quickly.
--
Alexander E. Patrakov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Prelude_audibility_vs_time.png
Type: image/png
Size: 51902 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pulseaudio-discuss/attachments/20141005/5fb37801/attachment-0001.png>