[pulseaudio-discuss] Resampler quality evaluation results

Sun Sep 14 06:34:54 PDT 2014

07.09.2014 17:04, Laurențiu Nicola wrote:
> Great, thanks!
>
> On Sun, Sep 7, 2014, at 14:02, Alexander E. Patrakov wrote:
>> 07.09.2014 16:58, Laurențiu Nicola wrote:
>>> I have a question related to your tests. In my application, I need
>>> resampling between close rates (let's say from 44200 to 44100). Do you
>>> feel that the results would basically be the same in this kind of
>>> situation?
>>>
>>> Thanks,
>>> Laurentiu Nicola
>>
>> I have not tested. I will write instructions on the next week, so that
>> you can test yourself every situation that you want to.

Here are the instructions. Sorry for the delay.

1. Choose a sample rate that you will be resampling from. In your case, 
this is 44200 Hz. Generate a wav file with this sample rate, containing 
a linear sweep:

./wavegen.py --rate 44200 --length $(( 1024 * 1024 )) --amplitude 0.99 \
    --format s16 --padding 65536 44200.wav

The file must be long enough that, for every frequency bin of the FFT 
used in the next steps, it contains a piece of sufficient-for-FFT length 
(or ideally several such pieces) with only that frequency. The amplitude 
is set to 0.99 to avoid accidental clipping by the resampler. The 
padding is unfortunately needed because the wav file recorded from the 
resampler in the next step contains unwanted clicks, for a reason I have 
not identified.
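
For readers without wavegen.py at hand, here is a minimal sketch of what 
such a padded linear sweep could look like. It is not the wavegen.py 
code. The function names are made up for illustration, and it assumes a 
mono file, a sweep from 0 Hz to half the sample rate, and padding on 
both sides of the sweep.

```python
import math
import struct
import wave


def linear_sweep(rate, length, amplitude=0.99, padding=0):
    """Return s16 samples: silence, a linear sweep, then silence again."""
    f_end = rate / 2.0  # assumption: the sweep ends at the Nyquist frequency
    peak = amplitude * 32767
    sweep = []
    for n in range(length):
        # The instantaneous frequency is f_end * n / length; the phase is
        # 2*pi times its running integral over time.
        phase = math.pi * f_end * n * n / (rate * length)
        sweep.append(int(round(peak * math.sin(phase))))
    silence = [0] * padding  # assumption: same padding before and after
    return silence + sweep + silence


def write_wav(path, rate, samples):
    """Write mono 16-bit PCM samples to a wav file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

Calling write_wav("44200.wav", 44200, linear_sweep(44200, 1024 * 1024, 
0.99, 65536)) would then produce a file with the same parameters as the 
command above, although pure Python is slow for that many samples.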

2. Resample the file. An easy but slow way is to use a null sink running 
at the rate you want to resample to. Here are the commands:

pacmd load-module module-null-sink rate=44100

parec -d null.monitor --fix-rate --file-format=wav 44200_to_44100.wav & 
paplay -d null 44200.wav ; killall parec

Hopefully these commands are obvious.
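
If you want to repeat this for many rates, the same three commands can 
be driven from a script. The following is only a sketch: the function 
names are hypothetical, it assumes the null sink gets the default name 
"null" as in the commands above, and it stops its own recorder process 
instead of running killall.

```python
import subprocess


def build_commands(src_wav, dst_wav, sink_rate):
    """Return the argv lists for loading the sink, recording, and playing."""
    load = ["pacmd", "load-module", "module-null-sink", "rate=%d" % sink_rate]
    record = ["parec", "-d", "null.monitor", "--fix-rate",
              "--file-format=wav", dst_wav]
    play = ["paplay", "-d", "null", src_wav]
    return load, record, play


def resample_via_null_sink(src_wav, dst_wav, sink_rate):
    """Play src_wav into a null sink and record the sink's monitor."""
    load, record, play = build_commands(src_wav, dst_wav, sink_rate)
    subprocess.check_call(load)
    recorder = subprocess.Popen(record)
    try:
        subprocess.check_call(play)
    finally:
        # Stop only the recorder started here, not every running parec.
        recorder.terminate()
        recorder.wait()
```

For the case discussed here, the call would be 
resample_via_null_sink("44200.wav", "44200_to_44100.wav", 44100).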

3. Analyze the result.

./resampler_plots.py --rate-from 44200 --skip 32768 --save plop \
    --fftsize 1024 44200_to_44100.wav

There may be warnings about dropouts. If they are near the end, that's 
OK. There will also be warnings about division by zero. These are caused 
by the masked-out frequency components and can be ignored.

The meaning of parameters: rate-from is the sample rate of the original 
wav file that contained the linear sweep, skip means "skip this number 
of samples from the beginning" (because there is a click). After 
skipping, the analysis process skips further through the silent portion 
of the resampled file and automatically adapts to any unknown slope of 
frequency change.
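
To illustrate the "skip, then skip further through the silence" idea, 
here is a small sketch. It is not the code from resampler_plots.py, 
whose exact method I am not reproducing here. The function name and the 
threshold parameter are assumptions made for the example.

```python
def first_nonsilent(samples, skip, threshold=0):
    """Return the index of the first sample at or after `skip` whose
    magnitude exceeds `threshold`, or None if the rest is silent."""
    for i in range(skip, len(samples)):
        if abs(samples[i]) > threshold:
            return i
    return None
```

With --skip 32768, the search would start at sample 32768, so a click 
before that point is never looked at.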

The "fftsize" parameter, well, sets the FFT size. Useful values start 
from 1024. Below that, the resolution in the low-frequency part of the 
spectrum is not sufficient to determine audibility reliably, because the 
absolute threshold of hearing changes significantly within one frequency 
bin. The FFT size is specified as the number of frequency bins. The 
number of input samples required for each signal piece is twice that, 
i.e. 2048 in our case.
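
The arithmetic behind these numbers can be sketched as follows. This 
assumes that the bins cover the range from 0 Hz to half the sample rate 
of the resampled file (44100 Hz here); the function name is made up for 
the example.

```python
def fft_parameters(output_rate, fftsize):
    """Return (samples_per_piece, bin_width_hz) for `fftsize` bins."""
    samples_per_piece = 2 * fftsize  # twice the number of frequency bins
    # Assumption: the bins span 0 Hz .. output_rate / 2.
    bin_width_hz = output_rate / samples_per_piece
    return samples_per_piece, bin_width_hz


samples, width = fft_parameters(44100, 1024)
print(samples, round(width, 2))  # prints: 2048 21.53
```

So with --fftsize 1024 each bin is about 21.5 Hz wide. Halving the FFT 
size doubles the bin width, which is what hurts at low frequencies.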

The "save" parameter sets a base of all output filenames. So, you'll get:

plop_envelope.png: shows the amplitude of output signal vs frequency if 
the input signal contains only this frequency at the full scale.

plop_response.png: a spectrogram. To read it, select an input frequency 
and take the column of the spectrogram at that position on the X axis. 
The amplitude of each output frequency component is then given by the 
color of the column at the height corresponding to the output frequency. 
E.g., it can be seen that, when given a 5 kHz input signal, the 
src-sinc-fastest resampler also produces some very weak unwanted output 
at 18 kHz.

plop_distortion_eq.png: the same, but with the line representing the 
wanted same-frequency output suppressed. I.e. only distortions, with the 
assumption that wanted-signal attenuation does not count as a distortion.

plop_distortion.png: the same, but with the same line replaced by the 
difference of wanted vs actual same-frequency output. I.e. only 
distortions, and the attenuation of the signal now counts as a distortion.

plop_audibility_eq.png: audibility of distortions (i.e. by how much the 
distortion would have to be reduced before it becomes inaudible) given 
an input signal containing only this frequency at the maximum amplitude, 
if attenuation of high signal frequencies does not count as a distortion.

plop_audibility.png: the same, but now such attenuation counts as a 
distortion.

The results are valid only in an absolutely quiet room, i.e. they 
represent the worst case.

One of the plots from src-sinc-fastest is attached for you to compare.

-- 
Alexander E. Patrakov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plop_audibility_eq.png
Type: image/png
Size: 48887 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pulseaudio-discuss/attachments/20140914/0df7a18a/attachment-0001.png>