[pulseaudio-discuss] Resampler quality evaluation results
Laurențiu Nicola
lnicola at dend.ro
Tue Sep 16 04:36:24 PDT 2014
Thanks a lot. For the record, it seems that:
1. Resampling from closer rates yields less distortion than from
rates that are far apart.
2. Upsampling distorts more than downsampling.
3. speex-float-3 gives audible distortion even for close rates.
I suppose these results were to be expected, but it's still nice to
have confirmation.
Now if only I could check the performance of speex-fixed vs. speex-float
on my platform... :)
Laurentiu Nicola
On Sun, Sep 14, 2014, at 16:34, Alexander E. Patrakov wrote:
> 07.09.2014 17:04, Laurențiu Nicola wrote:
> > Great, thanks!
> >
> > On Sun, Sep 7, 2014, at 14:02, Alexander E. Patrakov wrote:
> >> 07.09.2014 16:58, Laurențiu Nicola wrote:
> >>> I have a question related to your tests. In my application, I need
> >>> resampling between close rates (let's say from 44200 to 44100). Do you
> >>> feel that the results would basically be the same in this kind of
> >>> situation?
> >>>
> >>> Thanks,
> >>> Laurentiu Nicola
> >>
> >> I have not tested. I will write instructions next week, so that
> >> you can test every situation you want yourself.
>
> Here are the instructions. Sorry for the delay.
>
> 1. Choose a sample rate that you will be resampling from. In your case,
> this is 44200 Hz. Generate a wav file with this sample rate, containing
> a linear sweep:
>
> ./wavegen.py --rate 44200 --length $(( 1024 * 1024 )) --amplitude 0.99 \
>     --format s16 --padding 65536 44200.wav
>
> The length should be sufficient so that for every frequency bin of the
> FFT on the next steps the file contains a piece of sufficient-for-FFT
> length (or ideally several such pieces) with only that frequency. The
> amplitude should be 0.99 to avoid accidental clipping by the resampler.
> The padding is unfortunately needed because the recorded wav file from
> the resampler on the next step contains unwanted clicks for some unknown
> reason.
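
For readers without wavegen.py at hand, here is a minimal sketch of what such a test file could look like. It is not the actual wavegen.py: the function name, the sweep range (0 Hz up to half the sample rate), the mono layout and the placement of the padding as silence on both sides are all assumptions made for illustration.

```python
import math
import struct
import wave


def write_linear_sweep(path, rate, length, amplitude=0.99, padding=0):
    """Write a mono s16 WAV: silence, a linear sweep, silence.

    Hypothetical stand-in for wavegen.py. Assumes the sweep runs from
    0 Hz to rate/2 and that `padding` samples of silence go on each side.
    """
    nyquist = rate / 2.0
    silence = struct.pack("<h", 0) * padding
    frames = bytearray(silence)
    for n in range(length):
        # Instantaneous frequency f(n) = nyquist * n / length, so the phase
        # (2*pi times the integral of f over time) grows with n squared.
        phase = math.pi * nyquist * n * n / (length * rate)
        sample = int(round(amplitude * 32767 * math.sin(phase)))
        frames += struct.pack("<h", sample)
    frames += silence
    with wave.open(path, "wb") as w:
        w.setnchannels(1)      # mono (assumption)
        w.setsampwidth(2)      # 16-bit signed samples, i.e. s16
        w.setframerate(rate)
        w.writeframes(bytes(frames))
```

With the parameters from the command above, the call would be write_linear_sweep("44200.wav", 44200, 1024 * 1024, 0.99, 65536). The 0.99 amplitude keeps the peak just below full scale, which is the headroom the instructions ask for.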
>
> 2. Resample the file. An easy but slow way is to use a null sink running
> at the rate you want to resample to. Here are the commands:
>
> pacmd load-module module-null-sink rate=44100
>
> parec -d null.monitor --fix-rate --file-format=wav 44200_to_44100.wav &
> paplay -d null 44200.wav ; killall parec
>
> Hopefully these commands are obvious.
>
> 3. Analyze the result.
>
> ./resampler_plots.py --rate-from 44200 --skip 32768 --save plop \
>     --fftsize 1024 44200_to_44100.wav
>
> There may be warnings about dropouts. If they are near the end, that's
> OK. There will also be warnings about division by zero; these are caused
> by the masked-out frequency components. Ignore them.
>
> The meaning of parameters: rate-from is the sample rate of the original
> wav file that contained the linear sweep, skip means "skip this number
> of samples from the beginning" (because there is a click). After
> skipping, the analysis process skips further through the silent portion
> of the resampled file and automatically adapts to any unknown slope of
> frequency change.
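
The two-stage skipping described above (a fixed skip past the click, then a scan through the silence) might be sketched as follows. This is only an illustration of the idea, not code from resampler_plots.py: the function name and the threshold value are hypothetical, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
def find_signal_start(samples, skip, threshold=0.001):
    """Return the index where the signal starts after the initial skip.

    Hypothetical helper: first drops `skip` samples unconditionally
    (to get past the click), then walks through the silent padding
    until a sample exceeds `threshold` in magnitude.
    """
    for i in range(skip, len(samples)):
        if abs(samples[i]) > threshold:
            return i
    # Nothing but silence after the skip.
    return len(samples)
```

This also suggests why the skip value should land inside the silent padding: if it lands before the click, the click would be mistaken for the start of the sweep.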
>
> The "fftsize" parameter, well, sets the FFT size. Useful values start
> from 1024. Below that, the resolution in the low-frequency part of the
> spectrum is not sufficient to determine audibility reliably, because the
> absolute threshold of hearing changes significantly within one frequency
> bin. The FFT size is specified in terms of frequency bins. The
> required number of input samples for each signal piece is twice that,
> i.e. 2048 in our case.
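
To make the bins-versus-samples relationship concrete, here is a small sketch of the arithmetic. The function name is made up for illustration, and it assumes the usual property of a real-input FFT, where N input samples give N/2 usable frequency bins between 0 Hz and half the sample rate; how the script treats the DC and Nyquist bins is not stated here.

```python
def fft_parameters(rate, fftsize):
    """Return (input samples per piece, width of one frequency bin in Hz).

    Assumes a real-input FFT: `fftsize` bins need 2 * fftsize samples,
    and the bins evenly divide the range from 0 Hz to rate / 2.
    """
    samples_per_piece = 2 * fftsize
    bin_width_hz = rate / samples_per_piece
    return samples_per_piece, bin_width_hz
```

For a 44100 Hz file and an FFT size of 1024 this gives 2048 samples per piece and bins about 21.5 Hz wide. Halving the FFT size to 512 doubles the bin width to about 43 Hz, which is the loss of low-frequency resolution mentioned above.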
>
> The "save" parameter sets the base name of all output files. So, you'll get:
>
> plop_envelope.png: shows the amplitude of output signal vs frequency if
> the input signal contains only this frequency at the full scale.
>
> plop_response.png: a spectrogram. To read it, select an input frequency.
> Then cut a column out of this spectrogram according to the X axis. The
> amplitude of each output frequency component is then described by the
> color of the column at the height corresponding to the output frequency.
> E.g., it can be seen that, when given a 5 kHz input signal, the
> src-sinc-fastest resampler also produces some very weak unwanted output
> at 18 kHz.
>
> plop_distortion_eq.png: the same, but with the line representing the
> wanted same-frequency output suppressed. I.e. only distortions, with the
> assumption that wanted-signal attenuation does not count as a distortion.
>
> plop_distortion.png: the same, but with the same line replaced by the
> difference of wanted vs actual same-frequency output. I.e. only
> distortions, and the attenuation of the signal now counts as a
> distortion.
>
> plop_audibility_eq.png: audibility of distortions (i.e. how much should
> one reduce the distortion before it becomes inaudible) given the input
> signal containing only this frequency at the maximum amplitude, if
> attenuation of high signal frequencies does not count as a distortion.
>
> plop_audibility.png: the same, but now such attenuation counts as a
> distortion.
>
> The results are valid only in an absolutely quiet room, i.e. they represent
> the worst case.
>
> One of the plots from src-sinc-fastest is attached for you to compare.
>
> --
> Alexander E. Patrakov
> Email had 1 attachment:
> + plop_audibility_eq.png
> 65k (image/png)