[pulseaudio-discuss] On scaling the HRIR in module-virtual-surround-sink
Alexander E. Patrakov
patrakov at gmail.com
Sat Mar 8 10:57:29 PST 2014
Hello.
[Do not blindly apply patches from this e-mail! They mutually exclusive,
and I don't have a firm opinion which one of them is correct.]
Today I tried to improve the existing module-virtual-surround-sink (but
the same issue also affects the IIR-based rewrite that is still sitting
on my laptop). The problem is: the current normalization code does not
do what it is designed to do. The module clips on some testcases. Let me
copy-paste the problematic code for easy discussion.
/* normalize hrir to avoid clipping */
hrir_max = 0;
for (i = 0; i < u->hrir_samples; i++) {
hrir_sum = 0;
for (j = 0; j < u->hrir_channels; j++)
hrir_sum += fabs(u->hrir_data[i * u->hrir_channels + j]);
if (hrir_sum > hrir_max)
hrir_max = hrir_sum;
}
if (hrir_max > 1) {
for (i = 0; i < u->hrir_samples; i++) {
for (j = 0; j < u->hrir_channels; j++)
u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
}
}
And here is where clipping (i.e. sum_left or sum_right becoming > 1.0)
can happen:
sum_right = 0;
sum_left = 0;
/* fold the input buffer with the impulse response */
for (j = 0; j < u->hrir_samples; j++) {
for (k = 0; k < u->channels; k++) {
current_sample =
u->input_buffer[((u->input_buffer_offset + j) % u->hrir_samples) *
u->channels + k];
sum_left += current_sample * u->hrir_data[j *
u->hrir_channels + u->mapping_left[k]];
sum_right += current_sample * u->hrir_data[j *
u->hrir_channels + u->mapping_right[k]];
}
}
So, basically, here is the "normalization" logic: for each HRIR sample,
take the sum of all absolute values of HRIR channels at that time. Find
the maximum of these sums. Divide that factor out, with an extra "fudge
factor" of 1.2. Without the fudge factor, the corresponding explanation
in terms of the second code fragment is: if a single-sample click is
played simultaneously through all 5.1 channels, the output of the filter
(sum_left or sum_right) should just reach 1.0 at its maximum. With
hrir-kemar.wav from http://stuff.salscheider-online.de/hrir_kemar.tar.gz
(mentioned in the original submission), this logic (including the fudge
factor) divides the HRIR by 1.862.
What I don't understand is why "simultaneous click in all input
channels" is considered at all.
If the task is to avoid clipping even in the worst possible case, then
let's construct this worst possible case. Here are the instructions.
Take hrir-kemar.wav. Time-reverse it. Saturate all positive samples to
1.0 and all negative samples to -1.0. With this specially-constructed
sound, sum_left would at one point in time receive the sum of all
absolute values of all samples (in all channels) in the HRIR, so, that
is the value to divide out, and not what is written in the first code
fragment. I.e., this would be the patch (sorry if mangled, it is not
meant to be applied anyway):
--- a/src/modules/module-virtual-surround-sink.c
+++ b/src/modules/module-virtual-surround-sink.c
@@ -533,7 +533,7 @@ int pa__init(pa_module*m) {
const char *hrir_file;
unsigned i, j, found_channel_left, found_channel_right;
- float hrir_sum, hrir_max;
+ float hrir_sum;
float *hrir_data;
pa_sample_spec hrir_ss;
@@ -759,20 +759,21 @@ int pa__init(pa_module*m) {
}
/* normalize hrir to avoid clipping */
- hrir_max = 0;
+ hrir_sum = 0.0;
for (i = 0; i < u->hrir_samples; i++) {
- hrir_sum = 0;
for (j = 0; j < u->hrir_channels; j++)
hrir_sum += fabs(u->hrir_data[i * u->hrir_channels + j]);
+ }
+
- if (hrir_sum > hrir_max)
- hrir_max = hrir_sum;
+ if (hrir_sum < 0.01) {
+ pa_log("hrir file is too quiet!");
+ goto fail;
}
- if (hrir_max > 1) {
- for (i = 0; i < u->hrir_samples; i++) {
- for (j = 0; j < u->hrir_channels; j++)
- u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
- }
+
+ for (i = 0; i < u->hrir_samples; i++) {
+ for (j = 0; j < u->hrir_channels; j++)
+ u->hrir_data[i * u->hrir_channels + j] /= hrir_sum;
}
/* create mapping between hrir and input */
(Here I also removed the "don't normalize quiet HRIRs" logic, as I don't
understand its purpose and it had an obvious bug of not including the
fudge factor.)
Result: after this patch, the HRIR is divided by 24.339 and the module
indeed survives the worst possible testcase just without clipping. But
then, it becomes too quiet for normal use. That's why I don't want you
to apply the above patch.
The original code, even though it clips heavily in the worst possible
testcase, indeed produces no audible clipping on most (but not all!)
music DVDs that I tried, and I think that the correct problem to solve
here is indeed "produce no audible clipping on typical content". So
there is some grain of truth in the current code.
Still, I want to change the fudge factor, because, as of now, the module
does not survive the following testcases (i.e. clips, with this clipping
being very easy to detect by ear):
* A full-amplitude sine sweep from 100 Hz to 10 kHz, same samples in
all channels.
* The 5.1 soundtrack from the "Lichtmond 2: Universe of Light" music
BluRay.
The fudge factor required to play the whole Lichtmond 2 soundtrack
without clipping is 2.4, corresponding to dividing the HRIR by 3.724. So:
--- a/src/modules/module-virtual-surround-sink.c
+++ b/src/modules/module-virtual-surround-sink.c
@@ -768,11 +768,15 @@ int pa__init(pa_module*m) {
if (hrir_sum > hrir_max)
hrir_max = hrir_sum;
}
- if (hrir_max > 1) {
- for (i = 0; i < u->hrir_samples; i++) {
- for (j = 0; j < u->hrir_channels; j++)
- u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
- }
+
+ if (hrir_max < 0.01) {
+ pa_log("hrir file is too quiet!");
+ goto fail;
+ }
+
+ for (i = 0; i < u->hrir_samples; i++) {
+ for (j = 0; j < u->hrir_channels; j++)
+ u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 2.4;
}
/* create mapping between hrir and input */
I don't like the above (working and almost minimal) patch, as it just
replaces one magic number with another, and I still don't understand why
the calculated hrir_max value is relevant. I.e. the same "correct"
scaling factor, 3.724, can be also approximately produced this way, from
the energy contained in the impulse response:
--- a/src/modules/module-virtual-surround-sink.c
+++ b/src/modules/module-virtual-surround-sink.c
@@ -533,7 +533,7 @@ int pa__init(pa_module*m) {
const char *hrir_file;
unsigned i, j, found_channel_left, found_channel_right;
- float hrir_sum, hrir_max;
+ float hrir_sample, hrir_energy;
float *hrir_data;
pa_sample_spec hrir_ss;
@@ -758,21 +758,25 @@ int pa__init(pa_module*m) {
goto fail;
}
- /* normalize hrir to avoid clipping */
- hrir_max = 0;
+ /* normalizg hrir to avoid clipping on typical music */
+ hrir_energy = 0.0;
for (i = 0; i < u->hrir_samples; i++) {
- hrir_sum = 0;
- for (j = 0; j < u->hrir_channels; j++)
- hrir_sum += fabs(u->hrir_data[i * u->hrir_channels + j]);
+ for (j = 0; j < u->hrir_channels; j++) {
+ hrir_sample = u->hrir_data[i * u->hrir_channels + j];
+ hrir_energy += hrir_sample * hrir_sample;
+ }
+ }
- if (hrir_sum > hrir_max)
- hrir_max = hrir_sum;
+ if (hrir_energy < 0.0001) {
+ pa_log("hrir file is too quiet!");
+ goto fail;
}
- if (hrir_max > 1) {
- for (i = 0; i < u->hrir_samples; i++) {
- for (j = 0; j < u->hrir_channels; j++)
- u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
- }
+
+ hrir_sample = sqrt(hrir_energy) * 1.67;
+
+ for (i = 0; i < u->hrir_samples; i++) {
+ for (j = 0; j < u->hrir_channels; j++)
+ u->hrir_data[i * u->hrir_channels + j] /= hrir_sample;
}
/* create mapping between hrir and input */
Of course, the same criticism applies as for the second patch - why
would the energy be relevant here?
Some final remarks.
I don't insist on the 3.724 scaling factor and can accept occasional
clipping if it is not detectable by ear.
Due to the fact that we are dealing with "typical content", I understand
that a fudge factor is needed anyway, to deal with its typicalness.
The code has to deal with HRIRs of arbitrary origin, including those
from http://stuff.salscheider-online.de/hrir_listen.tar.gz , that's why
a static factor to divide out won't cut it. But I haven't tested any
other HRIRs yet.
Wrong equalization of the HRIRs (i.e. expensive headphones sounding like
cheap plugs with this module) is a separate bug that will be addressed
in a separate e-mail when I become able to discuss it.
Thoughts?
--
Alexander E. Patrakov
More information about the pulseaudio-discuss
mailing list