[pulseaudio-discuss] [PATCH v2 1/6] core: add ARM NEON optimized mono-to-stereo/stereo-to-mono remapping code
Peter Meerwald
pmeerw at pmeerw.net
Sun Mar 11 11:12:36 PDT 2012
Hello Arun,
Thank you for testing; you are covering areas I had not thought of :)
> > v2:
> > * add ARM NEON stereo-to-mono remapping code
> > * static __attribute__ ((noinline)) is necessary to prevent inlining and
> > work around gcc 4.6 ICE, see https://bugs.launchpad.net/bugs/936863
> > * call test code, the reference implementation is obtained using
> > pa_get_init_remap_func()
> > * remove check for NEON flags
> > v1:
> > * ARM NEON mono-to-stereo remapping code
> A couple of issues here. Firstly, if I turn up the test loop count to
> 100000, I can fairly reliably see a bunch of failures like the one
> below.
I have not been able to reproduce the issue so far. My setup is a beagle-xm
with softfp, and the compiler may also differ from yours.
You get different results for remap_stereo_to_mono with s16 samples.
Are you testing the patches one by one, i.e. you have not yet applied
'[PATCH 5/6] core: add stereo to mono special case remapping'?
I think there is an issue in run_test_s16_stereo_to_mono() when setting up
the remap structure:
remap.format = &sf;
iss.format = PA_SAMPLE_S16NE;
iss.channels = 2;
oss.format = PA_SAMPLE_S16NE;
oss.channels = 1;
remap.i_ss = &iss;
remap.o_ss = &oss;
remap.map_table_f[0][0] = 1.0;
remap.map_table_f[0][1] = 1.0;
remap_init_func(&remap);
What is missing is the map table for integer values (16.16 fixed point, so
0x10000 corresponds to 1.0):
remap.map_table_i[0][0] = 0x10000;
remap.map_table_i[0][1] = 0x10000;
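To illustrate why the integer table matters, here is a minimal sketch of a
generic s16 remapper driven by a 16.16 fixed-point table. It is not the code
in remap.c; sketch_remap_s16, SKETCH_MAX_CH and the saturation step are
assumptions made up for this example. With an all-zero table the output is
silence, which would differ from any implementation that actually mixes the
two channels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CH 2  /* hypothetical limit, enough for mono/stereo */

/* Hypothetical generic remapper (not PulseAudio's implementation):
 * out[o] = sum over i of in[i] * table[o][i],
 * where table entries are 16.16 fixed point (0x10000 == 1.0). */
static void sketch_remap_s16(int16_t *dst, const int16_t *src, size_t frames,
                             unsigned in_ch, unsigned out_ch,
                             int32_t table[SKETCH_MAX_CH][SKETCH_MAX_CH]) {
    assert(in_ch <= SKETCH_MAX_CH && out_ch <= SKETCH_MAX_CH);

    for (size_t n = 0; n < frames; n++) {
        for (unsigned o = 0; o < out_ch; o++) {
            int64_t acc = 0;

            for (unsigned i = 0; i < in_ch; i++)
                acc += (int64_t) src[n * in_ch + i] * table[o][i];

            /* Drop the 16 fractional bits; division avoids relying on
             * the behaviour of right-shifting a negative value. */
            acc /= 0x10000;

            /* Saturate to the s16 range (an assumption of this sketch). */
            if (acc > INT16_MAX) acc = INT16_MAX;
            if (acc < INT16_MIN) acc = INT16_MIN;

            dst[n * out_ch + o] = (int16_t) acc;
        }
    }
}
```

Calling this with a table whose first row is { 0x10000, 0x10000 } sums left
and right into the single output channel; calling it with a table that was
never filled in gives zeros (or garbage, if the memory is uninitialized).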
In run_test_s16_mono_to_stereo() there is a similar issue, but there
init_remap_c() in remap.c selects remap_mono_to_stereo_c instead of the
generic mapper, and remap_mono_to_stereo_c does not need the map table.
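As a rough sketch of why a mono-to-stereo special case can get away without
any table: it only has to copy each input sample to both output channels.
The function name below is hypothetical and the body is a plain C
illustration, not the remap_mono_to_stereo_c from remap.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical special case: duplicate every mono sample into the left
 * and right slot of an interleaved stereo buffer. No gain table is
 * consulted, so an unset map table cannot affect the result. */
static void sketch_mono_to_stereo_s16(int16_t *dst, const int16_t *src,
                                      size_t frames) {
    for (size_t n = 0; n < frames; n++) {
        dst[2 * n]     = src[n];  /* left */
        dst[2 * n + 1] = src[n];  /* right */
    }
}
```

A test that exercises only this path passes regardless of how the map
tables were initialized, which is consistent with the mono-to-stereo test
not showing the failure.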
In '[PATCH 5/6] core: add stereo to mono special case remapping' a special
case is added to init_remap_c() to cover the stereo-to-mono case, which
might mask the issue.
... but I am just speculating here :(
Could you try the above initialization?
> Next, I see the reference implementation doing better in the
> mono-to-stereo float remapping.
The C code seems a lot more efficient on the panda; it is well known that
floats are handled better there.
NEON remap_mono_to_stereo(float) is ~4x slower than the C code on panda,
and only slightly faster on beagle.
NEON remap_mono_to_stereo(s16) is ~1.5x slower on panda, and somewhat
faster on beagle.
I'll try to build a hardfp system and see if the runtime can be improved
(or at least that regressions are avoided).
regards, p.
--
Peter Meerwald
+43-664-2444418 (mobile)