[pulseaudio-discuss] ARM NEON optimized code

Arun Raghavan arun.raghavan at collabora.co.uk
Wed Jan 11 06:53:24 PST 2012


On Mon, 2012-01-09 at 12:14 +0100, Peter Meerwald wrote:
> Hello,
> 
> I am about to prepare some ARM NEON optimized code for PulseAudio; 
> attached is a stand-alone test program demonstrating 
> sconv_s16le_from_float() and sconv_s16le_to_float() on 1019 samples
> 
> questions:
> is it acceptable to use ARM NEON intrinsics?
> or is __asm__ __volatile or assembler source preferred? 
> or Orc code?

My opinion on this is that we pick the one which performs best, and when
the solutions are comparable, pick the most easily maintained (Orc,
intrinsics, inline assembly, in decreasing order of maintainability).
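
For illustration only, here is a minimal sketch of what the intrinsics
route can look like for a float to s16 conversion. It is not the code
from the attached test program or from PulseAudio: the function names,
the HAVE_NEON_SKETCH macro, the 32767 scale factor and the
round-half-away-from-zero rule are all assumptions made for this
example, and a little-endian host is assumed so that native int16_t
matches s16le. Without NEON the same entry point falls back to the
scalar loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1 /* hypothetical macro, used only in this sketch */
#endif

/* Scalar reference: clamp to [-1, 1], scale to 16 bits,
 * round half away from zero (assumed rounding rule). */
static void float_to_s16_ref(size_t n, const float *src, int16_t *dst) {
    for (; n > 0; n--) {
        float v = *src++;
        if (v > 1.0f) v = 1.0f;
        if (v < -1.0f) v = -1.0f;
        v *= 32767.0f;
        *dst++ = (int16_t) (v + (v >= 0.0f ? 0.5f : -0.5f));
    }
}

/* Same conversion, four samples per iteration when NEON is available;
 * any remaining samples (or all of them without NEON) use the scalar loop. */
static void float_to_s16(size_t n, const float *src, int16_t *dst) {
#ifdef HAVE_NEON_SKETCH
    const float32x4_t plus1 = vdupq_n_f32(1.0f), minus1 = vdupq_n_f32(-1.0f);
    const float32x4_t half = vdupq_n_f32(0.5f), neg_half = vdupq_n_f32(-0.5f);

    for (; n >= 4; n -= 4, src += 4, dst += 4) {
        float32x4_t v = vld1q_f32(src);                 /* load 4 floats */
        v = vminq_f32(vmaxq_f32(v, minus1), plus1);     /* clamp to [-1, 1] */
        v = vmulq_n_f32(v, 32767.0f);                   /* scale */
        uint32x4_t nonneg = vcgeq_f32(v, vdupq_n_f32(0.0f));
        v = vaddq_f32(v, vbslq_f32(nonneg, half, neg_half)); /* +/- 0.5 bias */
        /* vcvtq_s32_f32 truncates toward zero; vqmovn_s32 narrows with saturation */
        vst1_s16(dst, vqmovn_s32(vcvtq_s32_f32(v)));
    }
#endif
    float_to_s16_ref(n, src, dst);
}
```

Keeping the scalar loop as both the tail handler and the fallback means
the NEON path can be checked against it sample by sample, much like the
NEON vs. reference comparison in the test program.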

> I picked intrinsics due to simplicity... the generated code (gcc-4.6, 
> -O2) looks clean
[...]
> # ./sconv_neon 
> checking NEON sconv_s16le_from_float(2038)
> NEON: 3723 usec.
> ref: 64516 usec.
> checking NEON sconv_s16le_to_float(2038)
> NEON: 1923 usec.
> ref: 18280 usec.
> 
> runtime is for 1000 repetitions on a Beagleboard-XM (NEON vs. reference C 
> code)

That is neat!

> if it looks OK to you, I'll go ahead and submit patches to integrate with 
> PA...
> 
> regards, p.

Cheers,
Arun


