[pulseaudio-discuss] New dependency: Orc

Arun Raghavan arun.raghavan at collabora.co.uk
Wed Oct 27 07:10:28 PDT 2010

Hi folks,
I've been doing some work optimising the software volume scaling code,
and along with my previous changes to decrease the maximum volume to
2^31-1, there seems to be a pretty good performance increase (almost 2x
on my Core2 processor).

The actual optimisations have been written in Orc[1], which is a
language to write simple "functions" that get translated to SIMD
instructions at runtime.

I should have sent this out a while back, since we're actually using Orc
for one of the echo-cancellation modules that was merged to master, but
now that there could be core code using this, I thought I'd get more
thoughts on making Orc an optional dependency of PulseAudio.

The way I've written things right now, the old C and hand-rolled
assembly are still there. The Orc code is used only when Orc support is
enabled and we're on a CPU where the Orc code is known to be faster.
I've only written the mono and stereo S16NE functions so far, so for
other formats, the old code is used. If you don't have or don't want to
use Orc, it can be disabled at configure time (--disable-orc).
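
As a rough illustration of the selection logic described above, here is
a minimal C sketch. Every name in it (HAVE_ORC, volume_func_t,
volume_s16ne_c, volume_s16ne_orc, pick_volume_func) is hypothetical and
not taken from the PulseAudio tree, and the "Orc" function is only a
placeholder that forwards to the C loop so that the sketch compiles
without Orc installed. The 16.16 fixed-point volume is also an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Common signature for every S16NE volume-scaling implementation. */
typedef void (*volume_func_t)(int16_t *samples, size_t n, int32_t volume);

/* Plain C path, always built. Assumes a non-negative 16.16 fixed-point
 * volume and an arithmetic right shift for negative values. */
static void volume_s16ne_c(int16_t *samples, size_t n, int32_t volume) {
    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int64_t t = ((int64_t) samples[i] * volume) >> 16;
        if (t > INT16_MAX) t = INT16_MAX;
        if (t < INT16_MIN) t = INT16_MIN;
        samples[i] = (int16_t) t;
    }
}

#ifdef HAVE_ORC /* hypothetical macro, set unless built with --disable-orc */
/* Placeholder for an Orc-generated function; forwards to the C loop. */
static void volume_s16ne_orc(int16_t *samples, size_t n, int32_t volume) {
    volume_s16ne_c(samples, n, volume);
}
#endif

/* Pick the Orc path only if it was compiled in AND the caller has
 * determined that it is faster on the current CPU. */
static volume_func_t pick_volume_func(bool orc_known_faster) {
#ifdef HAVE_ORC
    if (orc_known_faster)
        return volume_s16ne_orc;
#else
    (void) orc_known_faster; /* unused when Orc is compiled out */
#endif
    return volume_s16ne_c;
}
```

The selection would happen once at start-up, so the per-buffer cost is a
single indirect call regardless of which implementation was chosen.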

If you do enable it, a couple of files are generated for each Orc
source program. These even contain a C fallback for systems that don't
have Orc, or that Orc doesn't have a backend for. At some point, if the
fallback C code and the Orc functions become
good enough to replace everything else, we can look at just using these
to replace all the other implementations. That day isn't today,
though. :)

The code is at: http://git.collabora.co.uk/?p=user/arun/pulseaudio.git -
there are also some fixes to the various volume scaling test code.

Comments/brickbats solicited :)


[1] http://code.entropywave.com/projects/orc/

p.s.: I've not tried out the Orc code on ARM (with or without NEON), so
if anyone wants to give that a whirl before I get around to it, please
do post the results here. The hand-rolled assembly should be faster for now
since it uses a single instruction for the multiply+shift operation.
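
For readers unfamiliar with the multiply+shift operation mentioned
above, a minimal scalar sketch in C follows. It assumes the volume is a
non-negative 16.16 fixed-point factor (0x10000 meaning unity gain); the
names VOLUME_UNITY, clamp_s16 and scale_s16ne are hypothetical and the
code is not copied from PulseAudio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16.16 fixed point, so 0x10000 leaves samples unchanged. */
#define VOLUME_UNITY ((int32_t) 0x10000)

/* Saturate a 32-bit intermediate to the signed 16-bit sample range. */
static int16_t clamp_s16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t) v;
}

/* Scale n native-endian signed 16-bit samples in place.
 * Relies on >> being an arithmetic shift for negative values, which is
 * implementation-defined in C but true of common compilers. */
static void scale_s16ne(int16_t *samples, size_t n, int32_t volume) {
    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Multiply in 64 bits so a large volume cannot overflow. */
        int64_t t = (int64_t) samples[i] * volume;
        /* Drop the 16 fractional bits, then saturate. */
        samples[i] = clamp_s16((int32_t) (t >> 16));
    }
}
```

With a volume of 0x8000 (half gain) a sample of 1000 becomes 500, and
with 0x20000 (double gain) a sample of 20000 saturates at 32767.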
