[Liboil] Fw: ORC performance for NEON
Prateek Mathur
hiprateek007 at yahoo.co.in
Tue Jan 4 00:58:37 PST 2011
Sorry for the earlier incomplete mail.
Hi All,
I am very new to ORC and want to use it for some optimizations on NEON.
To start with, I wrote a simple float array addition program using ORC.
Surprisingly, the normally compiled version (which uses a plain addition
loop) runs much faster than the NEON version. I am not sure whether I am
using ORC properly.
C code:
float a[10];
float b[10];
float c[10];
int i;
/* initialization */
for (i = 0; i < 10; i++)
{
    a[i] = 3.14159 * 100 * (i + 1);
    b[i] = 5.00956 * 10 * i;
}
/* plain C addition, one element at a time */
for (i = 0; i < 10; i++)
{
    c[i] = a[i] + b[i];
    /* printf("\n\na[%d]=%f b[%d]=%f c[%d]=%f \n", i, a[i], i, b[i], i, c[i]); */
}
ORC file for the addition (add.orc):
.function add_s32
.dest 4 d1 float
.source 4 s1 float
.source 4 s2 float
addf d1, s1, s2
C file using the ORC-generated function (generated with orcc --implementation add.orc):
After the initialization, I call:
add_s32(c, a, b, 10);
Now, when I run the ORC binary with ORC_DEBUG=3, I get the message 'compiling for target "neon"', which makes me believe that ORC is targeting the correct platform.
But when I run both versions, the plain addition performs much better (more than 100 times faster) than the ORC code.
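For reference, a comparison like this could be timed with a small harness along the following lines. This is only a sketch under stated assumptions, not the code I actually ran: the names add_fn, add_plain, time_calls and REPS are made up for illustration, and the function-pointer type assumes the orcc-generated add_s32 takes (dest, source 1, source 2, n), as in the call above. The untimed warm-up call is there because a first call may include one-time setup (ORC generates its code at run time), which would otherwise be counted against a 10-element addition.

```c
#include <time.h>

#define REPS 100000   /* hypothetical repeat count, chosen only for illustration */

/* Assumed to match the orcc-generated add_s32(dest, src1, src2, n). */
typedef void (*add_fn)(float *d, const float *s1, const float *s2, int n);

/* Plain C reference: the same element-wise loop as in the C code above. */
static void add_plain(float *d, const float *s1, const float *s2, int n)
{
    int i;
    for (i = 0; i < n; i++)
        d[i] = s1[i] + s2[i];
}

/* Return the CPU time in seconds spent on `reps` calls of fn.
 * One call is made before the clock starts, so any one-time
 * setup in fn is not included in the measurement. */
static double time_calls(add_fn fn, float *d, const float *s1,
                         const float *s2, int n, int reps)
{
    clock_t start, end;
    int r;

    fn(d, s1, s2, n);            /* warm-up call, not timed */

    start = clock();
    for (r = 0; r < reps; r++)
        fn(d, s1, s2, n);
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

With this in place, time_calls(add_plain, c, a, b, 10, REPS) and time_calls(add_s32, c, a, b, 10, REPS) would give per-version totals that can be compared directly. An optimizing compiler may shorten the plain C loop considerably, so the numbers are only indicative, and a much larger array than 10 elements is probably a fairer test of a SIMD kernel.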
Am I missing something? If not, where exactly can ORC help me optimize my module?
Thanks for your time,
Prateek