Hi All,
I am very new to ORC and wanted to use it for some optimizations on NEON.
To start with, I wrote a simple float array addition program using ORC.
However, the normally compiled version (which uses plain addition) runs
much faster than the NEON version. I am not sure whether I am using ORC
properly.

C Code:
        float a[10];
        float b[10];
        float c[10];
        int i;
                //printf("\n\na[%d]=%f b[%d]=%f c[%d]=%f \n",i,a[i],i,b[i],i,c[i]);

orc file for the addition:
.function add_s32
.dest 4 d1 float
.source 4 s1 float
.source 4 s2 float

addf d1, s1, s2

C file using the ORC-generated function (generated with: orcc --implementation add.orc).
After initialization I wrote:


Now when I run the ORC binary with ORC_DEBUG=3, I get the message 'compiling for target "neon"', which makes me believe that ORC is targeting the correct platform.
But when I run both versions, the plain addition is much faster (more than 100 times faster) than the ORC code.
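
One thing that may be skewing this comparison: as far as I understand, ORC generates its NEON code at run time, so the first call to the generated function includes the compilation cost, and with only 10 elements that overhead could easily dominate. Below is a minimal sketch of a timing loop that leaves the first call out of the measurement. The names add_plain, add_fn and time_kernel are made up for illustration, and the function-pointer signature is only my assumption of what orcc emits for add_s32.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumed shape of the prototype orcc generates for add_s32 (not verified). */
typedef void (*add_fn)(float *d1, const float *s1, const float *s2, int n);

/* Plain C baseline: d1[i] = s1[i] + s2[i] for i in [0, n). */
void add_plain(float *d1, const float *s1, const float *s2, int n)
{
    for (int i = 0; i < n; i++)
        d1[i] = s1[i] + s2[i];
}

/* Average CPU seconds per call over `reps` calls. One untimed warm-up call
 * is made first, so any one-time setup cost (such as run-time compilation)
 * is excluded from the result. */
double time_kernel(add_fn fn, float *d1, const float *s1,
                   const float *s2, int n, int reps)
{
    assert(fn != NULL && reps > 0);

    fn(d1, s1, s2, n);                  /* warm-up, not timed */

    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(d1, s1, s2, n);
    clock_t end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}
```

Calling time_kernel once with add_plain and once with the generated add_s32, on arrays of a few thousand elements rather than 10, should show whether the gap comes from setup cost or from the per-element work itself.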

Am I missing something? If not, where exactly can ORC help me optimize my module?

Thanks for your time.


