[Liboil] Fw: ORC performance for NEON
David Schleef
ds at entropywave.com
Fri Jan 7 13:20:47 PST 2011
On Thu, Jan 06, 2011 at 11:27:13AM +0530, Prateek Mathur wrote:
> Hi David,
>
> Thanks for your reply. I have run the code many times, but the time
> taken stays in the same range. I have also measured the time with
> clock_gettime, which gives the same result.
>
> I am not sure whether replacing the for loop with an orcc-generated
> C function is the best way to optimize this, and this is where I need
> some help. If it is not, should I use intrinsics instead?
You haven't really explained what you are doing and/or what you want
to do. Calling an Orc function always involves some overhead, both
for the function call itself and for setting up the array handling,
so you need to balance that against how many operations you are doing
and the size of the arrays.
N=10 is not a very large array, and individual floating point operations
don't really gain much from SIMD. So I'm not really surprised that
it doesn't go much faster, if at all.
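
One way to check whether fixed per-call overhead dominates at N=10 is to
time the same function at several array sizes and compare the cost per
element. The following is only a rough sketch, not Orc output: add_f32 is
a hypothetical plain-C stand-in for the orcc-generated function, and
ns_per_call, now_ns and MAX_N are names made up for this example. It
assumes a POSIX system with clock_gettime and CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

#define MAX_N 4096  /* largest array size this sketch supports (assumption) */

/* Hypothetical stand-in for an orcc-generated function: d[i] = a[i] + b[i].
 * Not real Orc output; substitute the generated function to measure it. */
void add_f32(float *d, const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++)
        d[i] = a[i] + b[i];
}

/* Monotonic clock reading in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average time per call, in ns, over `reps` calls on arrays of length n. */
double ns_per_call(int n, int reps)
{
    static float a[MAX_N], b[MAX_N], d[MAX_N];
    /* Calling through a volatile pointer keeps the compiler from inlining
     * the call or hoisting it out of the timing loop. */
    void (*volatile fn)(float *, const float *, const float *, int) = add_f32;

    assert(n > 0 && n <= MAX_N && reps > 0);
    for (int i = 0; i < n; i++) {
        a[i] = (float)i;
        b[i] = 1.0f;
    }

    fn(d, a, b, n);  /* warm-up call, not timed */

    double start = now_ns();
    for (int r = 0; r < reps; r++)
        fn(d, a, b, n);
    double elapsed = now_ns() - start;

    return elapsed / reps;
}
```

Comparing ns_per_call(10, 1000000) / 10 with ns_per_call(4096, 10000) / 4096
gives the cost per element at each size. If the N=10 figure is much larger,
the per-call overhead is what is being measured, and a faster inner loop
will not change the total by much.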
David