[Liboil] [patch] Optimized multsum_f64
ds at schleef.org
Wed May 17 11:21:47 PDT 2006
On Wed, May 17, 2006 at 02:06:53PM -0400, Marcus Brubaker wrote:
> Interesting, it's on a Pentium M laptop so I guess that's not that
> surprising. There may also be some inefficiencies in loading the data,
> but that can only be addressed in an unstrided context.
Ah, right. I thought that I had deprecated all the strided classes.
There's really not much point in writing code for the current
multsum_f64, since it pretty much can't go faster.
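
For illustration, here is a minimal sketch in C of the difference between
a strided and an unstrided multiply-sum. The function names and the
byte-stride convention are assumptions made for this example and are not
taken from liboil's headers. The strided loop has to step each source
pointer by an arbitrary byte count, so each element is loaded on its own.
The unstrided loop reads contiguous memory, which is what lets an
implementation use packed loads and independent accumulators.

```c
#include <stddef.h>

/* Hypothetical strided multiply-sum: strides are given in bytes
 * (an assumption for this sketch), so each element is fetched
 * individually through a byte pointer. */
void multsum_f64_strided(double *dest,
                         const double *src1, int sstr1,
                         const double *src2, int sstr2, int n)
{
    const char *p1 = (const char *)src1;
    const char *p2 = (const char *)src2;
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum += *(const double *)p1 * *(const double *)p2;
        p1 += sstr1;   /* advance by an arbitrary byte stride */
        p2 += sstr2;
    }
    *dest = sum;
}

/* Hypothetical unstrided variant: both sources are contiguous, so the
 * loop can be unrolled with two independent accumulators. A vectorized
 * version could replace each pair of loads with one packed load. */
void multsum_f64_unstrided(double *dest,
                           const double *src1, const double *src2, int n)
{
    double sum0 = 0.0, sum1 = 0.0;
    int i;

    for (i = 0; i + 1 < n; i += 2) {
        sum0 += src1[i] * src2[i];
        sum1 += src1[i + 1] * src2[i + 1];
    }
    if (i < n)          /* odd element left over */
        sum0 += src1[i] * src2[i];
    *dest = sum0 + sum1;
}
```

Calling the strided version with both strides set to sizeof(double)
walks the same contiguous data as the unstrided version, but the loop
body still cannot assume adjacency at compile time.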
> So if I wanted to add an unstrided version of multsum or a strided
> version of some other function what would be the preferred naming
There are several simple operations (add, multiply, multsum, etc.)
that need to be extended over all types, or at least the common types
(f32, f64, and s16).
> Also, what is the status of the vectoradd functions? They're documented
> as being too hard to optimize and thus deprecated. It seems that
> they're overly complicated and something without the s_1 parameters
> would be easier to optimize and fairly useful (at least to me). Are
> there plans to rectify this? If not, I will be happy to do what I can
> given a bit of guidance on naming.
The s3_1 and s4_1 are important because it's *vector* addition. The
strides are the problem. You are probably looking for something like
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux