[Liboil] [patch] Optimized multsum_f64
David Schleef
ds at schleef.org
Wed May 17 11:21:47 PDT 2006
On Wed, May 17, 2006 at 02:06:53PM -0400, Marcus Brubaker wrote:
> Interesting, it's on a Pentium M laptop so I guess that's not that
> surprising. There may also be some inefficiencies in loading the data
> as well, but that can only be addressed in an unstrided context.
Ah, right. I thought that I had deprecated all the strided classes.
There's really not much point in writing optimized code for the current
(strided) multsum_f64, since it pretty much can't go faster.
> So if I wanted to add an unstrided version of multsum or a strided
> version of some other function what would be the preferred naming
> convention?
multsum_f64_ns()
There are several simple operations (add, multiply, multsum, etc.)
that need to be extended over all types, or at least the common types
(f32, f64, and s16).
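
[Editorial illustration, not part of the original message.] The difference between the strided and unstrided forms discussed above can be sketched in plain C. This is only a rough sketch under assumptions: the function names and signatures below are hypothetical and are not taken from liboil, and the strides are assumed to be byte offsets between consecutive elements. The point is that the unstrided loop reads contiguous memory, which is what gives a SIMD implementation something to work with.

```c
/* Hypothetical sketch, not liboil code.
 * Strided multiply-and-sum: sstr1 and sstr2 are assumed to be byte
 * offsets between consecutive elements of each source array. */
double multsum_f64_strided_sketch(const double *s1, int sstr1,
                                  const double *s2, int sstr2, int n)
{
    const char *p1 = (const char *)s1;
    const char *p2 = (const char *)s2;
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum += *(const double *)p1 * *(const double *)p2;
        p1 += sstr1;   /* step by an arbitrary byte stride */
        p2 += sstr2;
    }
    return sum;
}

/* Hypothetical unstrided ("_ns") variant: both sources are contiguous,
 * so elements can be loaded in adjacent pairs or wider groups. */
double multsum_f64_ns_sketch(const double *s1, const double *s2, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++)
        sum += s1[i] * s2[i];
    return sum;
}
```

With both strides set to sizeof(double) the two functions compute the same sum; a larger stride skips elements, which is the case the unstrided variant deliberately gives up.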
> Also, what is the status of the vectoradd functions? They're documented
> as being too hard to optimize and thus deprecated. It seems that
> they're overly complicated and something without the s[34]_1 parameters
> would be easier to optimize and fairly useful (at least to me). Are
> there plans to rectify this? If not, I will be happy to do what I can
> given a bit of guidance on naming.
The s3_1 and s4_1 are important because it's *vector* addition. The
strides are the problem. You are probably looking for something like
add_f64().
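
[Editorial illustration, not part of the original message.] To make the distinction concrete, here is a rough C sketch of the two operations. It rests on assumptions: the names and signatures are hypothetical rather than liboil's, s3_1 and s4_1 are assumed to be single scalar weights applied to the first and second source, and strides are assumed to be byte offsets. Under that reading, vectoradd is a weighted sum with strided access, while something like add_f64() is a plain contiguous element-wise add.

```c
/* Hypothetical sketch, not liboil code.
 * Weighted, strided vector addition:
 *   d[i] = (*s3_1) * s1[i] + (*s4_1) * s2[i]
 * with all three arrays walked using byte strides. */
void vectoradd_f64_sketch(double *d, int dstr,
                          const double *s1, int sstr1,
                          const double *s2, int sstr2,
                          int n,
                          const double *s3_1, const double *s4_1)
{
    char *pd = (char *)d;
    const char *p1 = (const char *)s1;
    const char *p2 = (const char *)s2;
    int i;

    for (i = 0; i < n; i++) {
        *(double *)pd = *s3_1 * *(const double *)p1
                      + *s4_1 * *(const double *)p2;
        pd += dstr;
        p1 += sstr1;
        p2 += sstr2;
    }
}

/* Hypothetical unstrided element-wise add: d[i] = s1[i] + s2[i]. */
void add_f64_sketch(double *d, const double *s1, const double *s2, int n)
{
    int i;

    for (i = 0; i < n; i++)
        d[i] = s1[i] + s2[i];
}
```

With both weights equal to 1.0 and every stride equal to sizeof(double), the first function reduces to the second.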
dave...
--
David Schleef
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux