[Liboil] [patch] Optimized multsum_f64

David Schleef ds at schleef.org
Wed May 17 11:21:47 PDT 2006


On Wed, May 17, 2006 at 02:06:53PM -0400, Marcus Brubaker wrote:
> Interesting, it's on a Pentium M laptop so I guess that's not that 
> surprising.  There may also be some inefficiencies in loading the data, 
> but that can only be addressed in an unstrided context.

Ah, right.  I thought that I had deprecated all the strided classes.
There's really not much point in writing optimized code for the current
(strided) multsum_f64, since it pretty much can't go faster.

> So if I wanted to add an unstrided version of multsum or a strided 
> version of some other function what would be the preferred naming 
> convention?

multsum_f64_ns()
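
To illustrate the distinction, here is a minimal sketch of a strided and an
unstrided ("_ns") multiply-and-sum. The sketch_ names and signatures are
hypothetical and are not liboil's actual implementation; the strided form
assumes strides are given in bytes.

```c
#include <stddef.h>

/* Hypothetical strided form: sstr1 and sstr2 are assumed to be byte
 * offsets between consecutive elements of src1 and src2. */
void sketch_multsum_f64(double *dest,
                        const double *src1, int sstr1,
                        const double *src2, int sstr2, int n)
{
    const char *p1 = (const char *)src1;
    const char *p2 = (const char *)src2;
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum += *(const double *)p1 * *(const double *)p2;
        p1 += sstr1;   /* step by an arbitrary byte stride */
        p2 += sstr2;
    }
    *dest = sum;
}

/* Hypothetical unstrided form: elements are contiguous, so the loop
 * uses plain indexing, which is easier to vectorize. */
void sketch_multsum_f64_ns(double *dest,
                           const double *src1,
                           const double *src2, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum += src1[i] * src2[i];
    }
    *dest = sum;
}
```

With both strides set to sizeof(double), the two functions compute the same
sum; the unstrided one just drops the per-element pointer arithmetic.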

There are several simple operations (add, multiply, multsum, etc.)
that need to be extended over all types, or at least the common types
(f32, f64, and s16).

> Also, what is the status of the vectoradd functions?  They're documented 
> as being too hard to optimize and thus deprecated.  It seems that 
> they're overly complicated and something without the s[34]_1 parameters 
> would be easier to optimize and fairly useful (at least to me).  Are 
> there plans to rectify this?  If not, I will be happy to do what I can 
> given a bit of guidance on naming.

The s3_1 and s4_1 parameters are important because it's *vector* addition.
The strides are the problem.  You are probably looking for something like
add_f64().
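
As a rough sketch of the difference, assuming vectoradd computes
d[i] = s3_1[0]*s1[i] + s4_1[0]*s2[i] with byte strides, and that an
unstrided add simply sums two contiguous arrays. The sketch_ names and
signatures are hypothetical, not liboil's actual code.

```c
#include <stddef.h>

/* Hypothetical strided, scaled addition: s3_1 and s4_1 each point to a
 * single scale factor; dstr, sstr1 and sstr2 are assumed to be byte
 * strides. */
void sketch_vectoradd_f64(double *d, int dstr,
                          const double *s1, int sstr1,
                          const double *s2, int sstr2,
                          int n,
                          const double *s3_1, const double *s4_1)
{
    char *pd = (char *)d;
    const char *p1 = (const char *)s1;
    const char *p2 = (const char *)s2;
    int i;

    for (i = 0; i < n; i++) {
        *(double *)pd = s3_1[0] * *(const double *)p1
                      + s4_1[0] * *(const double *)p2;
        pd += dstr;
        p1 += sstr1;
        p2 += sstr2;
    }
}

/* Hypothetical unstrided, unscaled addition over contiguous arrays. */
void sketch_add_f64(double *d, const double *s1, const double *s2, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        d[i] = s1[i] + s2[i];
    }
}
```

The scale factors are what make the first form a general vector sum; the
byte strides are what make it awkward to optimize.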



dave...

-- 
David Schleef
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux

