[Liboil] [patch] Optimized multsum_f64

Marcus Brubaker aurelius.marcus at rogers.com
Wed May 17 11:06:53 PDT 2006


David Schleef wrote:
> On Mon, May 15, 2006 at 09:23:16PM -0400, Marcus Brubaker wrote:
>   
>> Here are two optimized versions of multsum_f64 and a patch for detecting 
>> SSE2 support.  For some reason, the SSE2 version is slightly slower on 
>> my machine than the plain unrolled version.  I'm not exactly an assembly 
>> wizard so I may be missing something obvious, suggestions welcome.
>>     
>
> That happens sometimes.  You may have one of those processors where
> f64 ops are kinda slow.  Some CPUs only have one or two FP multiply
> units that get shared between SSE2 and the FPU, so it doesn't really
> matter whether you use SSE2 or the FPU.
>   

Interesting.  It's a Pentium M laptop, so I guess that's not too 
surprising.  There may also be some inefficiency in loading the data, 
but that can only be addressed in an unstrided context.
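
To make the strided/unstrided distinction concrete, here is a rough 
sketch in plain C.  It is not the code from the patch: the function 
names, the helper, and the assumption that strides are given in bytes 
are all illustrative.  The strided version has to step each pointer by 
an arbitrary stride, while the unstrided one can index contiguous 
memory directly, which is what makes paired or aligned loads possible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a double pointer by a stride in bytes. */
static const double *step_bytes(const double *p, int stride)
{
    return (const double *)((const char *)p + stride);
}

/* Strided sum of products (sketch).  Strides are assumed to be in bytes. */
void multsum_f64_strided_sketch(double *dest,
                                const double *src1, int sstr1,
                                const double *src2, int sstr2, int n)
{
    double sum = 0.0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        sum += (*src1) * (*src2);
        src1 = step_bytes(src1, sstr1);
        src2 = step_bytes(src2, sstr2);
    }
    *dest = sum;
}

/* Unstrided (contiguous) variant, unrolled by two with two accumulators. */
void multsum_f64_contig_sketch(double *dest,
                               const double *src1, const double *src2, int n)
{
    double sum0 = 0.0, sum1 = 0.0;
    int i = 0;
    assert(n >= 0);
    for (; i + 1 < n; i += 2) {
        sum0 += src1[i] * src2[i];
        sum1 += src1[i + 1] * src2[i + 1];
    }
    if (i < n)                      /* leftover element when n is odd */
        sum0 += src1[i] * src2[i];
    *dest = sum0 + sum1;
}
```

With both strides set to sizeof(double) the two functions compute the 
same sum, up to floating-point reassociation from the two accumulators.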

So if I wanted to add an unstrided version of multsum, or a strided 
version of some other function, what would be the preferred naming 
convention?

Also, what is the status of the vectoradd functions?  They're documented 
as being too hard to optimize and thus deprecated.  It seems to me that 
they're overly complicated, and that something without the s[34]_1 
parameters would be easier to optimize and fairly useful (at least to me).  Are 
there plans to rectify this?  If not, I will be happy to do what I can 
given a bit of guidance on naming.
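
As a rough illustration of the simpler function meant here, the sketch 
below adds two contiguous arrays element by element with no extra 
scalar parameters.  The name and signature are hypothetical, not an 
existing liboil function, and dropping the strides as well is an extra 
simplification for the example.

```c
#include <assert.h>

/* Hypothetical simplified vector add: d[i] = s1[i] + s2[i].
 * No s3_1/s4_1 parameters and no strides; arrays are contiguous. */
void vectoradd_f64_simple_sketch(double *d,
                                 const double *s1, const double *s2, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        d[i] = s1[i] + s2[i];
}
```

A loop of this shape has no per-element pointer arithmetic, so it is a 
much easier target for unrolling or SSE2 than a fully general version.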

>> This is the first time I've created a patch for a project in a long 
>> time, so please let me know if I've missed something.  The patch was 
>> created using 'cvs diff -uNp' versus the latest anonymous CVS.
>>     
>
> Sounds good to me.  I usually suggest doing a 'cvs add' on the files
> you are adding (which curiously, does not require CVS write access),
> and then just use 'cvs diff -u'.  Using '-N' may have put lots of
> other files in the patch.
>
> Please attach the patch to a bug report on bugs.freedesktop.org.
>   

Done: https://bugs.freedesktop.org/show_bug.cgi?id=6957

Cheers,
Marcus

