[Pixman] [PATCH] sse2: Using MMX and SSE 4.1

Fri Jun 15 12:51:38 PDT 2012

Matt Turner <mattst88 at gmail.com> writes:

> The registers -- yes. The 8-byte aligned loads and stores I'm not
> sure. Can you do 8-byte aligned loads and stores to/from SSE
> registers?

I believe movq can use SSE registers.
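
As a rough illustration (a sketch, not code from pixman): SSE2 has a
form of movq that moves 64 bits between memory and the low half of an
XMM register, and the intrinsics _mm_loadl_epi64 and _mm_storel_epi64
map to it. The function name add_two_pixels is made up for this
example, and the non-SSE2 branch is only there so the sketch builds on
other targets.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Saturating per-byte add of two 32-bit pixels (8 bytes in total),
 * using XMM registers only. Hypothetical helper, not a pixman function.
 */
void
add_two_pixels (uint32_t *dst, const uint32_t *src)
{
    /* 64-bit loads into the low half of an XMM register; the upper
     * half is zeroed. No 16-byte alignment is required.
     */
    __m128i s = _mm_loadl_epi64 ((const __m128i *) src);
    __m128i d = _mm_loadl_epi64 ((const __m128i *) dst);

    /* 64-bit store of the low half back to memory. */
    _mm_storel_epi64 ((__m128i *) dst, _mm_adds_epu8 (s, d));
}
#else
/* Portable fallback with the same result, for targets without SSE2. */
void
add_two_pixels (uint32_t *dst, const uint32_t *src)
{
    uint8_t d[8], s[8];
    int i;

    memcpy (d, dst, 8);
    memcpy (s, src, 8);
    for (i = 0; i < 8; i++)
    {
        unsigned sum = (unsigned) d[i] + s[i];
        d[i] = (uint8_t) (sum > 255 ? 255 : sum);
    }
    memcpy (dst, d, 8);
}
#endif
```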

> Indeed, runtime generation would be great. Something like LLVM or orc
> would be interesting options. I'm not sure I'm up to that kind of
> project yet/now though.
>
> I think adding pixman-sse*.c files is a reasonable measure for now.
> Think it's okay to split the static inline support functions from
> pixman-sse2.c out into a header to be shared with the other
> pixman-sse*.c files?

Sounds reasonable to me.

> Also, are we planning to change the bilinear scaling algorithm for
> 0.28 so that we can use pmaddubsw?

I wouldn't object to a patch that dropped precision to 7 bits for all
bilinear code, but it would require changes at least to the general
code, the fast path code, the NEON code and the SSE2 code.
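
To make the 7-bit idea concrete, here is a scalar sketch of bilinear
interpolation of one 8-bit channel with 7-bit weights. It is not the
pixman code; the function name and the BILINEAR_BITS macro are
assumptions for this example. With 7-bit weights each horizontal sum is
at most 255 * 128 = 32640, which fits in the signed 16-bit results that
pmaddubsw produces. How the weights would actually be packed for
pmaddubsw is not covered here.

```c
#include <stdint.h>

#define BILINEAR_BITS 7                     /* assumed weight precision */
#define BILINEAR_ONE  (1 << BILINEAR_BITS)  /* 128 */

/* Interpolate one 8-bit channel from its four neighbours.
 * wx and wy are the horizontal and vertical weights, each in
 * the range [0, BILINEAR_ONE).
 */
uint8_t
bilinear_channel_7bit (uint8_t tl, uint8_t tr,
                       uint8_t bl, uint8_t br,
                       int wx, int wy)
{
    /* Horizontal pass: each sum is at most 255 * 128 = 32640. */
    int top = tl * (BILINEAR_ONE - wx) + tr * wx;
    int bot = bl * (BILINEAR_ONE - wx) + br * wx;

    /* Vertical pass: at most 255 * 128 * 128, so 32 bits are needed. */
    int32_t sum = top * (BILINEAR_ONE - wy) + bot * wy;

    /* Drop the 2 * BILINEAR_BITS fractional bits (truncating). */
    return (uint8_t) (sum >> (2 * BILINEAR_BITS));
}
```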

An alternative idea: instead of changing the algorithm across the
board, we could stop requiring bit-exact results. The main piece of work
here is to change the test suite so that it will accept pixels up to
some maximum relative error. There is already some support for this: the
'composite' test uses 'pixel_checker_t' to compare the pixman
output with perfect pixels computed in double precision.
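
A minimal sketch of what such a check could look like follows. This is
not the pixel_checker_t API; the function names, the a8r8g8b8 layout,
and the exact error formula (a relative term plus half a quantization
step, so that correctly rounded results always pass) are assumptions
made for illustration.

```c
#include <stdint.h>

/* Accept an 8-bit channel if it lies within max_rel_error of the
 * reference (given in [0, 1]), plus half an 8-bit quantization step.
 * Hypothetical helper, not part of the pixman test suite.
 */
int
channel_within_tolerance (uint8_t actual, double reference,
                          double max_rel_error)
{
    double value = actual / 255.0;
    double diff = value > reference ? value - reference : reference - value;
    double magnitude = reference < 0 ? -reference : reference;
    double slack = max_rel_error * magnitude + 0.5 / 255.0;

    return diff <= slack;
}

/* Check all four channels of an a8r8g8b8 pixel against reference
 * values ordered as alpha, red, green, blue.
 */
int
pixel_within_tolerance (uint32_t actual, const double reference[4],
                        double max_rel_error)
{
    int i;

    for (i = 0; i < 4; i++)
    {
        uint8_t channel = (uint8_t) ((actual >> (24 - 8 * i)) & 0xff);

        if (!channel_within_tolerance (channel, reference[i], max_rel_error))
            return 0;
    }
    return 1;
}
```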

This latter idea is ultimately more useful because it will allow much
more flexibility in the kinds of SIMD instruction sets we can support.

Søren