sse2: Add a fast path for OVER 8888 x 8 x 8888

Matt Turner mattst88 at gmail.com
Wed Nov 11 13:03:08 PST 2009


On Tue, Nov 10, 2009 at 6:39 PM, Soeren Sandmann <sandmann at daimi.au.dk> wrote:
> Hi,
>
> Here:
>
>    http://cgit.freedesktop.org/~sandmann/pixman/commit/?h=sse_8888_8_8888
>
> is a patch that adds an sse2 8888 x 8 x 8888 fast path. This is a
> small speedup on the swfdec-youtube benchmark:
>
> Before:
> [  0]    image       swfdec-youtube    5.789    5.806   0.20%    6/6
>
> After:
> [  0]    image       swfdec-youtube    5.489    5.524   0.27%    6/6
>
> Ie., approximately 5% faster.
>
> Please check that I didn't miss anything.

I asked on the flatassembler.net forums for a review of this code. See
http://board.flatassembler.net/topic.php?t=10839#104485

As suggested in that thread, how much does performance change if you
move the cache prefetch out of the main loop and remove the other
prefetches?
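
To make the question concrete, here is a rough scalar sketch of the two
prefetch placements. It is not the code from the patch: the function names
are made up for illustration, the real fast path is SSE2, and I am assuming
GCC/Clang's __builtin_prefetch as a stand-in for whatever prefetch the
patch issues. Both variants compute the same premultiplied
OVER 8888 x 8 x 8888 result; only the placement of the hint differs.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounded (a * b) / 255 for 8-bit values. */
static inline uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* One pixel of (src IN mask) OVER dst, premultiplied a8r8g8b8. */
static uint32_t over_8888_8_8888_pixel(uint32_t src, uint8_t mask, uint32_t dst)
{
    uint8_t sa = mul_un8((uint8_t)(src >> 24), mask);  /* masked source alpha */
    uint32_t out = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t s = mul_un8((uint8_t)(src >> shift), mask);
        uint8_t d = mul_un8((uint8_t)(dst >> shift), (uint8_t)(255 - sa));
        uint32_t sum = (uint32_t)s + d;
        out |= (sum > 255 ? 255u : sum) << shift;       /* saturate */
    }
    return out;
}

/* Variant A (hypothetical): prefetch hints issued inside the main loop. */
static void scanline_prefetch_inside(uint32_t *dst, const uint32_t *src,
                                     const uint8_t *mask, size_t w)
{
    for (size_t i = 0; i < w; i++) {
        /* Every 16 pixels, hint at the next 16 if they are in range. */
        if ((i & 15) == 0 && i + 16 < w) {
            __builtin_prefetch(src + i + 16);
            __builtin_prefetch(mask + i + 16);
            __builtin_prefetch(dst + i + 16);
        }
        dst[i] = over_8888_8_8888_pixel(src[i], mask[i], dst[i]);
    }
}

/* Variant B (hypothetical): prefetch once, before the main loop. */
static void scanline_prefetch_hoisted(uint32_t *dst, const uint32_t *src,
                                      const uint8_t *mask, size_t w)
{
    if (w > 0) {
        __builtin_prefetch(src);
        __builtin_prefetch(mask);
        __builtin_prefetch(dst);
    }
    for (size_t i = 0; i < w; i++)
        dst[i] = over_8888_8_8888_pixel(src[i], mask[i], dst[i]);
}
```

Since the output is identical either way, any difference between the two
would show up only in the benchmark timings, which is what I am asking
about.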

Please also take a look at the other suggestions in that thread.

Thanks,
Matt


More information about the xorg-devel mailing list