[Pixman] [RFC PATCH] mmx: Use shuffle instruction when available
mattst88 at gmail.com
Mon Feb 13 14:02:21 PST 2012
On Mon, Feb 13, 2012 at 4:21 PM, Søren Sandmann <sandmann at cs.au.dk> wrote:
> sandmann at cs.au.dk (Søren Sandmann) writes:
>> Matt Turner <mattst88 at gmail.com> writes:
>>> Although not part of the original MMX instruction set, both SSE and
>>> AMD's Extended 3DNow! provide the pshufw instruction.
>>> ARM iwMMXt also has an equivalent instruction, as do the Loongson
>>> Multimedia Instructions.
>>> We can simplify the expand_alpha, expand_alpha_rev, and invert_colors
>>> functions down to this single instruction.
>>> The SSE intrinsics provide _mm_shuffle_pi16, but there aren't 3DNow!
>>> intrinsics (to my knowledge). This will require a bit of work in
>>> configure.ac, which I haven't done yet.
>>> I'm interested in hearing some opinions on using Extended MMX
>> It looks like we already require the "MMX_EXTENSIONS" flag in
>> pixman-cpu.c in order to use the MMX implementation, so I can't see any
>> reason to not just use these instructions without any ifdefs etc
> Actually, I remember an issue with these instructions. The problem is
> that to get gcc to accept them on x86, pixman-mmx.c would have to be
> compiled with -msse. Unfortunately, this caused gcc to generate
> SSE-but-not-3DNow! instructions that then caused the original OLPC to
I'll check into that. I have someone who is going to test the patch
(as-is) on an XO-1 (3DNow but no SSE), so we'll see if this is still
I grepped through the disassembly of pixman-mmx.o and didn't see any
SSE/3DNow instructions with or without the patch (with the exception
of 95 pshufw instructions after applying it).
> It may be that we can get around this problem by using -m3dnow instead
> and hope that this won't cause gcc to generate the floating point
> instructions that were also part of 3DNow!, but not available for SSE.
> If it *does* generate such instructions, maybe we should just skip MMX
> for regular PCs. It's not like there are a lot of Pentium IIIs around
Even PIIIs have SSE.
I'm 99% sure that the pshufw instruction is identical whether it comes
from SSE or 3DNow.
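
For reference, here is a rough scalar model of what pshufw does and how
the three helpers from the quoted patch description could collapse onto
it. This is only a sketch: the model_* names and the SHUF macro are made
up for illustration, and the selectors assume the unpacked pixel layout
00AA 00RR 00GG 00BB with alpha in the highest 16-bit word (and in the
lowest word for the _rev variant). The real code would use
_mm_shuffle_pi16 with _MM_SHUFFLE instead.

```c
#include <stdint.h>

/* Same bit encoding as _MM_SHUFFLE from <xmmintrin.h>; the name SHUF is
 * hypothetical and only used in this sketch. */
#define SHUF(w3, w2, w1, w0) (((w3) << 6) | ((w2) << 4) | ((w1) << 2) | (w0))

/* Scalar model of pshufw: destination word i is the source word selected
 * by bits [2i+1:2i] of the 8-bit immediate. */
static uint64_t
model_pshufw (uint64_t src, unsigned imm)
{
    uint64_t dst = 0;
    int i;

    for (i = 0; i < 4; i++)
    {
        unsigned sel = (imm >> (2 * i)) & 3;
        uint64_t word = (src >> (16 * sel)) & 0xffff;

        dst |= word << (16 * i);
    }

    return dst;
}

/* Broadcast the alpha word (word 3) to all four words. */
static uint64_t
model_expand_alpha (uint64_t pixel)
{
    return model_pshufw (pixel, SHUF (3, 3, 3, 3));
}

/* Same, but for pixels whose alpha sits in word 0. */
static uint64_t
model_expand_alpha_rev (uint64_t pixel)
{
    return model_pshufw (pixel, SHUF (0, 0, 0, 0));
}

/* Swap words 0 and 2 (the red and blue channels); keep words 1 and 3. */
static uint64_t
model_invert_colors (uint64_t pixel)
{
    return model_pshufw (pixel, SHUF (3, 0, 1, 2));
}
```

Since the model only depends on the immediate and the four source
words, it applies equally to the SSE and the Extended 3DNow! encoding of
the instruction.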
If we care, we could add a configure flag that enables the use of MMX
Extension instructions at build time. This would allow CPUs with MMX
but without 3DNow/SSE to still use the MMX fast paths. But like you
say, there aren't many of these CPUs left.
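
To make the configure flag idea a bit more concrete, a build-time switch
might look roughly like the sketch below. HYPOTHETICAL_USE_MMX_EXT and
sketch_expand_alpha are invented names, not anything that exists in
pixman or configure.ac today, and both branches are written as scalar C
on a uint64_t so the snippet stands alone. In pixman-mmx.c the first
branch would be a single shuffle and the second the existing shift/or
sequence on __m64.

```c
#include <stdint.h>

/* Hypothetical macro, imagined as being defined by a configure option.
 * Left undefined here, so the plain-MMX style branch is compiled. */
/* #define HYPOTHETICAL_USE_MMX_EXT 1 */

/* Broadcast the top 16-bit word (alpha) of an unpacked pixel to all four
 * 16-bit words. Both branches produce the same result. */
static uint64_t
sketch_expand_alpha (uint64_t pixel)
{
#ifdef HYPOTHETICAL_USE_MMX_EXT
    /* Stand-in for one pshufw with selector 0xff: take word 3 and
     * replicate it into every word. */
    uint64_t a = (pixel >> 48) & 0xffff;

    return a * 0x0001000100010001ULL;
#else
    /* Plain MMX style: only shifts and ORs, no extension instructions. */
    uint64_t t = pixel >> 48;

    t |= t << 16;
    t |= t << 32;

    return t;
#endif
}
```

With the flag off, the build would contain no pshufw at all, at the cost
of a few more instructions per call.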