[Pixman] [PATCH] mmx: Use MMX2 intrinsics from xmmintrin.h directly.
siarhei.siamashka at gmail.com
Sun Oct 25 17:10:39 PDT 2015
On Sun, 25 Oct 2015 13:13:09 -0700
Matt Turner <mattst88 at gmail.com> wrote:
> On Sun, Oct 11, 2015 at 8:59 PM, Matt Turner <mattst88 at gmail.com> wrote:
> > We had lots of hacks to handle the inability to include xmmintrin.h
> > without compiling with -msse (lest SSE instructions be used in
> > pixman-mmx.c). Some recent version of gcc relaxed this restriction.
> > Change configure.ac to test that xmmintrin.h can be included and that we
> > can use some intrinsics from it, and remove the work-around code from
> > pixman-mmx.c.
> > Evidently allows gcc 4.9.3 to optimize better as well:
> >    text    data     bss     dec     hex filename
> >  657078   30848     680  688606   a81de libpixman-1.so.0.33.3 before
> >  656710   30848     680  688238   a806e libpixman-1.so.0.33.3 after
> > Signed-off-by: Matt Turner <mattst88 at gmail.com>
> > ---
> Ugh. This is apparently not sufficient...
> GCC allows you to *include* xmmintrin.h without enabling SSE, but it
> still doesn't allow you to use any of the functions:
> conftest.c: In function ‘main’:
> error: inlining failed in call to always_inline ‘_mm_mulhi_pu16’:
> target specific option mismatch
> _mm_mulhi_pu16 (__m64 __A, __m64 __B)
> conftest.c:12:7: error: called from here
> w = _mm_mulhi_pu16(w, w);
Oh, looks like the restriction used to be relaxed for a while, but then
GCC 4.9 started to be strict again:
> I'm not sure what to do except to revert.
The real problem is that GCC does not provide a separate option for
MMX2 (a common subset of 3DNOW and SSE). We usually solve compiler
problems by reporting bugs to compiler developers. This particular
case had not been handled according to the usual rule, and now
we have a nice practical demonstration of the consequences ;-)
BTW, we can still report a bug to GCC. Better late than never.
> The MMX but no SSE case is important, at least it was in the past
> because of OLPC's XO-1.
I'm not sure how many OLPC XO-1 laptops still remain in real use
in the hands of real people:
> Suggestions besides reverting this?
Because OLPC XO-1 is using the AMD Geode processor, we could probably
treat the code in pixman-mmx.c as 3dnow optimizations on x86 hardware?
Another option is to start using assembly instead of intrinsics.
Unless a miracle happens and somebody decides to pay for this job,
we definitely don't have resources to do a high quality assembly
implementation for MMX/MMX2. But we can still take the assembly
output of GCC and tweak it a bit. This is ugly and not very
maintainable though. Been there, done that with ARMv6.
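
If anyone does go down the assembly route, a plain C model of the
intrinsic from the failing configure test above might be handy for
checking the tweaked code. The sketch below only models what
_mm_mulhi_pu16 (the PMULHUW instruction) computes: for each of the four
unsigned 16-bit lanes, the high 16 bits of the 32-bit product. The names
ref_mulhi_pu16 and ref_mulhi_pu16_selfcheck are made up for illustration
and are not part of pixman.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of _mm_mulhi_pu16 / PMULHUW (hypothetical helper, not
 * pixman code): multiply each pair of unsigned 16-bit lanes and keep
 * the high 16 bits of the 32-bit product. */
static void
ref_mulhi_pu16 (const uint16_t a[4], const uint16_t b[4], uint16_t out[4])
{
    int i;

    for (i = 0; i < 4; i++)
    {
        uint32_t product = (uint32_t) a[i] * (uint32_t) b[i];

        out[i] = (uint16_t) (product >> 16);
    }
}

/* A few hand-computed lanes to sanity-check the model itself. */
static void
ref_mulhi_pu16_selfcheck (void)
{
    const uint16_t a[4] = { 0xffff, 0x8000, 0x0100, 0x0001 };
    const uint16_t b[4] = { 0xffff, 0x0002, 0x0100, 0x0001 };
    uint16_t r[4];

    ref_mulhi_pu16 (a, b, r);

    assert (r[0] == 0xfffe);  /* 0xffff * 0xffff = 0xfffe0001 */
    assert (r[1] == 0x0001);  /* 0x8000 * 0x0002 = 0x00010000 */
    assert (r[2] == 0x0001);  /* 0x0100 * 0x0100 = 0x00010000 */
    assert (r[3] == 0x0000);  /* 0x0001 * 0x0001 = 0x00000001 */
}
```

Calling ref_mulhi_pu16_selfcheck () from a small test program and then
comparing ref_mulhi_pu16 () against the assembly output on random inputs
would be one way to catch mistakes. It needs neither -mmmx nor -msse, so
it builds the same on any target.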
Or we could simply do nothing and finally retire MMX support on x86.
If OLPC XO-1 users still do exist, they can always contact us.