[Pixman] MIPS over_n_8_8888, over_n_8_0565 and bilinear over_8888_8_8888 fast paths.

Tue May 8 11:11:58 PDT 2012

Hi Siarhei,

I did notice in my tests for the previous set of patches that MIPS support was broken after recent pixman commits.
To build successfully, I had to add --disable-loongson to the autogen line. With that, everything compiles fine.
The problem is in the MIPS Loongson support in the pixman-mmx module. And you are right, it should be fixed.

Concerning the composite test (part of make check), it revealed a hidden bug in the pixman_composite_over_n_8888_8888_ca_asm_mips fast path.
The registers containing srca and the loaded mask values got mixed up, so instead of doing:
ma = ~ma;
the src value was inverted, which is wrong. I have fixed this and will push a new patch that fixes this regression.
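
For reference, the intended per-pixel arithmetic of a solid-source OVER with a component-alpha mask can be sketched in plain C roughly as below. This is only an illustration, not the DSPr2 assembly and not a copy of pixman's macros: the helper names (mul_un8, mul_un8x4_un8x4, mul_un8x4_un8, add_un8x4_sat, over_n_8888_8888_ca_pixel) are hypothetical, and the rounding in mul_un8 is assumed to be the usual divide-by-255 approximation. The point is only that the inversion applies to the mask scaled by srca, not to the source.

```c
#include <stdint.h>

/* Multiply two 8-bit normalized values with rounding (approximates a*b/255). */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Channel-wise multiply of two packed 4x8-bit values. */
static uint32_t mul_un8x4_un8x4(uint32_t x, uint32_t a)
{
    uint32_t r = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t xc = (uint8_t)(x >> shift);
        uint8_t ac = (uint8_t)(a >> shift);
        r |= (uint32_t)mul_un8(xc, ac) << shift;
    }
    return r;
}

/* Multiply every channel of a packed 4x8-bit value by one 8-bit value. */
static uint32_t mul_un8x4_un8(uint32_t x, uint8_t a)
{
    uint32_t r = 0;
    for (int shift = 0; shift < 32; shift += 8)
        r |= (uint32_t)mul_un8((uint8_t)(x >> shift), a) << shift;
    return r;
}

/* Channel-wise saturating add of two packed 4x8-bit values. */
static uint32_t add_un8x4_sat(uint32_t x, uint32_t y)
{
    uint32_t r = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t t = ((x >> shift) & 0xff) + ((y >> shift) & 0xff);
        r |= (t > 0xff ? 0xffu : t) << shift;
    }
    return r;
}

/* One pixel of OVER with a solid source and a per-channel (CA) mask. */
static uint32_t over_n_8888_8888_ca_pixel(uint32_t src, uint32_t mask, uint32_t dst)
{
    uint8_t srca = (uint8_t)(src >> 24);   /* source alpha */

    if (mask == 0)
        return dst;                        /* fully masked out: dst unchanged */

    uint32_t s  = mul_un8x4_un8x4(src, mask);  /* s  = src * mask   */
    uint32_t ma = mul_un8x4_un8(mask, srca);   /* ma = mask * srca  */
    ma = ~ma;                                  /* invert ma, NOT src */

    /* dst = dst * (1 - mask * srca) + src * mask */
    return add_un8x4_sat(mul_un8x4_un8x4(dst, ma), s);
}
```

With the values from the failing test (opaque black source, all-ones mask, opaque white destination, i.e. over_n_8888_8888_ca_pixel(0xff000000, 0xffffffff, 0xffffffff)), this sketch gives 0xff000000, which agrees with the "Expected" row in the test output.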

Thanks,
Nemanja Lukic

-----Original Message-----
From: Siarhei Siamashka [mailto:siarhei.siamashka at gmail.com] 
Sent: Tuesday, May 08, 2012 1:35 AM
To: Lukic, Nemanja
Cc: pixman at lists.freedesktop.org
Subject: Re: [Pixman] MIPS over_n_8_8888, over_n_8_0565 and bilinear over_8888_8_8888 fast paths.

On Thu, May 3, 2012 at 1:03 AM, Nemanja Lukic <nlukic at mips.com> wrote:
> Added optimizations for over_n_8_8888, over_n_8_0565 and bilinear over_8888_8_8888 routines.
> Benchmark results (lowlevel-blt-bench and cairo-perf-trace) on Malta board (@1Ghz) are included in the log message.
> Per previous code review:
>  - When (srca == 0xff && mask == 0xff) 8888 to 0565 conversion is done only once.
> Any comments to these patches are welcome.

Not directly a comment about these patches, but about the MIPS support
in pixman in general. Right now even without applying your patches,
trying to build pixman with CFLAGS="-O2 -march=mips32r2" results in
the following:

$ export CFLAGS="-O2 -march=mips32r2"
$ ./autogen.sh --host=mipsel-unknown-linux-gnu && make

[snip]

checking whether to use Loongson MMI... yes
checking whether to use MMX intrinsics... no
checking whether to use SSE2 intrinsics... no
checking whether to use VMX/Altivec intrinsics... no
checking whether to use ARM SIMD assembler... no
checking whether to use ARM NEON assembler... no
checking whether to use ARM IWMMXT intrinsics... no
checking whether to use MIPS DSPr2 assembler... yes
checking whether to use GNU-style inline assembler... yes

[snip]

  CCLD   libpixman-mips-dspr2.la
  CC     libpixman_loongson_mmi_la-pixman-mmx.lo
Assembler messages:
Warning: A different -march was already specified, is now mips32r2
/tmp/ccZAVipl.s:50: Error: opcode not supported on this processor: mips32r2 (mips32r2) `punpcklbh $f0,$f0,$f2'
/tmp/ccZAVipl.s:60: Error: opcode not supported on this processor: mips32r2 (mips32r2) `punpcklbh $f12,$f12,$f2'
/tmp/ccZAVipl.s:64: Error: opcode not supported on this processor: mips32r2 (mips32r2) `pshufh $f12,$f12,$f6'
/tmp/ccZAVipl.s:68: Error: opcode not supported on this processor: mips32r2 (mips32r2) `pmullh $f0,$f0,$f12'
/tmp/ccZAVipl.s:72: Error: opcode not supported on this processor: mips32r2 (mips32r2) `paddush $f0,$f0,$f10'

I wonder if you have seen this problem in your tests? If yes, then it
would have made sense to escalate it. So that your MIPS Loongson
compatriots can fix the regressions they have unintentionally
introduced ;)

Another problem shows up in the test suite (again with and without
your new patches):

$ make check

PASS: a1-trap-test
PASS: pdf-op-test
PASS: region-test
PASS: region-translate-test
PASS: fetch-test
PASS: oob-test
PASS: trap-crasher
PASS: alpha-loop
PASS: scaling-crash-test
PASS: scaling-helpers-test
PASS: gradient-crash-test
region_contains test passed (checksum=D2BF8C73)
PASS: region-contains-test
PASS: alphamap
PASS: stress-test
composite traps test passed (checksum=E3112106)
PASS: composite-traps-test
blitters test passed (checksum=A364B5BF)
PASS: blitters-test
scaling test passed (checksum=80DF1CB2)
PASS: scaling-test
affine test passed (checksum=1EF2175A)
PASS: affine-test
---- Test 3122474 failed ----
Operator:      OVER CA
Source:        a4, 1x1 R
Mask:          a8b8g8r8, 10x10
Destination:   a8b8g8r8, 1x1

               R     G     B     A         Rounded
Source color:  1.000 1.000 1.000 1.000     0.000 0.000 0.000 1.000
Mask color:    1.000 1.000 1.000 1.000     1.000 1.000 1.000 1.000
Dest. color:   1.000 1.000 1.000 1.000     1.000 1.000 1.000 1.000
Expected:      0.000 0.000 0.000 1.000
Got:               0   255   255   255  [pixel: 0xffffff00]
Min accepted:      0     0     0   254
Max accepted:      1     1     1   256
Test 0x002FA52A failed.
FAIL: composite
=============================================
1 of 19 tests failed
Please report to pixman at lists.freedesktop.org
=============================================
make[2]: *** [check-TESTS] Error 1
make[2]: Leaving directory `/root/pixman/test'
make[1]: *** [check-am] Error 2
make[1]: Leaving directory `/root/pixman/test'
make: *** [check-recursive] Error 1

real	100m39.565s
user	53m23.507s
sys	47m12.957s

Looks like now I'm to blame for not spotting this problem earlier. But
still some fix is needed. MIPS optimizations are going to be one of
the features of the upcoming stable pixman-0.26 release and maybe will
attract some attention. Let's try to make sure that this code is
really good.

I'll post some more comments about the new MIPS patches a bit later.

-- 
Best regards,
Siarhei Siamashka