[Pixman] [PATCH 1/4] pixman-fast-path: Add over_n_8888 fast path (disabled)
ppaalanen at gmail.com
Wed Aug 26 01:46:49 PDT 2015
On Tue, 25 Aug 2015 15:45:48 +0300
Oded Gabbay <oded.gabbay at gmail.com> wrote:
> On Mon, Aug 24, 2015 at 7:04 PM, Ben Avison <bavison at riscosopen.org> wrote:
> > On Mon, 24 Aug 2015 15:20:22 +0100, Oded Gabbay <oded.gabbay at gmail.com> wrote:
> >> I tested the patch on POWER8, ppc64le.
> >> make check passes, but when I benchmarked the performance using
> >> lowlevel-blt-bench over_n_8888, I got far worse results than without
> >> this patch (see numbers below).
> >> Apparently, the new C fast-path takes precedence over some vmx combine
> >> fast-paths, thus making performance worse instead of better.
> > If that's true, it wouldn't be the first time that this issue has arisen.
> > Pixman is designed to scan all fast paths (i.e. routines which take
> > source, mask and destination images and perform the whole operation
> > themselves) irrespective of how well tuned they are for a given platform,
> > before it starts looking at iter routines (each of which reads or writes
> > only one of source, mask or destination).
> > Previously, in response to my attempt to work around a case where this
> > was happening, Søren wrote:
> >> [...] the longstanding problem that pixman sometimes won't
> >> pick the fastest code for some operations. In this case the general
> >> iter based code will be faster then the C code in pixman-fast-path.c
> >> because the iter code will use assembly fetchers.
> >> As a result you end up with a bunch of partial reimplementations of
> >> general_composite_rect() inside pixman-arm-simd.c.
> >> Maybe we need to admit failure and make general_composite_rect()
> >> available for other implementations to use. Essentially we would
> >> officially provide a way for implementations to say: My iterators are
> >> faster than pixman-fast-path.c for these specific operations, so just
> >> go directly to general_composite_rect().
> >> It's definitely not a pleasant thing to do, but given that nobody is
> >> likely to have the time/skill combination required to do the necessary
> >> redesign of pixman's fast path system, it might still be preferable to
> >> to do this instead of duplicating the code like these patches do.
> >> With a setup like that, we could also fix the same issue for the
> >> bilinear SSSE3 fetchers and possibly other cases.
> > (ref http://lists.freedesktop.org/archives/pixman/2014-October/003457.html)
> > I can't say that any cleaner solution has occurred to me since then.
> I think the more immediate solution, as Soren have suggested on IRC,
> is for me to implement the equivalent fast-path in VMX.
> I see that it is already implemented in mmx, sse2, mips-dspr2 and arm-neon.
> From looking at the C code, I'm guessing that it is fairly simple to implement.
What if we just dropped the over_n_8888 C fast path? And all the rest
of new C fast path patches.
It's clearly controversial to add C fast paths, because it affects all
targets that don't have an asm fast path for the same, and we cannot
tell by just review whether it is going to be for better or (much)
I also find it hard to see what the affected targets would be.
Rpi should not be affected if we drop the C fast path patch,
there is still the asm fast path patch. Right?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Size: 811 bytes
Desc: OpenPGP digital signature
More information about the Pixman