[Pixman] Selectively disabling fast paths
ppaalanen at gmail.com
Tue Sep 1 07:03:31 PDT 2015
On Fri, 28 Aug 2015 19:00:51 +0100
"Ben Avison" <bavison at riscosopen.org> wrote:
> On Fri, 28 Aug 2015 13:43:06 +0100, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> > On Thu, 27 Aug 2015 17:20:26 +0100
> > "Ben Avison" <bavison at riscosopen.org> wrote:
> >> One thing it wouldn't be able to detect, though, would be where the fetch/
> >> combine/writeback iterators are faster than fast paths for the *same*
> >> implementation level - such as with the ARMv6 nearest-scaled patches I
> >> was revisiting recently. In that specific case, my original solution of
> >> bespoke C wrappers for the fetchers turned out to be even faster - but
> >> we don't have any way at present of detecting if there are other cases
> >> where we would be better off deleting the fast paths and letting the
> >> iterators do the work instead.
> > Sorry, but I'm a bit hazy on the details here. Based on the
> > discussions, I have developed the following mental model:
> > 1. asm fast paths (whole operation)
> > 2. C fast paths (whole operation)
> > 3. _general_composite_rect (fetch/combine/writeback; iterators)
> >    - asm implementation or
> >    - C implementation for each
> Yes, that's pretty much it, except that some platforms have multiple
> levels of asm fast paths, and some or all of those will be enabled
> depending upon the CPU features detected via a combination of compile-
> time and runtime tests.
> Basically, there is a chain of pixman_implementation_t structs, in
> decreasing priority order (that's why you'll see the name
> "implementation" used to refer to a set of routines tuned for a
> particular instruction set). Each implementation contains a table of
> pixman_fast_path_t structs (which we refer to as "fast paths") and a
> table of pixman_iter_info_t structs (the fetcher and writeback iterators)
> and an array of combiner routines and a few other bits and pieces.
> For example, on an ARMv7 platform, you'll normally find the following
> implementations are enabled, in decreasing priority order:
>   pixman-noop.c       (can't be disabled)
>   pixman-arm-neon.c   (unless PIXMAN_DISABLE contains "arm-neon")
>   pixman-arm-simd.c   (unless PIXMAN_DISABLE contains "arm-simd")
>   pixman-fast-path.c  (unless PIXMAN_DISABLE contains "fast")
>   pixman-general.c    (can't be disabled; also references last-resort
>                       functions in pixman-bits-image.c /
>                       pixman-*-gradient.c / pixman-combine32.c /
>                       pixman-combine-float.c)
> When you call pixman_image_composite(), it scans through the fast paths
> from each implementation in order, looking for one which matches the
> criteria in the fast path tables. pixman-general.c contains a single fast
> path, which is universally applicable, and therefore handles anything
> that wasn't caught by higher implementations - and it uses the function
> general_composite_rect(). In turn, general_composite_rect scans the
> implementations in order, looking for fetchers, combiners and writeback
> functions which will allow it to perform the requested operation line by
> line, stage by stage.
> When you set PIXMAN_DISABLE, you knock out the whole of an
> implementation, both its fast paths and its iterators/combiners.
> The point I was trying to make (badly, it seems) is that iterators/
> combiners are relatively widely applicable, and are chosen at lower
> priority than all the fast paths, but because they were developed
> relatively recently, many of the fast paths have never had their
> performance compared against the iterators/combiners to see if their
> inclusion is perhaps no longer warranted since the iterators/combiners
> were added.
Thank you for the excellent explanation. I'm going to bookmark this, in
case anyone else asks. :-)
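
For reference, the fast path lookup described above can be sketched roughly
as follows. This is only a minimal illustration, not pixman's actual code:
the sketch_* types, the string-keyed operation names and the "*" wildcard
are all assumptions made up for the example, and the real
pixman_fast_path_t entries match on operator, formats and flags rather
than on a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for pixman_fast_path_t and
 * pixman_implementation_t. Not the real structs. */
typedef struct {
    const char *op;    /* operation handled; "*" matches anything */
    const char *name;  /* label, for illustration only */
} sketch_fast_path_t;

typedef struct sketch_impl {
    const char               *name;
    const sketch_fast_path_t *fast_paths;  /* terminated by op == NULL */
    const struct sketch_impl *fallback;    /* next, lower-priority level */
} sketch_impl_t;

static const sketch_fast_path_t neon_paths[] = {
    { "over_8888_8888", "neon over" }, { NULL, NULL }
};
static const sketch_fast_path_t c_paths[] = {
    { "over_8888_8888", "C over" }, { "src_8888_0565", "C src" },
    { NULL, NULL }
};
/* The general level has one universally applicable entry. */
static const sketch_fast_path_t general_paths[] = {
    { "*", "general_composite_rect" }, { NULL, NULL }
};

/* Chain in decreasing priority order: arm-neon -> fast -> general. */
static const sketch_impl_t general_impl = { "general", general_paths, NULL };
static const sketch_impl_t fast_impl = { "fast", c_paths, &general_impl };
static const sketch_impl_t neon_impl = { "arm-neon", neon_paths, &fast_impl };

/* Walk the chain from the top and return the first matching fast path. */
const char *sketch_lookup(const sketch_impl_t *top, const char *op)
{
    const sketch_impl_t *imp;
    const sketch_fast_path_t *fp;

    assert(op != NULL);
    for (imp = top; imp != NULL; imp = imp->fallback)
        for (fp = imp->fast_paths; fp->op != NULL; fp++)
            if (strcmp(fp->op, "*") == 0 || strcmp(fp->op, op) == 0)
                return fp->name;
    return NULL;  /* unreachable while the general level is on the chain */
}
```

Starting the walk at &fast_impl instead of &neon_impl mimics the effect of
PIXMAN_DISABLE=arm-neon in this toy model: the NEON entry is never seen
and the same operation falls through to the C fast path.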
> > Maybe we could fix that by introducing a PIXMAN_DISABLE=wholeop or
> > similar, that would disable all whole operation fast paths, but leave
> > the iterator paths untouched?
> > Should I do that, would it be worth it?
> It could probably be done in _pixman_implementation_create(), as long as
> _pixman_implementation_create_general() explicitly initialises
> imp->fast_paths so that at least general_composite_rect() always ends up
> on the chain of fast paths.
Ok, I'll keep that in mind in case we want to test things.
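
A rough sketch of what that could look like, purely as an illustration
and not a patch: "wholeop" is only the name floated above, the wholeop_*
types and functions are hypothetical stand-ins for the real create
functions, and I am assuming PIXMAN_DISABLE is a space-separated list of
whole-word tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real structs hold much more. */
typedef struct { const char *op; } wholeop_fast_path_t;

typedef struct {
    const char                *name;
    const wholeop_fast_path_t *fast_paths;  /* terminated by op == NULL */
    int                        has_iterators;
} wholeop_impl_t;

/* An empty table: only the terminator. */
static const wholeop_fast_path_t wholeop_empty_paths[] = { { NULL } };

/* Return 1 if 'name' appears as a whole token in the space-separated
 * list 'env' (assumed format of PIXMAN_DISABLE), 0 otherwise. */
static int wholeop_disabled(const char *env, const char *name)
{
    size_t len;
    const char *p = env;

    assert(name != NULL && *name != '\0');
    if (env == NULL)
        return 0;
    len = strlen(name);
    while (*p != '\0') {
        const char *end;
        while (*p == ' ')
            p++;
        end = p;
        while (*end != '\0' && *end != ' ')
            end++;
        if ((size_t)(end - p) == len && strncmp(p, name, len) == 0)
            return 1;
        p = end;
    }
    return 0;
}

/* Stand-in for the common create function: with "wholeop" set, drop the
 * fast path table but leave the iterators alone. */
void wholeop_impl_init(wholeop_impl_t *imp, const char *name,
                       const wholeop_fast_path_t *fast_paths,
                       const char *disable_env)
{
    imp->name = name;
    imp->has_iterators = 1;
    imp->fast_paths = wholeop_disabled(disable_env, "wholeop")
                          ? wholeop_empty_paths : fast_paths;
}

/* Stand-in for the general create function: it sets fast_paths
 * explicitly, so the universal entry always stays on the chain. */
void wholeop_general_init(wholeop_impl_t *imp,
                          const wholeop_fast_path_t *general_paths,
                          const char *disable_env)
{
    wholeop_impl_init(imp, "general", general_paths, disable_env);
    imp->fast_paths = general_paths;
}
```

With disable_env set to "wholeop", every level except the general one
ends up with an empty fast path table, so every operation would land in
the general path and be served by whichever iterators and combiners are
still enabled.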