[Pixman] Selectively disabling fast paths
Ben Avison
bavison at riscosopen.org
Fri Aug 28 11:00:51 PDT 2015
On Fri, 28 Aug 2015 13:43:06 +0100, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Thu, 27 Aug 2015 17:20:26 +0100
> "Ben Avison" <bavison at riscosopen.org> wrote:
>> One thing it wouldn't be able to detect, though, would be where the fetch/
>> combine/writeback iterators are faster than fast paths for the *same*
>> implementation level - such as with the ARMv6 nearest-scaled patches I
>> was revisiting recently. In that specific case, it turned out that my
>> original solution of bespoke C wrappers for the fetchers turned out to be
>> even faster - but we don't have any way at present of detecting if there
>> are other cases where we would be better off deleting the fast paths and
>> letting the iterators do the work instead.
>
> Sorry, but I'm a bit hazy on the details here. Based on the
> discussions, I have developed the following mental model:
>
> 1. asm fast paths (whole operation)
> 2. C fast paths (whole operation)
> 3. _general_composite_rect (fetch/combine/writeback; iterators)
> - asm implementation or
> - C implementation for each
Yes, that's pretty much it, except that some platforms have multiple
levels of asm fast paths, and some or all of those will be enabled
depending upon the CPU features detected via a combination of compile-
time and runtime tests.
Basically, there is a chain of pixman_implementation_t structs, in
decreasing priority order (that's why you'll see the name
"implementation" used to refer to a set of routines tuned for a
particular instruction set). Each implementation contains a table of
pixman_fast_path_t structs (which we refer to as "fast paths"), a table
of pixman_iter_info_t structs (the fetcher and writeback iterators), an
array of combiner routines, and a few other bits and pieces.
For example, on an ARMv7 platform, you'll normally find the following
implementations are enabled, in decreasing priority order:

  pixman-noop.c       (can't be disabled)
  pixman-arm-neon.c   (unless PIXMAN_DISABLE contains "arm-neon")
  pixman-arm-simd.c   (unless PIXMAN_DISABLE contains "arm-simd")
  pixman-fast-path.c  (unless PIXMAN_DISABLE contains "fast")
  pixman-general.c    (can't be disabled; also references last-resort
                      functions in pixman-bits-image.c /
                      pixman-*-gradient.c / pixman-combine32.c /
                      pixman-combine-float.c)

When you call pixman_image_composite(), it scans through the fast paths
from each implementation in order, looking for one which matches the
criteria in the fast path tables. pixman-general.c contains a single fast
path, which is universally applicable, and therefore handles anything
that wasn't caught by higher implementations - and it uses the function
general_composite_rect(). In turn, general_composite_rect() scans the
implementations in order, looking for fetchers, combiners and writeback
functions which will allow it to perform the requested operation line by
line, stage by stage.
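
To make the lookup order concrete, here is a rough sketch of the scan in C.
Everything in it is a simplified stand-in, not pixman's real code: the
sketch_* types and names are hypothetical, the operation strings and the
table contents are made up for illustration, the noop implementation is
left out, and a "*" entry plays the role of the universally applicable
fast path in pixman-general.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for pixman_fast_path_t and
 * pixman_implementation_t. Real fast paths match on operator, formats and
 * flags; here a single string stands in for all of that. */
typedef struct sketch_fast_path {
    const char *op;                        /* NULL terminates the table */
} sketch_fast_path_t;

typedef struct sketch_impl {
    const char *name;
    const sketch_fast_path_t *fast_paths;  /* terminated by op == NULL */
    const struct sketch_impl *fallback;    /* next, lower-priority entry */
} sketch_impl_t;

/* Illustrative tables only; the real tables are far larger. */
static const sketch_fast_path_t sketch_neon_paths[]    = { { "over_8888_8888" }, { NULL } };
static const sketch_fast_path_t sketch_simd_paths[]    = { { "src_8888_0565" }, { NULL } };
static const sketch_fast_path_t sketch_fast_paths[]    = { { "add_8_8" }, { NULL } };
static const sketch_fast_path_t sketch_general_paths[] = { { "*" }, { NULL } };

/* The chain, built from lowest to highest priority. */
static const sketch_impl_t sketch_general = { "general",  sketch_general_paths, NULL };
static const sketch_impl_t sketch_fast    = { "fast",     sketch_fast_paths,    &sketch_general };
static const sketch_impl_t sketch_simd    = { "arm-simd", sketch_simd_paths,    &sketch_fast };
static const sketch_impl_t sketch_neon    = { "arm-neon", sketch_neon_paths,    &sketch_simd };

/* Walk the chain in priority order and return the name of the first
 * implementation whose fast path table has a matching entry. */
static const char *
sketch_lookup(const sketch_impl_t *top, const char *op)
{
    const sketch_impl_t *imp;

    assert(top != NULL && op != NULL);
    for (imp = top; imp != NULL; imp = imp->fallback) {
        const sketch_fast_path_t *fp;
        for (fp = imp->fast_paths; fp->op != NULL; fp++) {
            if (strcmp(fp->op, "*") == 0 || strcmp(fp->op, op) == 0)
                return imp->name;
        }
    }
    return NULL; /* unreachable while the chain ends in a catch-all */
}
```

With this chain, sketch_lookup(&sketch_neon, "add_8_8") lands on "fast",
and any operation that no table lists falls through to "general".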
When you set PIXMAN_DISABLE, you knock out the whole of an
implementation, both its fast paths and its iterators/combiners.
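
As a minimal sketch of the kind of check involved, the function below tests
whether a name appears as a whole word in a disable string. It assumes the
names are separated by spaces; sketch_disabled() is a hypothetical helper
written for this message, not pixman's own routine, and it takes the string
as a parameter rather than reading the environment so it can be tried in
isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `name` appears as a complete, space-separated token in
 * `env`, and 0 otherwise. A NULL `env` means nothing is disabled. */
static int
sketch_disabled(const char *env, const char *name)
{
    size_t name_len;

    assert(name != NULL);
    if (env == NULL)
        return 0;

    name_len = strlen(name);
    while (*env != '\0') {
        size_t tok_len = strcspn(env, " ");   /* length of current token */

        if (tok_len == name_len && strncmp(env, name, name_len) == 0)
            return 1;

        env += tok_len;                       /* skip the token ...        */
        env += strspn(env, " ");              /* ... and any spaces after  */
    }
    return 0;
}
```

A caller would pass in something like getenv("PIXMAN_DISABLE") (from
<stdlib.h>). Matching whole tokens means "arm" does not accidentally
disable "arm-neon".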
The point I was trying to make (badly, it seems) is that iterators/
combiners are relatively widely applicable, and are chosen at lower
priority than all the fast paths. But because they were developed
relatively recently, many of the fast paths have never had their
performance compared against the iterators/combiners, so we don't know
whether those fast paths are still warranted.
> Maybe we could fix that by introducing a PIXMAN_DISABLE=wholeop or
> similar, that would disable all whole operation fast paths, but leave
> the iterator paths untouched?
>
> Should I do that, would it be worth it?
It could probably be done in _pixman_implementation_create(), as long as
_pixman_implementation_create_general() explicitly initialises
imp->fast_paths so that at least general_composite_rect() always ends up
on the chain of fast paths.
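
Roughly what I have in mind, as an untested sketch with hypothetical
names: the sketch_* types and functions below are simplified stand-ins
rather than the real pixman signatures, and the wholeop_disabled argument
stands in for however the "wholeop" setting would end up being detected.
The generic create step swaps in an empty fast path table, and the general
implementation then puts its own table back explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the pixman structs. */
typedef struct sketch_fast_path {
    const char *op;                          /* NULL terminates the table */
} sketch_fast_path_t;

typedef struct sketch_impl {
    const sketch_fast_path_t *fast_paths;
    const char *iters;                       /* stands in for iterators/combiners */
    struct sketch_impl *fallback;
} sketch_impl_t;

/* An empty table: scanning it never matches anything. */
static const sketch_fast_path_t sketch_empty_fast_paths[] = { { NULL } };

/* Illustrative tables; "*" is the catch-all handled line by line. */
static const sketch_fast_path_t sketch_c_fast_paths[]       = { { "add_8_8" }, { NULL } };
static const sketch_fast_path_t sketch_general_fast_paths[] = { { "*" }, { NULL } };

/* Stand-in for the generic create step: with wholeop disabled, the
 * implementation keeps its place in the chain (and its iterators) but
 * contributes no whole-operation fast paths. */
static void
sketch_impl_init(sketch_impl_t *imp, sketch_impl_t *fallback,
                 const sketch_fast_path_t *fast_paths, const char *iters,
                 int wholeop_disabled)
{
    assert(imp != NULL && fast_paths != NULL);
    imp->fallback = fallback;
    imp->iters = iters;                      /* untouched by wholeop */
    imp->fast_paths = wholeop_disabled ? sketch_empty_fast_paths : fast_paths;
}

/* Stand-in for the general implementation's create step: it explicitly
 * reinstates its own table, so the catch-all entry is always reachable. */
static void
sketch_general_init(sketch_impl_t *imp, int wholeop_disabled)
{
    sketch_impl_init(imp, NULL, sketch_general_fast_paths, "general iters",
                     wholeop_disabled);
    imp->fast_paths = sketch_general_fast_paths;
}
```

The explicit assignment in sketch_general_init() is the important part:
without it, disabling every table would leave nothing at the end of the
chain to pick up the operation.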
Ben