[Pixman] [PATCH 2/4] Added fast path for "pad" type repeats

Ben Avison bavison at riscosopen.org
Wed Feb 6 05:23:08 PST 2013

>> That's a speedup of 3.86x.
> Impressive. On IVB, I'm only seeing an improvement of 42.6 -> 34.8s
> with the full firefox-chalkboard trace, which is marginally better than
> just implementing PAD support for the simple fetcher. How does that
> compare with the tiled approach?

I suspect it depends upon the source image size and how much padding
there is. With tiled repeats of course, you still have to fetch the
source buffer again for every tile. For pad repeats, only the leftmost
or rightmost source pixel needs to be fetched for each row, and the rest
of the repeat region then only needs to access the destination buffer.
With large amounts of horizontal padding, as the trimmed
firefox-chalkboard trace uses, that's got to help pipelining.
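
To illustrate the difference in memory traffic, here is a minimal sketch.
It is not code from the patch: fetch_row_pad() and fetch_row_tile() are
made-up names, the pixels are assumed to be 32bpp, and it does a plain
copy rather than a real composite operation. src_x is the source
coordinate that lines up with dst[0] and may be negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not pixman API): fill one destination row from a
 * source row using "pad" repeat. Outside the source extents, the edge
 * pixel is read once and then only stores are issued. */
static void
fetch_row_pad (uint32_t *dst, int dst_width,
               const uint32_t *src, int src_width, int src_x)
{
    int i = 0;

    assert (src_width > 0);

    /* Left padding: a single source read, then stores only. */
    if (src_x < 0 && i < dst_width)
    {
        uint32_t left = src[0];
        while (i < dst_width && src_x + i < 0)
            dst[i++] = left;
    }

    /* Interior: ordinary one-to-one fetch from the source row. */
    while (i < dst_width && src_x + i < src_width)
    {
        dst[i] = src[src_x + i];
        i++;
    }

    /* Right padding: again a single source read. */
    if (i < dst_width)
    {
        uint32_t right = src[src_width - 1];
        while (i < dst_width)
            dst[i++] = right;
    }
}

/* Hypothetical helper for comparison: tiled ("normal") repeat has to
 * read the source for every destination pixel. */
static void
fetch_row_tile (uint32_t *dst, int dst_width,
                const uint32_t *src, int src_width, int src_x)
{
    assert (src_width > 0);

    for (int i = 0; i < dst_width; i++)
    {
        int x = (src_x + i) % src_width;
        if (x < 0)
            x += src_width;     /* C's % can yield a negative result */
        dst[i] = src[x];
    }
}
```

In the pad version the source is touched once per padded side of each
row, whereas the tiled version reads it for every destination pixel, so
the wider the padding, the bigger the gap between the two.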

Maybe the full firefox-chalkboard trace uses a lower proportion of pad
repeats with large horizontal extents?

> To handle the fallback case, I think we probably want another
> implementation level between general and fast (tiled?) to clarify that
> these kernels behave differently and that only the general routine would
> match otherwise. Hence making the fallback to general clearer, and
> possibly even export the general_composite_rect for use by the tiled
> implementation.

I wasn't really sure what the best approach was here, I admit, since
there's no precedent for restarting the fast path search like that.
Normally, the various format/flags in the fast path table are sufficient
on their own to determine if a fast path is applicable, but I found
additional tests were necessary. (I'm not entirely sure how we get away
without them in the tiled repeat case, though I admit I haven't looked
too closely; maybe it's just a case that hasn't been exercised by the
test suite?)

On the one hand, it would indeed be possible to export
general_composite_rect() directly. At the other extreme,
_pixman_implementation_lookup_composite() could be made restartable,
though the fast path function pointer isn't, in general, enough to
uniquely identify how far through the fast path tables you were, so we'd
need some way of passing the "const pixman_fast_path_t *info" from the
original lookup to the restarted lookup. This felt a bit messy though,
so I compromised between the two: placed fast_composite_pad_repeat() at
the very end of the pixman-fast-path.c table, where in practice there
aren't any other fast paths before general_composite_rect(). I'm open to
whatever the consensus view is on how this should be handled.
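
For reference, a restartable lookup might look roughly like the sketch
below. This is only an illustration under stated assumptions, not
pixman's actual code: the entry type, the flag values, lookup(),
composite() and the "pad_width >= 16" run-time test are all invented for
the example. The point is that the restart is driven by a pointer to the
table entry, not by the function pointer, since the same function can
appear in more than one entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for pixman's fast path machinery. A composite
 * function returns 1 if it handled the operation, 0 if it declines. */
typedef int (*composite_fn) (int pad_width);

typedef struct
{
    uint32_t     required_flags;
    composite_fn func;              /* NULL marks the end of the table */
} fast_path_entry_t;

enum { FLAG_SRC_8888 = 1, FLAG_PAD_REPEAT = 2 };   /* made-up flags */

static int
composite_pad_repeat (int pad_width)
{
    /* An extra run-time test that the flags alone cannot express.
     * The threshold is arbitrary and purely illustrative. */
    return pad_width >= 16;
}

static int
composite_general (int pad_width)
{
    (void) pad_width;
    return 1;                       /* the general path always succeeds */
}

static const fast_path_entry_t table[] =
{
    { FLAG_SRC_8888 | FLAG_PAD_REPEAT, composite_pad_repeat },
    { 0,                               composite_general    },
    { 0,                               NULL                 }
};

/* Return the first entry at or after 'start' whose flags are satisfied. */
static const fast_path_entry_t *
lookup (const fast_path_entry_t *start, uint32_t flags)
{
    for (const fast_path_entry_t *e = start; e->func != NULL; e++)
    {
        if ((e->required_flags & flags) == e->required_flags)
            return e;
    }
    return NULL;
}

/* Try matching entries in order; when one declines, restart the search
 * from the entry after it. Returns the function that did the work. */
static composite_fn
composite (uint32_t flags, int pad_width)
{
    const fast_path_entry_t *e = lookup (table, flags);

    while (e != NULL)
    {
        if (e->func (pad_width))
            return e->func;
        e = lookup (e + 1, flags);
    }
    return NULL;
}
```

With the pad repeat entry placed last before the general one, declining
and restarting is equivalent to simply falling through to the general
path, which is the compromise described above.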

