[Pixman] [PATCH 2/4] Added fast path for "pad" type repeats
Chris Wilson
chris at chris-wilson.co.uk
Wed Feb 6 02:05:08 PST 2013
On Wed, Feb 06, 2013 at 12:39:13AM +0000, Ben Avison wrote:
> Similar in concept to fast_composite_tiled_repeat(), this breaks up any
> unscaled composites, where source/mask areas outside the bitmap grid are
> not clipped, into a series of simpler composites (either bitmap to bitmap
> or solid to bitmap). These simpler composites are usually likely to match
> existing fast path implementations, and so should benefit all platforms.
>
> This produces some significant speedups for some cairo-perf-trace tests.
> For example, timings on ARMv6 (using Siarhei's trimmed traces) are
>
> Before:
> [ # ] backend test min(s) median(s) stddev. count
> [ # ] image: pixman 0.29.3
> [ 0] image t-firefox-chalkboard 35.715 35.736 0.03% 6/6
>
> After:
> [ # ] backend test min(s) median(s) stddev. count
> [ # ] image: pixman 0.29.3
> [ 0] image t-firefox-chalkboard 9.254 9.261 0.15% 6/6
>
> That's a speedup of 3.86x.
Impressive. On IVB (Ivy Bridge), I'm only seeing an improvement of
42.6s -> 34.8s (about 1.22x) with the full firefox-chalkboard trace,
which is marginally better than just implementing PAD support for the
simple fetcher. How does that compare with the tiled approach?
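
For readers comparing the two approaches mentioned above, here is a minimal C sketch of the idea on a single unscaled scanline. It is not the code from the patch and does not use pixman's internal API: pad_span_t, pad_clamp() and split_pad_scanline() are hypothetical names invented for illustration. pad_clamp() shows the per-pixel coordinate clamp a fetcher with PAD support would apply, while split_pad_scanline() shows the decomposition the patch description talks about: at most three spans, where the outer ones repeat a single edge pixel (solid to bitmap) and the middle one is a plain copy (bitmap to bitmap).

```c
#include <assert.h>

/* Hypothetical span descriptor for this sketch; not a pixman type. */
typedef struct {
    int dst_x;     /* offset of the span within the destination scanline */
    int width;     /* number of pixels in the span */
    int src_x;     /* source column: the repeated pixel if solid,
                      otherwise the first column of the copy */
    int is_solid;  /* 1 = one edge pixel repeated, 0 = plain bitmap copy */
} pad_span_t;

/* Per-pixel view of PAD repeat: clamp the coordinate into [0, size - 1]. */
static int pad_clamp(int x, int size)
{
    if (x < 0)
        return 0;
    if (x >= size)
        return size - 1;
    return x;
}

/*
 * Split one unscaled scanline into at most three spans.
 * Destination pixel i samples source column src_x0 + i, which may lie
 * outside [0, src_width - 1]. Returns the number of spans written.
 */
static int split_pad_scanline(int src_x0, int width, int src_width,
                              pad_span_t out[3])
{
    int n = 0;
    int x = 0;  /* current offset within the destination scanline */

    assert(src_width > 0);
    if (width <= 0)
        return 0;

    /* Left of the bitmap: every pixel takes the value of column 0. */
    if (src_x0 < 0) {
        int w = -src_x0;
        if (w > width)
            w = width;
        out[n++] = (pad_span_t){ x, w, 0, 1 };
        x += w;
    }

    /* Inside the bitmap: an ordinary bitmap-to-bitmap copy. */
    if (x < width && src_x0 + x < src_width) {
        int w = src_width - (src_x0 + x);
        if (w > width - x)
            w = width - x;
        out[n++] = (pad_span_t){ x, w, src_x0 + x, 0 };
        x += w;
    }

    /* Right of the bitmap: every pixel takes the last column's value. */
    if (x < width)
        out[n++] = (pad_span_t){ x, width - x, src_width - 1, 1 };

    return n;
}
```

With a 4-pixel-wide source, src_x0 = -3 and a 10-pixel destination scanline, this yields a 3-pixel solid span, a 4-pixel copy and another 3-pixel solid span. Each span could then be handed to whichever existing solid or bitmap fast path matches it, which is where the speedup would come from under this reading of the patch description.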
To handle the fallback case, I think we probably want another
implementation level between general and fast (tiled?) to clarify that
these kernels behave differently and that only the general routine would
match otherwise. That would make the fallback to general clearer, and we
could possibly even export general_composite_rect for use by the tiled
implementation.
> Also added a simple test program to check different repeat types.
This is a nice test and would serve as a good precursor to this patch.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre