[Pixman] [PATCH 1/4] bits: Implement PAD support in the simple fetcher
Ben Avison
bavison at riscosopen.org
Wed Jan 30 15:44:29 PST 2013
On Wed, 30 Jan 2013 19:34:58 -0000, Søren Sandmann <sandmann at cs.au.dk> wrote:
> It's simply that the speedup you got:
>
> SNB i5-2500s: firefox-chalkboard 25.9s -> 19.6s: 1.32x speedup
>
> is rather large for a change that doesn't even introduce a fast path or
> use SIMD. That suggests either that something dumb is going on in pixman
> or in the benchmark, or that we want SIMD variants of this operation.

By chance, I happened to be looking at this today. I'm finding a whopping
15% of the runtime for the whole of cairo-perf-trace is in a single type
of composite operation in firefox-chalkboard, and it's an over_8888_8888
which can't be matched by STD_FAST_PATH, SIMPLE_NEAREST_FAST_PATH or
SIMPLE_BILINEAR_FAST_PATH. STD_FAST_PATH fails because
FAST_PATH_SAMPLES_COVER_CLIP_NEAREST isn't set in the source flags, and
SIMPLE_NEAREST_FAST_PATH fails because FAST_PATH_SCALE_TRANSFORM isn't
set. (If you're curious, the precise source flags are 0x207ca77.)
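
To illustrate why a single missing flag rules out a fast path: a path is only eligible when every flag it requires is present in the image's flags. The sketch below shows that subset test. The EX_ names and bit positions are hypothetical, chosen for illustration; they are not pixman's real flag values, so they cannot be used to decode 0x207ca77.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.
 * These are NOT the values pixman uses for its FAST_PATH_* flags. */
#define EX_FLAG_SAMPLES_COVER_CLIP_NEAREST (1u << 0)
#define EX_FLAG_SCALE_TRANSFORM            (1u << 1)
#define EX_FLAG_NARROW_FORMAT              (1u << 2)

/* Return nonzero only if every bit in required_flags is also set in
 * image_flags. Extra bits in image_flags are harmless; any missing
 * required bit rejects the path. */
static int
ex_flags_match (uint32_t image_flags, uint32_t required_flags)
{
    return (image_flags & required_flags) == required_flags;
}
```

With these example bits, an image carrying only EX_FLAG_NARROW_FORMAT fails a path that also requires EX_FLAG_SAMPLES_COVER_CLIP_NEAREST, which is the same kind of rejection described above.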
Obviously this means that in this case, we're not getting the benefits of
any platform-specific fast paths. Perhaps what we need is a "pad"
equivalent of fast_composite_tiled_repeat()?
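
As a rough sketch of what PAD repeat means for one scanline (not pixman code, and not how fast_composite_tiled_repeat() is written): samples left of the image take the first pixel, samples right of it take the last pixel, and the in-range middle is an ordinary copy. The function and parameter names are hypothetical, and the sketch assumes 32-bit pixels, integer sample positions and an untransformed source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Clamp v to the inclusive range [lo, hi]. */
static int
ex_clamp (int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Fill dst[0 .. width-1] with the pixels of one source row sampled at
 * integer positions x, x+1, ..., x+width-1 under PAD repeat.
 * Assumes src_width > 0 and that x is small enough in magnitude for
 * -x and src_width - x not to overflow an int. */
static void
ex_fetch_row_pad (uint32_t       *dst,
                  int             width,
                  const uint32_t *src_row,
                  int             src_width,
                  int             x)
{
    int i = 0;
    int left_end;   /* first dst index whose sample position is >= 0 */
    int inside_end; /* first dst index whose position is >= src_width */

    assert (src_width > 0);

    left_end = ex_clamp (-x, 0, width);
    inside_end = ex_clamp (src_width - x, left_end, width);

    /* Left of the image: replicate the first pixel. */
    for (; i < left_end; i++)
        dst[i] = src_row[0];

    /* Inside the image: a plain copy. */
    if (inside_end > i)
    {
        memcpy (dst + i, src_row + x + i,
                (size_t) (inside_end - i) * sizeof (*dst));
        i = inside_end;
    }

    /* Right of the image: replicate the last pixel. */
    for (; i < width; i++)
        dst[i] = src_row[src_width - 1];
}
```

Once a padded row like this exists in a temporary buffer, it no longer needs any repeat handling, so in principle it could be handed to an existing non-repeating compositing routine.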
Ben