[Pixman] Benchmarked: [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

Wed Sep 16 04:25:59 PDT 2015

On Fri,  4 Sep 2015 03:09:20 +0100
Ben Avison <bavison at riscosopen.org> wrote:

> As discussed in
> http://lists.freedesktop.org/archives/pixman/2015-August/003905.html
> 
> the 8 * pixman_fixed_e adjustment which was applied to the transformed
> coordinates is a legacy of rounding errors which used to occur in old
> versions of Pixman, but which no longer apply. For any affine transform,
> you are now guaranteed to get the same result by transforming the upper
> coordinate as though you transform the lower coordinate and add (size-1)
> steps of the increment in source coordinate space. No projective
> transform routines use the COVER_CLIP flags, so they cannot be affected.
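
To make the quoted claim concrete, here is a minimal sketch in a simplified 16.16 fixed-point model. It is not Pixman's actual transform code: `transform_x`, `to_fixed` and `upper_matches` are hypothetical helpers, and the model assumes the products are summed at full precision and rounded once at the end. Under that assumption, stepping by whole source pixels adds an exact multiple of the matrix coefficient, so both ways of computing the upper coordinate agree.

```python
FIXED_ONE = 1 << 16  # 1.0 in 16.16 fixed point (assumed model, not Pixman code)


def to_fixed(value):
    """Convert a float to a 16.16 fixed-point integer."""
    return int(round(value * FIXED_ONE))


def transform_x(row, x, y):
    """Apply the first row (a, b, c) of an affine matrix to the point (x, y).

    All inputs are 16.16 fixed point. Products are accumulated at full
    precision and rounded exactly once, which is the assumption that
    makes the equality below hold.
    """
    a, b, c = row
    acc = a * x + b * y + c * FIXED_ONE
    return (acc + 0x8000) >> 16  # round to nearest, floor shift


def upper_matches(row, x0, y0, width):
    """Compare the two ways of computing the last pixel's source coordinate."""
    a = row[0]
    lower = transform_x(row, x0, y0)
    # Way 1: transform the upper destination coordinate directly.
    direct = transform_x(row, x0 + (width - 1) * FIXED_ONE, y0)
    # Way 2: transform the lower coordinate and add (width - 1) increments.
    stepped = lower + (width - 1) * a
    return direct == stepped


row = (to_fixed(0.7), to_fixed(-0.3), to_fixed(12.25))
print(upper_matches(row, to_fixed(0.5), to_fixed(3.5), 640))
```

The equality holds because (width - 1) whole pixels contribute a multiple of 65536 to the accumulator before the final shift. It says nothing about projective transforms, which the quoted text excludes anyway.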

Hi all,

as we are doing these things not just for cleaning up but on the premise
that there are missed optimization opportunities, I have benchmarked
this patch series.

The series as benchmarked is available at:
https://git.collabora.com/cgit/user/pq/pixman.git/log/?h=cover-benchmark-1

The benchmark points are:

- baseline: "test: Add cover-test v5"

- cleanup: "affine-bench: remove 8e margin from COVER area"
	Includes the 8e extra safety margin removal.

- tight: "pixman-fast-path: Make bilinear cover fetcher use
	COVER_CLIP_TIGHT flag"
	Includes all the COVER_CLIP_BILINEAR related patches from
	Ben.

Note that ssse3_iters[] in pixman-ssse3.c still contains
FAST_PATH_SAMPLES_COVER_CLIP_BILINEAR.

Cairo version is 1.14.2 for the benchmarks, which are run like:
$ CAIRO_TEST_TARGET=image cairo-perf-trace -r -v -i8 > baseline-image-2.txt

I tried both "image" and "image16" on an x86_64 (Sandybridge), and got
no performance differences in the trimmed-cairo-traces set in either
baseline/cleanup or cleanup/tight.

I also tried with PIXMAN_DISABLE=ssse3 and still got no difference. I
verified I am really running what I think I am by editing Pixman and
seeing the effect in the benchmark.

Am I missing something?

I thought we would see at least some improvements also on x86_64 when
comparing cleanup/tight.

Should I run the same on rpi2? Or is the best effect on the fast paths
we haven't merged yet?

I'd rather not run this on rpi1 due to the function address /
performance quirk: doing the required iterations there would probably
take too long, and I'd need to rearrange the result files too.

Or maybe our test set is not enough? I recall having some problems with
that in the past.

So, I patched Pixman to yell whenever TIGHT is set but
COVER_CLIP_BILINEAR is not set. Only t-firefox-canvas-swscroll and
t-firefox-fishtank hit it with a source image, each twice per iteration.
It definitely seems like this test set is not hitting the cases we are
interested in. I think I need to dig up our old performance profiles
and see if we could record a trace from a real app that would hit these
cases, now that Cairo's trace recording is supposedly fixed.
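
The check I describe above amounts to roughly the following. This is only an illustrative sketch, not the actual debug patch (which lives in Pixman's C code): the bit values are hypothetical placeholders rather than Pixman's real flag values, and `tight_only` and `count_hits` are made-up helper names.

```python
# Hypothetical bit positions, chosen only for this illustration.
COVER_CLIP_BILINEAR = 1 << 0
COVER_CLIP_TIGHT = 1 << 1


def tight_only(flags):
    """True when TIGHT is set but COVER_CLIP_BILINEAR is not.

    These are the cases where only the tightened condition would allow
    the cover fast path to be used.
    """
    return bool(flags & COVER_CLIP_TIGHT) and not (flags & COVER_CLIP_BILINEAR)


def count_hits(flag_words):
    """Count how many image flag words would trigger the 'yell'."""
    return sum(1 for flags in flag_words if tight_only(flags))


# Made-up flag words for four composite operations.
samples = [
    0,
    COVER_CLIP_TIGHT,
    COVER_CLIP_TIGHT | COVER_CLIP_BILINEAR,
    COVER_CLIP_BILINEAR,
]
print(count_hits(samples))
```

A trace that never produces a non-zero count here cannot show any benefit from the tightening patches, no matter how the fast paths perform.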

The removal of the 8e extra safety margins shouldn't need performance
profiles as justification, but for the tightening patches they'd be
nice to have, especially since the usefulness of them has been
questioned.

Thanks,
pq