[Pixman] Benchmarked: [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags
Oded Gabbay
oded.gabbay at gmail.com
Sun Sep 20 03:22:29 PDT 2015
On Wed, Sep 16, 2015 at 2:25 PM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Fri, 4 Sep 2015 03:09:20 +0100
> Ben Avison <bavison at riscosopen.org> wrote:
>
>> As discussed in
>> http://lists.freedesktop.org/archives/pixman/2015-August/003905.html
>>
>> the 8 * pixman_fixed_e adjustment which was applied to the transformed
>> coordinates is a legacy of rounding errors which used to occur in old
>> versions of Pixman, but which no longer apply. For any affine transform,
>> you are now guaranteed to get the same result by transforming the upper
>> coordinate as though you transform the lower coordinate and add (size-1)
>> steps of the increment in source coordinate space. No projective
>> transform routines use the COVER_CLIP flags, so they cannot be affected.
>
> Hi all,
>
> as we are doing these things not just for cleaning up but with the premise
> that there are missed optimization opportunities, I have benchmarked
> this patch series.
>
> The series as benchmarked is available at:
> https://git.collabora.com/cgit/user/pq/pixman.git/log/?h=cover-benchmark-1
>
> The benchmark points are:
>
> - baseline: "test: Add cover-test v5"
>
> - cleanup: "affine-bench: remove 8e margin from COVER area"
> Includes the 8e extra safety margin removal.
>
> - tight: "pixman-fast-path: Make bilinear cover fetcher use
> COVER_CLIP_TIGHT flag"
> Includes all the COVER_CLIP_BILINEAR related patches from
> Ben.
>
> Note that ssse3_iters[] in pixman-ssse3.c still contains
> FAST_PATH_SAMPLES_COVER_CLIP_BILINEAR.
>
> Cairo version is 1.14.2 for the benchmarks, which are run like:
> $ CAIRO_TEST_TARGET=image cairo-perf-trace -r -v -i8 > baseline-image-2.txt
>
> I tried both "image" and "image16" on an x86_64 (Sandybridge), and got
> no performance differences in the trimmed-cairo-traces set in either
> baseline/cleanup or cleanup/tight.
>
> I also tried with PIXMAN_DISABLE=ssse3 and still got no difference. I
> verified I am really running what I think I am by editing Pixman and
> seeing the effect in the benchmark.
>
> Am I missing something?
>
> I thought we would see at least some improvements also on x86_64 when
> comparing cleanup/tight.
>
> Should I run the same on rpi2? Or is the best effect on the fast paths
> we haven't merged yet?
>
> I'd rather not run this on rpi1 due to the function address /
> performance quirk, doing the required iterations there would probably
> take too long and I'd need to rearrange the result files too.
>
> Or maybe our test set is not enough? I recall having some problems with
> that in the past.
>
> So, I patched Pixman to yell whenever TIGHT is set but
> COVER_CLIP_BILINEAR is not set. Only t-firefox-canvas-swscroll and
> t-firefox-fishtank hit it with source image, each twice per iteration.
> Definitely seems like this test set is not hitting the cases we are
> interested in. I think I need to dig up our old performance profiles
> and see if we could record a trace from a real app that would hit these
> cases, now that Cairo's trace recording is supposedly fixed.
>
> The removal of the 8e extra safety margins shouldn't need performance
> profiles as justification, but for the tightening patches they'd be
> nice to have, especially since the usefulness of them has been
> questioned.
>
>
> Thanks,
> pq
Hi Pekka, Ben,

I decided to also run the trimmed Cairo benchmarks on my POWER8
ppc64le and POWER7 ppc64 machines.

To make things clearer, I used the same definitions for "baseline",
"cleanup" and "tight".

I used Cairo version 1.14.3, actually from git, with HEAD at 6f7a9b4.

I ran the benchmarks with the following (it's from inside a script):
"cairo-perf-trace benchmark -r -i8 > ../${__output}.perf"

First of all, the diff between baseline and cleanup showed no change on
either platform, so that's good :)

Now, for cleanup/tight:
With POWER8 ppc64le, I got the following very modest boost:
image t-firefox-asteroids   483.10 (523.85 3.49%) -> 452.84 (480.34 3.16%): 1.07x speedup
image t-firefox-chalkboard  691.38 (692.09 0.06%) -> 653.07 (654.60 0.26%): 1.06x speedup
However, with POWER7 ppc64, I got the following regressions, which is quite bad:
image t-firefox-asteroids       545.55 (559.64 1.79%) -> 781.07 (791.83 2.33%): 1.43x slowdown
image t-firefox-scrolling       1185.45 (1186.02 0.05%) -> 1748.76 (1754.85 0.20%): 1.48x slowdown
image t-firefox-chalkboard      1444.76 (1464.55 0.88%) -> 2315.76 (2333.10 0.34%): 1.60x slowdown
image t-firefox-paintball       681.43 (682.28 0.10%) -> 1138.15 (1140.19 0.08%): 1.67x slowdown
image t-firefox-canvas          890.14 (890.90 0.10%) -> 1492.83 (1493.51 0.20%): 1.68x slowdown
image t-firefox-canvas-swscroll 1369.94 (1371.66 0.05%) -> 2297.53 (2305.70 0.18%): 1.68x slowdown
image t-xfce4-terminal-a1       829.35 (832.39 0.16%) -> 1392.50 (1414.69 1.08%): 1.68x slowdown
image t-firefox-fishbowl        3112.93 (3114.13 0.02%) -> 5227.18 (5229.05 0.03%): 1.68x slowdown
image t-poppler                 404.14 (407.43 0.52%) -> 680.27 (685.01 0.45%): 1.68x slowdown
image t-firefox-particles       3555.75 (3570.29 0.18%) -> 5990.93 (5995.00 0.05%): 1.68x slowdown
image t-midori-zoomed           555.84 (557.29 0.24%) -> 936.56 (937.69 0.08%): 1.68x slowdown
image t-gnome-system-monitor    844.70 (849.98 0.52%) -> 1426.26 (1427.60 0.12%): 1.69x slowdown
image t-firefox-planet-gnome    904.60 (908.31 0.18%) -> 1527.90 (1530.03 0.08%): 1.69x slowdown
image t-chromium-tabs           221.74 (221.87 0.04%) -> 374.75 (376.72 0.26%): 1.69x slowdown
image t-swfdec-youtube          929.86 (930.31 0.12%) -> 1571.61 (1572.76 0.09%): 1.69x slowdown
image t-firefox-fishtank        1787.33 (1787.36 0.00%) -> 3022.38 (3023.47 0.09%): 1.69x slowdown
image t-firefox-canvas-alpha    1026.19 (1030.55 0.24%) -> 1735.63 (1740.84 0.28%): 1.69x slowdown
image t-evolution               431.94 (433.98 0.36%) -> 731.76 (732.26 0.08%): 1.69x slowdown
image t-firefox-talos-svg       1381.38 (1388.40 0.26%) -> 2342.68 (2345.83 0.10%): 1.70x slowdown
image t-gvim                    803.40 (806.02 0.29%) -> 1363.80 (1366.63 0.27%): 1.70x slowdown
image t-poppler-reseau          1416.96 (1443.14 0.74%) -> 2408.39 (2412.49 0.16%): 1.70x slowdown
image t-swfdec-giant-steps      827.47 (829.87 0.17%) -> 1407.90 (1410.93 0.18%): 1.70x slowdown
image t-gnome-terminal-vim      663.55 (669.39 0.71%) -> 1132.85 (1139.02 0.29%): 1.71x slowdown
image t-grads-heat-map          225.85 (225.92 0.02%) -> 386.23 (386.78 0.49%): 1.71x slowdown
btw, out of curiosity, I checked cleanup/tight on my Haswell laptop
and I got mixed/bad results:
image t-firefox-canvas       705.79 (869.04 11.16%) -> 563.55 (594.35 2.52%): 1.25x speedup
image t-poppler-reseau       619.46 (881.17 16.35%) -> 657.98 (679.11 7.95%): 1.06x slowdown
image t-firefox-planet-gnome 582.52 (605.63 1.82%) -> 627.80 (634.95 3.31%): 1.08x slowdown
image t-evolution            264.55 (271.81 3.30%) -> 288.95 (336.86 11.37%): 1.09x slowdown
image t-gnome-terminal-vim   264.74 (270.65 0.92%) -> 312.25 (516.79 20.96%): 1.18x slowdown
image t-grads-heat-map       93.61 (93.92 0.23%) -> 115.32 (136.32 10.96%): 1.23x slowdown
image t-chromium-tabs        115.36 (115.94 0.45%) -> 200.87 (254.77 11.90%): 1.74x slowdown
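
For anyone who wants to re-check the ratios in the result lines above, here
is a minimal sketch in Python. It is not part of cairo-perf-trace or any
Cairo tool; the names LINE_RE, parse_result and recompute are made up for
illustration. It assumes each result has been joined back onto a single
line, and that the reported ratio is taken from the first number of each
pair (old -> new), which is consistent with the figures quoted in this mail.

```python
import re

# Hypothetical helper, not part of Cairo: matches one joined result line, e.g.
# "image t-poppler 404.14 (407.43 0.52%) -> 680.27 (685.01 0.45%): 1.68x slowdown"
LINE_RE = re.compile(
    r"^(?P<backend>\S+)\s+(?P<trace>\S+)\s+"
    r"(?P<old>[\d.]+)\s+\([\d.]+\s+[\d.]+%\)\s+->\s+"
    r"(?P<new>[\d.]+)\s+\([\d.]+\s+[\d.]+%\):\s+"
    r"(?P<ratio>[\d.]+)x\s+(?P<kind>speedup|slowdown)$"
)


def parse_result(line):
    """Return (trace, old, new, reported_ratio, kind) for one result line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognised result line: %r" % line)
    return (m["trace"], float(m["old"]), float(m["new"]),
            float(m["ratio"]), m["kind"])


def recompute(old, new):
    """Recompute the ratio from the first number of each pair (assumption)."""
    if new <= old:
        return old / new, "speedup"
    return new / old, "slowdown"


if __name__ == "__main__":
    samples = [
        "image t-firefox-asteroids 483.10 (523.85 3.49%) -> 452.84 (480.34 3.16%): 1.07x speedup",
        "image t-firefox-asteroids 545.55 (559.64 1.79%) -> 781.07 (791.83 2.33%): 1.43x slowdown",
        "image t-chromium-tabs 115.36 (115.94 0.45%) -> 200.87 (254.77 11.90%): 1.74x slowdown",
    ]
    for line in samples:
        trace, old, new, reported, kind = parse_result(line)
        ratio, direction = recompute(old, new)
        print("%-22s reported %.2fx %s, recomputed %.2fx %s"
              % (trace, reported, kind, ratio, direction))
```

Feeding it a saved comparison file line by line would flag any entry where
the recomputed ratio and the reported one disagree.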
Opinions?
Oded