[Pixman] [PATCH 0/4] New fast paths and Raspberry Pi 1 benchmarking

Thu Aug 20 11:34:37 PDT 2015

On Thu, Aug 20, 2015 at 6:58 AM, Pekka Paalanen <ppaalanen at gmail.com> wrote:

> A thing that explains a great deal of these anomalies, but not all of it,
> has something to do with function addresses. There are hypotheses that it
> might have to do with the branch predictor and its cache. We made a test
> targeting exactly that idea: pick a fast path function that seems to be
> most susceptible to unexpected changes, pad it with x nops before the
> function start and N-x nops after the function end. We never execute those
> nops, but changing x changes the function start address while keeping
> everything else in the whole binary in the same place.
>
> The results were mind-boggling: depending on the function starting
> address, the src_8888_8888 L1 test of lowlevel-blt-bench went either
> 355 Mpx/s or 470 Mpx/s. There does not seem to be any predictable pattern
> on which addresses are "fast" and which are "slow". Obviously this will
> screw up our benchmarks, because a change in an unrelated function may
> cause another function's address to shift, and therefore change its
> performance. See [1] for the plot.
>
> [1] The plot of alignment vs. performance
>
> https://git.collabora.com/cgit/user/pq/pixman-benchmarking.git/plain/octave/figures/fig-src-8888-8888-L1.pdf
>

Could this depend on whether some "bad" instruction ends up next to, or
split by, a cache line boundary? That would produce a random-looking plot,
though it would really be a plot of where the bad instructions sit in the
measured function.
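
To make the idea concrete, here is a rough sketch of how one might check it
against the padding sweep. It is not taken from the benchmark code. The
32-byte cache line, the 4-byte instruction width, the function base address
and the offset of the "bad" instruction are all assumptions I picked for
illustration, and the names classify() and sweep() are made up.

```python
CACHE_LINE = 32  # assumed L1 cache line size in bytes
INSN_SIZE = 4    # assumed fixed instruction width; one nop is this size too


def classify(addr, size=INSN_SIZE, line=CACHE_LINE):
    """Say where an instruction at addr sits relative to cache lines.

    'split'  - its bytes span two cache lines
    'edge'   - it is the first or last instruction in a line
    'inside' - anything else
    """
    first_line = addr // line
    last_line = (addr + size - 1) // line
    if first_line != last_line:
        return "split"
    off = addr % line
    if off == 0 or off + size == line:
        return "edge"
    return "inside"


def sweep(func_base, bad_offset, max_nops):
    """For each nop count x, classify the hypothetical bad instruction.

    func_base is the function address with zero padding, bad_offset is the
    byte offset of the suspect instruction inside the function.
    """
    return {
        x: classify(func_base + x * INSN_SIZE + bad_offset)
        for x in range(max_nops + 1)
    }


if __name__ == "__main__":
    # Hypothetical numbers: function at 0x1000, suspect instruction 8 bytes in.
    for x, where in sweep(0x1000, 8, 8).items():
        print(x, where)
```

If the slow values of x in the plot line up with the 'edge' or 'split'
entries for some instruction offset, that would support the guess. With
fixed-width, aligned instructions only 'edge' can occur, so 'split' matters
only for variable-width or unaligned encodings.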

If this really is the problem, then the ideal fix is for the compiler to
insert NOP instructions to move the bad instructions away from the
locations that make them bad. Yikes.
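
As a sketch of what such a padding pass would have to compute, the function
below finds the smallest number of nops that moves one instruction clear of
a cache line boundary. The 32-byte line, 4-byte instruction and 4-byte
safety margin are assumptions, nops_to_clear() is a made-up name, and I am
not claiming any compiler does this.

```python
def nops_to_clear(addr, size=4, line=32, margin=4):
    """Return the fewest nops to insert before the instruction at addr so
    that it lies at least `margin` bytes away from both ends of its cache
    line. Returns None if no padding within one line achieves that.

    All default sizes are assumed values, not measured ones.
    """
    for n in range(line // size + 1):
        off = (addr + n * size) % line
        if off >= margin and off + size <= line - margin:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical addresses: start of a line, mid line, end of a line.
    for a in (0x1000, 0x1008, 0x101C):
        print(hex(a), nops_to_clear(a))
```

The catch is that every nop inserted also shifts all the instructions after
it, so fixing one bad instruction can push another one onto a boundary. A
real pass would have to re-check the whole function after each insertion.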