[Pixman] [PATCH 0/4] New fast paths and Raspberry Pi 1 benchmarking
spitzak at gmail.com
Thu Aug 20 11:34:37 PDT 2015
On Thu, Aug 20, 2015 at 6:58 AM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> A thing that explains a great deal of these anomalies, but not all of them, has
> something to do with function addresses. There are hypotheses that it might
> have to do with the branch predictor and its cache. We made a test for
> exactly that idea: pick a fast path function that seems to be most sensitive
> to unexpected changes, pad it with x nops before the function start and N-x
> nops after the function end. We never execute those nops, but changing x
> changes the function start address while keeping everything else in the
> binary in the same place.
> The results were mind-boggling: depending on the function starting
> address, the src_8888_8888 L1 test of lowlevel-blt-bench went either
> 355 Mpx/s or 470 Mpx/s. There does not seem to be any predictable pattern
> on which addresses are "fast" and which are "slow". Obviously this will
> screw up our benchmarks, because a change in an unrelated function may
> cause another function's address to shift and therefore change its
> performance. See  for the plot.
>  The plot of alignment vs. performance
Could this be whether some "bad" instruction ends up next to or split by a
cache line boundary? That would produce a random-looking plot, though it
really is a plot of the location of the bad instructions in the measured
function.

If this really is a problem then the ideal fix is for the compiler to
insert NOP instructions in order to move the bad instructions away from the
locations that make them bad. Yikes.
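
To make the cache-line idea concrete, here is a minimal C sketch of how one
could predict, for each padding value x, where a suspect instruction lands
relative to a cache line boundary. Everything in it is an assumption for
illustration: the 32-byte line size, the helper names, and the idea that one
instruction at a known offset inside the function is the "bad" one. None of
it comes from pixman.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache line size in bytes; adjust for the CPU being measured. */
#define CACHE_LINE_SIZE 32u

/* Byte offset of an address within its cache line.
 * line_size must be a power of two. */
static unsigned line_offset(uintptr_t addr, unsigned line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    return (unsigned)(addr & (line_size - 1));
}

/* Nonzero if an instruction of insn_size bytes starting at addr
 * is split across a cache line boundary. */
static int straddles_line(uintptr_t addr, unsigned insn_size,
                          unsigned line_size)
{
    return line_offset(addr, line_size) + insn_size > line_size;
}

/* Nonzero if the instruction starts exactly at the beginning of a line,
 * or reaches (or crosses) the end of its line. */
static int touches_boundary(uintptr_t addr, unsigned insn_size,
                            unsigned line_size)
{
    unsigned off = line_offset(addr, line_size);
    return off == 0 || off + insn_size >= line_size;
}

/* Address of the suspect instruction when x nops of nop_size bytes are
 * placed before the function, and the instruction sits insn_offset bytes
 * after the function start. region_base is where the padding begins. */
static uintptr_t padded_insn_addr(uintptr_t region_base, unsigned x,
                                  unsigned nop_size, unsigned insn_offset)
{
    return region_base + (uintptr_t)x * nop_size + insn_offset;
}
```

Sweeping x from 0 to N through padded_insn_addr() and touches_boundary()
gives a predicted pattern that could be laid over the measured plot. With
fixed-size, naturally aligned 4-byte instructions, straddles_line() can never
fire, so only the "next to" case applies, and the prediction repeats every
line_size / nop_size values of x (every 8 for the assumed 32-byte lines). If
the measured plot shows no such period, this hypothesis is probably wrong.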