[Pixman] [PATCH 06/12] vmx: implement fast path vmx_composite_over_n_8888_8888_ca
oded.gabbay at gmail.com
Wed Jul 15 05:05:21 PDT 2015
On Wed, Jul 15, 2015 at 2:59 PM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Tue, 14 Jul 2015 11:41:25 +0300
> Siarhei Siamashka <siarhei.siamashka at gmail.com> wrote:
>> On Thu, 2 Jul 2015 13:04:11 +0300
>> Oded Gabbay <oded.gabbay at gmail.com> wrote:
>> > POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
>> > reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)
> *Very* stable memcpy speed, compared to patch 7. Impressive.
>> >           Before    After     Change
>> > ---------------------------------------------
>> > L1         61.92   244.91   +295.53%
>> > L2         62.74    243.3   +287.79%
>> > M          63.03   241.94   +283.85%
>> > HT         59.91   144.22   +140.73%
>> > VT          59.4   174.39   +193.59%
>> > R           53.6   111.37   +107.78%
>> > RT         37.99    46.38    +22.08%
>> > Kops/s       436      506    +16.06%
>> > cairo trimmed benchmarks :
>> > Speedups
>> > ========
>> > t-xfce4-terminal-a1 1540.37 -> 1226.14 : 1.26x
>> > t-firefox-talos-gfx 1488.59 -> 1209.19 : 1.23x
>> > Slowdowns
>> > =========
>> > t-evolution 553.88 -> 581.63 : 1.05x
>> > t-poppler 364.99 -> 383.79 : 1.05x
>> > t-firefox-scrolling 1223.65 -> 1304.34 : 1.07x
>> > Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
>> Acked-by: Siarhei Siamashka <siarhei.siamashka at gmail.com>
> why are there slowdowns up to 7%?
I don't know, I didn't investigate it.
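
For reference, the factors in the quoted cairo results can be recomputed
from the raw figures. This is only a minimal sketch: it assumes that lower
values are better (a falling value is listed as a speedup above), and the
helper name `change` is made up for illustration.

```python
# Before/after figures copied from the quoted cairo trimmed benchmark results.
# Assumption: lower is better, because a falling value is listed as a speedup.
RESULTS = {
    "t-xfce4-terminal-a1": (1540.37, 1226.14),
    "t-firefox-talos-gfx": (1488.59, 1209.19),
    "t-evolution": (553.88, 581.63),
    "t-poppler": (364.99, 383.79),
    "t-firefox-scrolling": (1223.65, 1304.34),
}


def change(before, after):
    """Return (kind, factor), with factor >= 1 as in the quoted report."""
    if after <= before:
        return "speedup", before / after
    return "slowdown", after / before


for name, (before, after) in RESULTS.items():
    kind, factor = change(before, after)
    print(f"{name:<20} {before:8.2f} -> {after:8.2f} : {factor:.2f}x ({kind})")
```

The largest slowdown, t-firefox-scrolling, works out to a factor of about
1.066, which is where the "up to 7%" in the question comes from.
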
> Can the cost of adding more entries to the fast path table be that
> much, or is something else going on?
I would rule that out, because I benchmarked each patch separately:
for each new fast path, I removed all the fast paths I had added
previously, to make sure I was measuring only the current patch.
Therefore, the number of entries in the fast-path table is the same
for every patch in this patch set.
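
To make the table-size question concrete, here is a rough model of a
fast-path table that is scanned front to back. It is not pixman's actual
lookup code: the tuple layout, the entry labels, and the names `lookup`,
`BASE` and `DUMMIES` are hypothetical, and the count of compared entries
merely stands in for lookup cost.

```python
# Hypothetical model of a fast-path table scanned linearly.
# Each entry is (operator, source, mask, destination, function label).
def lookup(table, op, src, mask, dest):
    """Return (function label, number of entries compared)."""
    for compared, (e_op, e_src, e_mask, e_dest, func) in enumerate(table, start=1):
        if (e_op, e_src, e_mask, e_dest) == (op, src, mask, dest):
            return func, compared
    # Nothing matched: fall back to a generic path after scanning everything.
    return "generic", len(table)


# Illustrative entries only; the labels are not real pixman symbols.
BASE = [
    ("OVER", "solid", "a8r8g8b8", "a8r8g8b8", "over_n_8888_8888_ca"),
    ("SRC", "a8r8g8b8", None, "a8r8g8b8", "src_8888_8888"),
]
# Entries that can never match any request.
DUMMIES = [("NEVER", None, None, None, "unused")] * 3

query = ("SRC", "a8r8g8b8", None, "a8r8g8b8")
print(lookup(BASE, *query))            # scan of the original table
print(lookup(DUMMIES + BASE, *query))  # dummies placed in front of the match
print(lookup(BASE + DUMMIES, *query))  # dummies placed behind the match
```

In this model, entries that never match add comparisons only for requests
that are scanned past them, so the total entry count alone does not decide
the cost; the position of the new entries matters as well.
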
> Or if we don't care about that, why?
I think that the speedups in this specific patch are more substantial
than the slowdowns. If it were the other way around, then I would have
removed this patch, as I did with another patch that Siarhei
rejected for that reason.
> If you have no idea, maybe check the "all" set of lowlevel-blt-bench to
> see whether unrelated operations slow down for some obscure reason.
Good thinking. I'll try that offline.
> I suppose you could also see if adding the same number of fast path table
> entries that will never match would cause the same slowdowns as this
As I said above, because of the way I tested it, I don't think this is