[Pixman] [PATCH 1/4] vmx: optimize scaled_nearest_scanline_vmx_8888_8888_OVER

Oded Gabbay oded.gabbay at gmail.com
Mon Sep 7 04:07:19 PDT 2015


On Mon, Sep 7, 2015 at 2:03 PM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Sun,  6 Sep 2015 18:27:08 +0300
> Oded Gabbay <oded.gabbay at gmail.com> wrote:
>
>> This patch optimizes scaled_nearest_scanline_vmx_8888_8888_OVER and all
>> the functions it calls (combine1, combine4 and
>> core_combine_over_u_pixel_vmx).
>>
>> The optimization is done by removing use of expand_alpha_1x128 and
>> expand_alpha_2x128 in favor of splat_alpha and MUL/ADD macros from
>> pixman_combine32.h.
>>
>> Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
>> 3.4GHz, RHEL 7.2 ppc64le gave the following results:
>>
>> reference memcpy speed = 24847.3MB/s (6211.8MP/s for 32bpp fills)
>>
>>                 Before          After           Change
>>               --------------------------------------------
>> L1              182.05          210.22         +15.47%
>> L2              180.6           208.92         +15.68%
>> M               180.52          208.22         +15.34%
>> HT              130.17          178.97         +37.49%
>> VT              145.82          184.22         +26.33%
>> R               104.51          129.38         +23.80%
>> RT              48.3            61.54          +27.41%
>> Kops/s          430             504            +17.21%
>>
>> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
>> ---
>>  pixman/pixman-vmx.c | 80 ++++++++++++-----------------------------------------
>>  1 file changed, 18 insertions(+), 62 deletions(-)
>>
>> diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c
>> index a9bd024..d9fc5d6 100644
>> --- a/pixman/pixman-vmx.c
>> +++ b/pixman/pixman-vmx.c
>
>> @@ -646,19 +643,10 @@ static force_inline uint32_t
>>  combine1 (const uint32_t *ps, const uint32_t *pm)
>>  {
>>      uint32_t s = *ps;
>> +    uint32_t a = ALPHA_8(*pm);
>
> pm is dereferenced before it is checked for NULL.
>
>>
>>      if (pm)
>> -    {
>> -     vector unsigned int ms, mm;
>> -
>> -     mm = unpack_32_1x128 (*pm);
>> -     mm = expand_alpha_1x128 (mm);
>> -
>> -     ms = unpack_32_1x128 (s);
>> -     ms = pix_multiply (ms, mm);
>> -
>> -     s = pack_1x128_32 (ms);
>> -    }
>> +     UN8x4_MUL_UN8(s, a);
>>
>>      return s;
>>  }
>
> Thanks,
> pq

Thanks for catching that!
      Oded
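
For readers of the archive, here is a minimal sketch of one way the
problem pointed out above could be avoided: read the mask alpha only
inside the "if (pm)" branch. This is not the follow-up patch itself.
To keep it self-contained, alpha_8 () and mul_un8x4_un8 () are
hypothetical scalar stand-ins for the ALPHA_8 and UN8x4_MUL_UN8 macros
used in the patch, written on the assumption that alpha sits in the top
byte of the pixel and that the multiply is a rounded x * a / 255 per
8-bit channel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ALPHA_8: assumes alpha is the top byte
 * of a 32-bit a8r8g8b8 pixel. */
static uint32_t
alpha_8 (uint32_t pixel)
{
    return pixel >> 24;
}

/* Hypothetical scalar stand-in for UN8x4_MUL_UN8: scales each 8-bit
 * channel of x by a / 255, rounding to nearest. */
static uint32_t
mul_un8x4_un8 (uint32_t x, uint32_t a)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        /* t = channel * a + 0x80; (t + (t >> 8)) >> 8 is the usual
         * exact rounded division by 255 for 8-bit operands. */
        uint32_t t = ((x >> shift) & 0xff) * a + 0x80;

        result |= (((t + (t >> 8)) >> 8) & 0xff) << shift;
    }

    return result;
}

static uint32_t
combine1 (const uint32_t *ps, const uint32_t *pm)
{
    uint32_t s = *ps;

    if (pm)
    {
        /* Read the mask only after pm is known to be non-NULL. */
        uint32_t a = alpha_8 (*pm);

        s = mul_un8x4_un8 (s, a);
    }

    return s;
}
```

With a NULL mask the source pixel is returned unchanged; with a mask
the source is scaled by the mask's alpha, as in the original combine1.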

