[Pixman] [PATCH 1/4] vmx: optimize scaled_nearest_scanline_vmx_8888_8888_OVER

Pekka Paalanen ppaalanen at gmail.com
Mon Sep 7 04:03:38 PDT 2015


On Sun,  6 Sep 2015 18:27:08 +0300
Oded Gabbay <oded.gabbay at gmail.com> wrote:

> This patch optimizes scaled_nearest_scanline_vmx_8888_8888_OVER and all
> the functions it calls (combine1, combine4 and
> core_combine_over_u_pixel_vmx).
> 
> The optimization is done by removing use of expand_alpha_1x128 and
> expand_alpha_2x128 in favor of splat_alpha and MUL/ADD macros from
> pixman_combine32.h.
> 
> Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
> 3.4GHz, RHEL 7.2 ppc64le gave the following results:
> 
> reference memcpy speed = 24847.3MB/s (6211.8MP/s for 32bpp fills)
> 
>                 Before          After           Change
>               --------------------------------------------
> L1              182.05          210.22         +15.47%
> L2              180.6           208.92         +15.68%
> M               180.52          208.22         +15.34%
> HT              130.17          178.97         +37.49%
> VT              145.82          184.22         +26.33%
> R               104.51          129.38         +23.80%
> RT              48.3            61.54          +27.41%
> Kops/s          430             504            +17.21%
> 
> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
> ---
>  pixman/pixman-vmx.c | 80 ++++++++++++-----------------------------------------
>  1 file changed, 18 insertions(+), 62 deletions(-)
> 
> diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c
> index a9bd024..d9fc5d6 100644
> --- a/pixman/pixman-vmx.c
> +++ b/pixman/pixman-vmx.c

> @@ -646,19 +643,10 @@ static force_inline uint32_t
>  combine1 (const uint32_t *ps, const uint32_t *pm)
>  {
>      uint32_t s = *ps;
> +    uint32_t a = ALPHA_8(*pm);

pm is dereferenced here, before the "if (pm)" check below, so a NULL
pm will crash. The ALPHA_8(*pm) read needs to move inside the check.

>  
>      if (pm)
> -    {
> -	vector unsigned int ms, mm;
> -
> -	mm = unpack_32_1x128 (*pm);
> -	mm = expand_alpha_1x128 (mm);
> -
> -	ms = unpack_32_1x128 (s);
> -	ms = pix_multiply (ms, mm);
> -
> -	s = pack_1x128_32 (ms);
> -    }
> +	UN8x4_MUL_UN8(s, a);
>  
>      return s;
>  }
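
To illustrate, here is a minimal standalone sketch of reading the mask
only after the NULL check. alpha_8() and mul_un8x4_un8() are hypothetical
stand-ins I wrote so the snippet compiles on its own; they only
approximate what ALPHA_8 and UN8x4_MUL_UN8 do, and the actual patch would
keep using the real macros. combine1_sketch() is likewise just an
illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ALPHA_8: top byte of an a8r8g8b8 pixel. */
static inline uint32_t alpha_8(uint32_t p)
{
    return p >> 24;
}

/* Hypothetical stand-in for UN8x4_MUL_UN8: scale each 8-bit channel
 * of x by a/255, with rounding. Returns the result instead of
 * modifying x in place. */
static inline uint32_t mul_un8x4_un8(uint32_t x, uint32_t a)
{
    uint32_t out = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = (x >> shift) & 0xff;
        uint32_t t = c * a + 0x80;

        c = (t + (t >> 8)) >> 8;   /* rounded division by 255 */
        out |= c << shift;
    }
    return out;
}

/* The mask is only read once pm is known to be non-NULL. */
static inline uint32_t combine1_sketch(const uint32_t *ps, const uint32_t *pm)
{
    uint32_t s = *ps;

    if (pm)
        s = mul_un8x4_un8(s, alpha_8(*pm));

    return s;
}
```

With a NULL pm the source pixel comes back unchanged, and nothing is
read through pm.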

Thanks,
pq

