[Pixman] More MIPS OVER fast paths (over_8888_n_8888, over_8888_n_0565, over_0565_n_0565, over_8888_8_8888, over_8888_8_0565, over_0565_8_0565, over_8888_8888 and over_8888_8888_8888) including OVER combiner.

Lukic, Nemanja nlukic at mips.com
Mon Oct 1 08:29:16 PDT 2012


Hi Soren, Siarhei

Here are the results measured for the OVER combiner on a couple of the OVER fast paths.
Before adding the OVER combiner:
over_8888_8888   =  L1:  95.65  L2:  70.26  M: 13.95 ( 74.24%)  HT: 16.56  VT: 15.96  R: 14.90  RT:  9.05 (  53Kops/s)
over_8888_8_8888 =  L1:  13.62  L2:  11.22  M:  7.57 ( 80.53%)  HT:  6.24  VT:  6.19  R:  6.13  RT:  3.93 (  30Kops/s)
over_8888_8_0565 =  L1:   7.37  L2:   8.30  M:  6.24 ( 58.08%)  HT:  5.46  VT:  5.38  R:  5.26  RT:  3.35 (  27Kops/s)
over_0565_8_8888 =  L1:  10.56  L2:   9.32  M:  7.13 ( 66.42%)  HT:  5.83  VT:  5.79  R:  5.74  RT:  3.60 (  28Kops/s)
over_0565_8_0565 =  L1:   7.82  L2:   7.20  M:  6.09 ( 48.62%)  HT:  5.11  VT:  5.07  R:  4.93  RT:  3.13 (  26Kops/s)

After:
over_8888_8888   =  L1: 163.64  L2:  83.68  M: 17.67 ( 94.15%)  HT: 17.09  VT: 16.60  R: 15.31  RT:  9.60 (  55Kops/s)
over_8888_8_8888 =  L1:  25.98  L2:  22.50  M: 11.54 (122.95%)  HT:  9.94  VT:  9.63  R:  9.20  RT:  5.80 (  38Kops/s)
over_8888_8_0565 =  L1:  14.00  L2:  12.45  M:  8.77 ( 81.79%)  HT:  6.99  VT:  6.89  R:  6.72  RT:  3.95 (  30Kops/s)
over_0565_8_8888 =  L1:  16.75  L2:  14.82  M: 10.06 ( 93.83%)  HT:  7.98  VT:  7.79  R:  7.48  RT:  4.22 (  31Kops/s)
over_0565_8_0565 =  L1:  10.76  L2:   9.69  M:  7.86 ( 62.79%)  HT:  6.18  VT:  6.11  R:  5.97  RT:  3.48 (  28Kops/s)

Thanks,
Nemanja Lukic

-----Original Message-----
From: Søren Sandmann [mailto:sandmann at cs.au.dk] 
Sent: Tuesday, September 25, 2012 6:23 AM
To: Lukic, Nemanja
Cc: pixman at lists.freedesktop.org
Subject: Re: [Pixman] More MIPS OVER fast paths (over_8888_n_8888, over_8888_n_0565, over_0565_n_0565, over_8888_8_8888, over_8888_8_0565, over_0565_8_0565, over_8888_8888 and over_8888_8888_8888) including OVER combiner.

Nemanja Lukic <nlukic at mips.com> writes:

> Added optimizations for several OVER fast paths:
>  - over_8888_n_8888
>  - over_8888_n_0565
>  - over_0565_n_0565
>  - over_8888_8_8888
>  - over_8888_8_0565
>  - over_0565_8_0565
>  - over_8888_8888
>  - over_8888_8888_8888
> Including OVER combiner.
> Per previous code review:
>  - Previously pushed single big commit is now divided into 4 smaller pieces.

Thanks for the patches. I have pushed them to master with a few
formatting fixes.

However, you should get a freedesktop account so that you can push
patches yourself, or at least, if you want me to merge them, provide a
public git repository that can be pulled from.

>  - Added OVER combiner.

Did you do any measurements of this one? As Siarhei said:

    As for the performance numbers. I wonder how much faster would these
    new specialized MIPS fast paths be if we had a DSPr2 optimized OVER
    combiner? You can check "sse2_combine_over_u" and
    "neon_combine_over_u" functions as examples of existing combiners.
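
For readers unfamiliar with what such a combiner computes, here is a minimal scalar sketch of the unified OVER operation on premultiplied a8r8g8b8 pixels: each channel becomes src + dest * (255 - src_alpha) / 255. This is not the DSPr2 code from the patches, nor the SSE2 or NEON code named above. The function names (mul_un8, add_sat_un8, over_pixel, combine_over_u_sketch) are hypothetical, and the sketch assumes there is no mask.

```c
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Add two 8-bit values, clamping the sum to 255. */
static uint8_t add_sat_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a + b;
    return t > 255 ? 255 : (uint8_t)t;
}

/* OVER for one premultiplied a8r8g8b8 pixel (hypothetical helper). */
static uint32_t over_pixel(uint32_t src, uint32_t dst)
{
    uint8_t inv_alpha = (uint8_t)(255 - (src >> 24));
    uint32_t result = 0;

    /* Process the four 8-bit channels (B, G, R, A) one at a time. */
    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t s = (src >> shift) & 0xff;
        uint8_t d = (dst >> shift) & 0xff;
        result |= (uint32_t)add_sat_un8(s, mul_un8(d, inv_alpha)) << shift;
    }
    return result;
}

/* Combine a scanline: dest = src OVER dest (assumption: no mask). */
static void combine_over_u_sketch(uint32_t *dest, const uint32_t *src, int width)
{
    for (int i = 0; i < width; i++)
        dest[i] = over_pixel(src[i], dest[i]);
}
```

An optimized combiner would compute the same result while handling several channels or pixels per instruction, which is where a speedup over a scalar loop like this one would come from.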


Søren

