[Pixman] More MIPS OVER fast paths (over_8888_n_8888, over_8888_n_0565, over_0565_n_0565, over_8888_8_8888, over_8888_8_0565, over_0565_8_0565, over_8888_8888 and over_8888_8888_8888) including OVER combiner.
Lukic, Nemanja
nlukic at mips.com
Mon Oct 1 08:29:16 PDT 2012
Hi Soren, Siarhei
Here are the results measured with and without the OVER combiner on a few OVER fast paths:
Before adding OVER combiner:
over_8888_8888 = L1: 95.65 L2: 70.26 M: 13.95 ( 74.24%) HT: 16.56 VT: 15.96 R: 14.90 RT: 9.05 ( 53Kops/s)
over_8888_8_8888 = L1: 13.62 L2: 11.22 M: 7.57 ( 80.53%) HT: 6.24 VT: 6.19 R: 6.13 RT: 3.93 ( 30Kops/s)
over_8888_8_0565 = L1: 7.37 L2: 8.30 M: 6.24 ( 58.08%) HT: 5.46 VT: 5.38 R: 5.26 RT: 3.35 ( 27Kops/s)
over_0565_8_8888 = L1: 10.56 L2: 9.32 M: 7.13 ( 66.42%) HT: 5.83 VT: 5.79 R: 5.74 RT: 3.60 ( 28Kops/s)
over_0565_8_0565 = L1: 7.82 L2: 7.20 M: 6.09 ( 48.62%) HT: 5.11 VT: 5.07 R: 4.93 RT: 3.13 ( 26Kops/s)
After:
over_8888_8888 = L1: 163.64 L2: 83.68 M: 17.67 ( 94.15%) HT: 17.09 VT: 16.60 R: 15.31 RT: 9.60 ( 55Kops/s)
over_8888_8_8888 = L1: 25.98 L2: 22.50 M: 11.54 (122.95%) HT: 9.94 VT: 9.63 R: 9.20 RT: 5.80 ( 38Kops/s)
over_8888_8_0565 = L1: 14.00 L2: 12.45 M: 8.77 ( 81.79%) HT: 6.99 VT: 6.89 R: 6.72 RT: 3.95 ( 30Kops/s)
over_0565_8_8888 = L1: 16.75 L2: 14.82 M: 10.06 ( 93.83%) HT: 7.98 VT: 7.79 R: 7.48 RT: 4.22 ( 31Kops/s)
over_0565_8_0565 = L1: 10.76 L2: 9.69 M: 7.86 ( 62.79%) HT: 6.18 VT: 6.11 R: 5.97 RT: 3.48 ( 28Kops/s)
Thanks,
Nemanja Lukic
-----Original Message-----
From: Søren Sandmann [mailto:sandmann at cs.au.dk]
Sent: Tuesday, September 25, 2012 6:23 AM
To: Lukic, Nemanja
Cc: pixman at lists.freedesktop.org
Subject: Re: [Pixman] More MIPS OVER fast paths (over_8888_n_8888, over_8888_n_0565, over_0565_n_0565, over_8888_8_8888, over_8888_8_0565, over_0565_8_0565, over_8888_8888 and over_8888_8888_8888) including OVER combiner.
Nemanja Lukic <nlukic at mips.com> writes:
> Added optimizations for several OVER fast paths:
> - over_8888_n_8888
> - over_8888_n_0565
> - over_0565_n_0565
> - over_8888_8_8888
> - over_8888_8_0565
> - over_0565_8_0565
> - over_8888_8888
> - over_8888_8888_8888
> Including OVER combiner.
> Per previous code review:
> - The previously pushed single big commit is now divided into 4 smaller pieces.
Thanks for the patches. I have pushed them to master with a few
formatting fixes.
However, you should get a freedesktop account so that you can push
patches yourself, or at least, if you want me to merge them, provide a
public git repository that can be pulled from.
> - Added OVER combiner.
Did you do any measurements of this one? As Siarhei said:
    As for the performance numbers. I wonder how much faster would these
    new specialized MIPS fast paths be if we had a DSPr2 optimized OVER
    combiner? You can check "sse2_combine_over_u" and
    "neon_combine_over_u" functions as examples of existing combiners.
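
For readers unfamiliar with what an OVER combiner computes, here is a minimal scalar sketch in plain C. It assumes premultiplied a8r8g8b8 pixels and no mask, and it is not the DSPr2, SSE2 or NEON code discussed in this thread. The names mul_un8, over_pixel and sketch_combine_over_u are hypothetical, and the signature is simplified compared to a real pixman combiner. Per channel it computes dest = src + dest * (255 - src_alpha) / 255.

```c
#include <stdint.h>

/* Multiply two 8-bit values and divide by 255 with rounding. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* OVER for one premultiplied a8r8g8b8 pixel (hypothetical helper). */
static uint32_t over_pixel(uint32_t src, uint32_t dst)
{
    uint8_t ia = (uint8_t)(255 - (src >> 24)); /* inverse source alpha */
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = mul_un8((uint8_t)((dst >> shift) & 0xff), ia);
        uint32_t c = s + d;
        if (c > 255)
            c = 255; /* saturate, in case the input is not properly premultiplied */
        result |= c << shift;
    }
    return result;
}

/* Simplified, unmasked scanline combiner (hypothetical signature). */
static void sketch_combine_over_u(uint32_t *dest, const uint32_t *src, int width)
{
    for (int i = 0; i < width; i++)
        dest[i] = over_pixel(src[i], dest[i]);
}
```

An optimized combiner would process several pixels per iteration with SIMD instructions, but it should produce the same results as this per-pixel loop.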
Søren