[Pixman] [PATCH 3/3] Add an iterator that can fetch bilinearly scaled images
Søren Sandmann
sandmann at cs.au.dk
Thu Aug 1 03:42:56 PDT 2013
Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>> The graph is also available here:
>>
>> http://i.imgur.com/Nibcyes.png
>>
>> The data used to generate the graph is available here:
>>
>> http://people.freedesktop.org/~sandmann/before
>> http://people.freedesktop.org/~sandmann/after
>
> The results definitely look good (as expected). But what kind of
> hardware and compiler were used for testing?
The CPU was a Core i7-3250M with the clock frequency fixed at 2GHz. The
compiler was GCC 4.4.7 (for various reasons I was running this on RHEL
6).
> On MIPS 74K 480MHz and using gcc 4.5.3, the results look like this:
>
> === Old C ==
> pixman: Disabled mips-dspr2 implementation
> 0.100000 : 320 240 => 32 24 : 0.001046 : 13.619040
> 0.300000 : 320 240 => 96 72 : 0.003226 : 4.667306
> 0.600000 : 320 240 => 192 144 : 0.010374 : 3.752109
> 0.900000 : 320 240 => 288 216 : 0.023395 : 3.760780
> 1.500000 : 320 240 => 480 360 : 0.062911 : 3.640671
> 2.000000 : 320 240 => 640 480 : 0.111207 : 3.620020
> 5.000000 : 320 240 => 1600 1200 : 0.693240 : 3.610625
> 10.000000 : 320 240 => 3200 2400 : 2.906125 : 3.784017
>
> === New C ==
> pixman: Disabled mips-dspr2 implementation
> 0.100000 : 320 240 => 32 24 : 0.001290 : 16.797955
> 0.300000 : 320 240 => 96 72 : 0.005225 : 7.559235
> 0.600000 : 320 240 => 192 144 : 0.016348 : 5.912864
> 0.900000 : 320 240 => 288 216 : 0.034019 : 5.468588
> 1.500000 : 320 240 => 480 360 : 0.082273 : 4.761169
> 2.000000 : 320 240 => 640 480 : 0.140271 : 4.566112
> 5.000000 : 320 240 => 1600 1200 : 0.890944 : 4.640333
> 10.000000 : 320 240 => 3200 2400 : 3.416600 : 4.448698
>
> === DSPr2 assembly ==
> 0.100000 : 320 240 => 32 24 : 0.000610 : 7.941077
> 0.300000 : 320 240 => 96 72 : 0.002555 : 3.696661
> 0.600000 : 320 240 => 192 144 : 0.007881 : 2.850537
> 0.900000 : 320 240 => 288 216 : 0.019378 : 3.115063
> 1.500000 : 320 240 => 480 360 : 0.049227 : 2.848785
> 2.000000 : 320 240 => 640 480 : 0.096938 : 3.155530
> 5.000000 : 320 240 => 1600 1200 : 0.590205 : 3.073984
> 10.000000 : 320 240 => 3200 2400 : 1.929740 : 2.512682
>
> Here the new C code appears to be slower for all scaling ratios.
> I suspect that a possible reason is the use of 64-bit arithmetic
> operations on a 32-bit processor.
>
> The performance of C code is generally more important for less
> common CPU architectures, because they are usually lacking special
> optimizations.
Yeah, a slowdown like that is not really acceptable considering that x86
will rarely benefit from this code. I'll try to make a version with
32-bit arithmetic.
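
As a rough illustration of what 32-bit-only interpolation can look like
(a sketch, not the code from this patch): the hypothetical helpers
lerp_8888 and bilinear_8888 below blend a8r8g8b8 pixels two channels at
a time, assuming weights in the range 0..256, so that every intermediate
value fits in a uint32_t and no 64-bit operations are needed.

```c
#include <stdint.h>

/* Hypothetical helper, not part of pixman: linear interpolation between
 * two a8r8g8b8 pixels using only 32-bit arithmetic.
 * Assumption: w is in [0, 256]; 0 returns a, 256 returns b.
 */
static uint32_t
lerp_8888 (uint32_t a, uint32_t b, uint32_t w)
{
    /* Split each pixel into two pairs of channels, each channel
     * sitting in its own 16-bit lane: (R, B) and (A, G).
     */
    uint32_t rb_a = a & 0x00ff00ff;
    uint32_t rb_b = b & 0x00ff00ff;
    uint32_t ag_a = (a >> 8) & 0x00ff00ff;
    uint32_t ag_b = (b >> 8) & 0x00ff00ff;

    /* Per lane the weighted sum is at most 255 * 256 = 65280, so it
     * fits in 16 bits and never carries into the neighbouring lane.
     */
    uint32_t rb = ((rb_a * (256 - w) + rb_b * w) >> 8) & 0x00ff00ff;
    uint32_t ag = (ag_a * (256 - w) + ag_b * w) & 0xff00ff00;

    return rb | ag;
}

/* Hypothetical helper: bilinear blend of four neighbouring pixels,
 * done as two horizontal interpolations followed by a vertical one.
 * Assumption: wx and wy are in [0, 256].
 */
static uint32_t
bilinear_8888 (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
               uint32_t wx, uint32_t wy)
{
    uint32_t top    = lerp_8888 (tl, tr, wx);
    uint32_t bottom = lerp_8888 (bl, br, wx);

    return lerp_8888 (top, bottom, wy);
}
```

Interpolating horizontally first and then vertically truncates twice, so
the result can differ slightly from a single-pass formula that keeps
full precision until the end.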
> I'll try to additionally run some more benchmarks. Also on a 64-bit
> Cell PPU from Playstation3 and other devices.
>
> And a minor nitpick about the benchmark program. It would be a bit
> nicer if it could align the results vertically and also produce more
> human-readable output (which could be unambiguously interpreted). For
> example, without knowing that the last column lists the time
> spent per pixel, people reading this e-mail could not easily
> tell which results are better.
For the vertical alignment, do you mean making the columns a fixed size
so that they line up? I can do that and also add some titles, but short
of having it produce the graph (or perhaps do a polynomial regression),
it's not going to be easy to get output that can be compared at a glance.
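
For illustration, fixed-width columns with titles could look roughly
like the sketch below. The function names, the column widths, and the
choice of nanoseconds per destination pixel for the last column are
assumptions, not what the benchmark program currently does. The last
column in the figures quoted above appears to be the elapsed time
divided by the number of destination pixels, scaled by 1e7.

```c
#include <stdio.h>

/* Hypothetical helper: elapsed seconds -> nanoseconds per destination pixel. */
static double
ns_per_pixel (int dst_w, int dst_h, double seconds)
{
    return seconds / ((double) dst_w * dst_h) * 1e9;
}

/* Writes the column titles into buf; returns the snprintf() result.
 * The field widths match the ones used in format_row() below.
 */
static int
format_header (char *buf, size_t size)
{
    return snprintf (buf, size, "%6s %-11s %-11s %10s %9s",
                     "scale", "source", "dest", "seconds", "ns/pixel");
}

/* Writes one result row into buf using the same fixed field widths,
 * so that rows line up under the titles; returns the snprintf() result.
 */
static int
format_row (char *buf, size_t size, double scale,
            int src_w, int src_h, int dst_w, int dst_h, double seconds)
{
    return snprintf (buf, size, "%6.2f %5dx%-5d %5dx%-5d %10.6f %9.2f",
                     scale, src_w, src_h, dst_w, dst_h, seconds,
                     ns_per_pixel (dst_w, dst_h, seconds));
}
```

Calling format_header once and format_row per scaling ratio, then
printing each buffer with puts, would give rows of equal width with the
unit of the last column spelled out in the title.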
Søren