[cairo] Reduce number of floating point operations
Jorn Baayen
jorn at openedhand.com
Tue Nov 21 05:56:52 PST 2006
Hi,
On Mon, 2006-09-25 at 11:26 -0700, Carl Worth wrote:
> > Unfortunately, I'm very busy at work lately, so I don't have time to
> > work on this patch and won't have it in near future. It would be nice if
> > someone could continue to work on this patch.
>
> OK. So, thanks for the patches. There's definitely lots of useful
> stuff here. I'll see if I can get some time to clean some of it up and
> get it in this week or next, (though, don't let that stop anybody from
> taking a whack at it).
I split Aivars' patch into 3 patches as suggested and tried to
incorporate the changes you asked for in your review. I'm sending the
first two for review here, along with cairo-perf-diff results from ARM.
o is-identity.diff replaces the 6 FP comparisons used to check whether
a matrix is the identity matrix with a single call to memcmp().
This is not always faster.
o glyphs-transform.diff extends is-identity.diff with glyph
transformation optimizations.
This is always faster, but less so than is-identity.diff in the
cases where that patch is faster.
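To illustrate the trade-off in the first item, here is a minimal sketch. It assumes the replacement is a bytewise comparison against a constant identity matrix, done here with memcmp(). The attached patch is not reproduced in this archive, so this is not necessarily what it does. matrix_t is a hypothetical stand-in with the same six double fields as cairo_matrix_t, and both function names are made up for the example.

```c
#include <string.h>

/* Hypothetical stand-in for cairo_matrix_t: six doubles, no padding. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
} matrix_t;

/* Field-by-field check: six floating-point comparisons. */
static int
matrix_is_identity_fp (const matrix_t *m)
{
    return m->xx == 1.0 && m->yx == 0.0 &&
           m->xy == 0.0 && m->yy == 1.0 &&
           m->x0 == 0.0 && m->y0 == 0.0;
}

/* Bytewise check against a constant identity matrix: no FP operations.
 * Assumption: this mirrors the idea of the patch, not its actual code.
 * Caveat: -0.0 equals 0.0 in the FP version but has a different bit
 * pattern, so this variant reports such a matrix as non-identity. */
static int
matrix_is_identity_bytes (const matrix_t *m)
{
    static const matrix_t identity = { 1.0, 0.0, 0.0, 1.0, 0.0, 0.0 };

    return memcmp (m, &identity, sizeof identity) == 0;
}
```

The FP version can stop at the first mismatching field, while the bytewise version pays for a library call, which may be part of why the patch is not always faster.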
The cairo-perf-diff figures are rather surprising, but they do not
seem to be caused by run-to-run variation, as re-running cairo-perf
repeatedly produces the same figures.
Thanks,
Jorn
--
OpenedHand Ltd.
http://o-hand.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: is-identity.diff
Type: text/x-patch
Size: 1136 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20061121/7d83c377/is-identity-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glyphs-transform.diff
Type: text/x-patch
Size: 8072 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20061121/7d83c377/glyphs-transform-0001.bin
-------------- next part --------------
Speedups
========
image-rgba text_similar_rgba_source-64 35.37 1.21% -> 30.28 2.76%: 1.17x speedup
image-rgba text_similar_rgba_source-128 68.65 2.03% -> 59.57 1.94%: 1.15x speedup
image-rgb text_similar_rgba_over-64 31.85 1.29% -> 27.72 2.39%: 1.15x speedup
image-rgba text_image_rgb_over-64 27.22 2.15% -> 23.82 2.04%: 1.14x speedup
image-rgba text_image_rgba_over-128 57.00 0.43% -> 50.00 0.80%: 1.14x speedup
image-rgb text_solid_rgba_over-64 31.71 1.34% -> 28.23 1.18%: 1.12x speedup
image-rgb text_similar_rgb_over-256 221.81 0.54% -> 197.62 0.77%: 1.12x speedup
image-rgb text_image_rgb_over-64 31.25 0.66% -> 27.91 0.68%: 1.12x speedup
image-rgb text_image_rgba_over-64 31.74 0.43% -> 28.37 0.81%: 1.12x speedup
image-rgb text_similar_rgb_over-64 31.36 1.14% -> 28.05 0.43%: 1.12x speedup
image-rgb text_image_rgba_over-256 216.93 0.57% -> 194.62 0.83%: 1.11x speedup
image-rgb text_solid_rgb_source-64 39.21 0.74% -> 35.22 0.97%: 1.11x speedup
image-rgb text_image_rgb_over-128 57.25 0.84% -> 51.64 1.14%: 1.11x speedup
image-rgb text_solid_rgba_source-256 283.74 0.33% -> 256.60 1.60%: 1.11x speedup
image-rgb text_image_rgba_over-128 57.31 0.70% -> 51.88 1.46%: 1.10x speedup
image-rgba text_solid_rgb_over-256 217.21 0.29% -> 196.75 0.76%: 1.10x speedup
image-rgba text_image_rgba_source-64 31.48 1.88% -> 28.53 0.83%: 1.10x speedup
image-rgba text_solid_rgb_source-256 276.64 0.30% -> 250.76 0.83%: 1.10x speedup
image-rgb text_linear_rgba_source-128 73.37 0.64% -> 66.57 2.05%: 1.10x speedup
image-rgba text_solid_rgba_source-256 276.17 1.34% -> 250.75 0.64%: 1.10x speedup
image-rgba text_image_rgb_source-64 34.67 1.99% -> 31.48 0.91%: 1.10x speedup
image-rgb text_linear_rgb_source-256 284.46 0.44% -> 258.30 1.96%: 1.10x speedup
image-rgb text_solid_rgba_source-64 39.17 0.99% -> 35.58 1.03%: 1.10x speedup
image-rgb text_image_rgb_over-256 220.19 0.33% -> 200.12 0.75%: 1.10x speedup
image-rgb text_linear_rgba_over-64 33.12 0.83% -> 30.13 0.90%: 1.10x speedup
image-rgb text_solid_rgb_over-256 218.96 0.81% -> 199.24 0.89%: 1.10x speedup
image-rgba text_similar_rgba_over-256 218.31 0.15% -> 199.04 0.72%: 1.10x speedup
image-rgb text_similar_rgba_over-128 56.98 0.71% -> 52.02 0.34%: 1.10x speedup
image-rgb text_similar_rgba_source-64 37.97 0.81% -> 34.71 0.31%: 1.09x speedup
xlib-rgb text_image_rgba_over-64 55.52 1.10% -> 50.83 0.96%: 1.09x speedup
image-rgb text_similar_rgba_over-256 216.87 0.62% -> 198.71 0.84%: 1.09x speedup
image-rgb text_image_rgba_source-64 37.35 1.94% -> 34.24 0.74%: 1.09x speedup
image-rgb text_linear_rgba_over-256 239.66 0.16% -> 219.86 1.42%: 1.09x speedup
image-rgb text_image_rgb_source-256 276.10 0.30% -> 253.51 1.29%: 1.09x speedup
image-rgb text_radial_rgba_source-256 336.39 0.57% -> 309.42 1.53%: 1.09x speedup
image-rgb text_image_rgb_source-64 37.97 0.45% -> 34.94 0.58%: 1.09x speedup
image-rgb text_linear_rgba_source-256 286.50 0.13% -> 264.12 0.47%: 1.08x speedup
image-rgba text_linear_rgb_over-128 61.44 1.15% -> 56.65 1.18%: 1.08x speedup
xlib-rgb text_similar_rgba_over-64 49.12 0.92% -> 45.29 0.87%: 1.08x speedup
image-rgb text_linear_rgba_source-64 38.50 0.93% -> 35.51 0.75%: 1.08x speedup
image-rgb text_linear_rgb_over-256 240.02 0.35% -> 221.59 0.47%: 1.08x speedup
xlib-rgb text_similar_rgb_over-64 49.09 0.86% -> 45.32 1.29%: 1.08x speedup
image-rgb text_solid_rgba_over-256 218.28 0.40% -> 201.85 0.78%: 1.08x speedup
image-rgb text_solid_rgb_source-256 281.42 0.48% -> 260.26 1.07%: 1.08x speedup
image-rgb text_similar_rgb_over-128 56.90 0.59% -> 52.63 0.87%: 1.08x speedup
image-rgb text_radial_rgb_source-256 336.50 0.36% -> 311.89 0.35%: 1.08x speedup
image-rgb text_image_rgb_source-128 71.99 0.49% -> 66.75 1.88%: 1.08x speedup
image-rgb text_similar_rgb_source-128 72.28 0.51% -> 67.08 1.04%: 1.08x speedup
image-rgba text_image_rgb_over-256 218.91 0.24% -> 203.22 0.34%: 1.08x speedup
image-rgb text_solid_rgba_over-128 55.82 0.64% -> 51.85 0.94%: 1.08x speedup
xlib-rgba text_image_rgba_source-64 68.36 0.31% -> 63.50 0.70%: 1.08x speedup
xlib-rgba text_solid_rgb_over-256 359.27 0.19% -> 333.96 0.60%: 1.08x speedup
xlib-rgb text_solid_rgba_over-64 52.70 0.75% -> 48.99 1.37%: 1.08x speedup
image-rgb text_image_rgba_source-256 275.54 0.38% -> 256.39 0.77%: 1.07x speedup
image-rgb text_image_rgba_source-128 70.61 0.44% -> 65.73 0.85%: 1.07x speedup
xlib-rgba text_solid_rgba_over-256 357.54 0.27% -> 332.91 0.31%: 1.07x speedup
image-rgba text_radial_rgba_over-64 48.53 1.08% -> 45.20 0.63%: 1.07x speedup
xlib-rgb text_image_rgb_over-64 55.16 0.79% -> 51.40 0.84%: 1.07x speedup
image-rgba text_linear_rgb_over-256 237.49 0.11% -> 221.34 0.71%: 1.07x speedup
image-rgba text_linear_rgb_source-256 281.44 0.36% -> 262.69 0.55%: 1.07x speedup
image-rgb text_solid_rgba_source-128 72.82 0.33% -> 68.01 0.87%: 1.07x speedup
image-rgba text_linear_rgba_over-256 236.22 0.42% -> 220.80 0.27%: 1.07x speedup
image-rgb text_similar_rgba_source-128 71.66 0.66% -> 67.03 1.06%: 1.07x speedup
xlib-rgba text_similar_rgb_over-256 329.08 0.18% -> 307.87 0.11%: 1.07x speedup
xlib-rgba text_solid_rgba_over-128 93.25 0.47% -> 87.29 0.35%: 1.07x speedup
image-rgba text_image_rgba_over-256 215.88 0.69% -> 202.12 0.77%: 1.07x speedup
image-rgba text_linear_rgba_source-256 280.03 0.51% -> 262.47 0.57%: 1.07x speedup
xlib-rgba text_similar_rgba_over-256 326.18 0.30% -> 305.98 0.36%: 1.07x speedup
xlib-rgb text_solid_rgba_over-128 108.22 0.16% -> 101.63 0.79%: 1.06x speedup
xlib-rgba text_solid_rgba_source-64 62.48 0.57% -> 58.75 0.55%: 1.06x speedup
image-rgba text_similar_rgb_over-256 216.68 0.79% -> 204.09 0.22%: 1.06x speedup
image-rgb text_linear_rgba_over-128 60.86 0.57% -> 57.35 0.38%: 1.06x speedup
xlib-rgba text_similar_rgb_over-128 85.31 0.52% -> 80.42 0.47%: 1.06x speedup
image-rgb text_linear_rgb_over-128 61.00 0.63% -> 57.52 0.38%: 1.06x speedup
xlib-rgba text_solid_rgb_source-256 463.44 0.09% -> 437.17 0.22%: 1.06x speedup
image-rgba text_radial_rgba_source-256 331.09 0.30% -> 312.80 0.54%: 1.06x speedup
xlib-rgba text_linear_rgba_source-64 71.65 0.70% -> 67.71 0.44%: 1.06x speedup
image-rgb text_similar_rgb_source-256 276.20 0.59% -> 261.27 0.35%: 1.06x speedup
image-rgba text_radial_rgb_source-256 331.32 0.43% -> 313.76 0.62%: 1.06x speedup
xlib-rgba text_similar_rgba_source-256 433.08 0.27% -> 410.24 0.56%: 1.06x speedup
image-rgb text_radial_rgba_source-128 85.27 0.30% -> 80.79 1.38%: 1.06x speedup
image-rgb text_linear_rgb_source-128 73.03 0.16% -> 69.25 0.29%: 1.05x speedup
xlib-rgba text_solid_rgba_source-256 461.81 0.48% -> 437.95 0.07%: 1.05x speedup
image-rgba text_solid_rgba_over-256 215.13 0.45% -> 204.06 0.66%: 1.05x speedup
xlib-rgba text_similar_rgba_over-128 84.71 0.49% -> 80.36 0.57%: 1.05x speedup
image-rgb text_radial_rgb_over-64 52.44 0.31% -> 49.79 0.58%: 1.05x speedup
image-rgba text_similar_rgb_source-256 270.74 1.10% -> 257.06 0.19%: 1.05x speedup
xlib-rgb text_solid_rgb_over-256 419.24 0.37% -> 398.12 0.27%: 1.05x speedup
xlib-rgba text_similar_rgb_source-256 435.90 0.24% -> 414.03 0.21%: 1.05x speedup
xlib-rgb text_linear_rgba_source-64 79.89 0.63% -> 75.88 0.76%: 1.05x speedup
xlib-rgb text_solid_rgba_over-256 419.40 0.30% -> 398.44 0.25%: 1.05x speedup
xlib-rgb text_similar_rgba_over-256 389.60 0.21% -> 370.37 0.18%: 1.05x speedup
image-rgb text_radial_rgba_over-64 51.91 0.90% -> 49.38 0.57%: 1.05x speedup
image-rgb text_radial_rgba_source-64 41.02 1.22% -> 39.04 0.36%: 1.05x speedup
Slowdowns
=========
xlib-rgb paint_linear_rgb_source-512 314.67 0.06% -> 368.85 0.11%: 1.17x slowdown
xlib-rgb paint_linear_rgba_source-512 316.66 0.06% -> 368.74 0.24%: 1.16x slowdown
xlib-rgb paint_linear_rgb_over-512 336.98 0.11% -> 390.83 0.13%: 1.16x slowdown
xlib-rgb paint_linear_rgba_over-512 647.53 0.22% -> 699.27 0.22%: 1.08x slowdown
-------------- next part --------------
Speedups
========
xlib-rgba fill_image_rgba_over-256 41.33 0.67% -> 37.82 1.17%: 1.09x speedup
xlib-rgba fill_image_rgb_over-256 42.24 0.82% -> 38.90 1.12%: 1.09x speedup
xlib-rgba fill_image_rgba_source-256 55.00 0.81% -> 51.63 0.34%: 1.07x speedup
xlib-rgba fill_image_rgb_source-256 55.79 0.34% -> 52.83 0.73%: 1.06x speedup