[cairo] Patches for speeding up _cairo_fixed_from_double

Mon Nov 6 12:59:42 PST 2006

On Wed, 1 Nov 2006 22:53:33 -0800, "Daniel Amelang" wrote:
> Then there's the new _cairo_fixed_from_double that uses
> FLOAT_WORDS_BIGENDIAN to pull off the magic number trick in a portable
> manner.

[I had made some earlier comments in a draft message, only to lose
them when I let my laptop batter die---I hadn't realized my email
program defeated emacs' auto-save features so well. *sigh*]

So I have a couple of comments on the patch itself, but it was so good
overall, that I was glad to have it land in the tree as is and we can
fix these things up afterward.

> +case $ax_cv_c_float_words_bigendian in
...
> +  *)
> +    m4_default([$3],
> +      [AC_MSG_ERROR([
> +
> +Unknown float word ordering. You need to manually preset
> +ax_cv_c_float_words_bigendian=no (or yes) according to your system.

First, under what conditions is this path expected to execute? Missing
grep I suppose? Rather than breaking the build here, should we perhaps
just disable the optimization in this case?

And the error message should provide the user with more direction,
especially if it's going to break the build. For example, where and
how would I "manually preset" this variable and value? How might I
determine the correct value?

> + * Although this approach provides a very large speedup of this function
> + * on a wide-array of systems, it does come with two caveats:
> + *
> + * 1) It uses banker's rounding as opposed to arithmetic rounding.

I don't think that's actually a problem for cairo. For example, we
already know we are using way too many bits of sub-pixel precision,
(which I hope to start fixing very soon), so what happens in the last
bit is really, really not important at all.

It is good for this to be documented here, but it should just state
this upfront rather than apologizing about it.

> + * 2) It doesn't function properly if the FPU is in single-precision
> + *    mode.

Is this really true? If so, this is a problem. People have encountered
bugs with cairo in single-precision FPU mode before. For example, see:

	_cairo_color_compute_shorts fails with FPU set to single precision
	https://bugs.freedesktop.org/show_bug.cgi?id=7497

So we've fixed bugs in this area before, and I'd like to not make any
regressions here. It would be a nasty thing to check for since it's a
run-time condition. But perhaps the current code works fine. Have you
tested and verified failure in single-precision mode?

> + * This test was originally created to test _cairo_fixed_from_double.
> + * cairo_pattern_create_radial was selected as the entry point into
> + * cairo as it makes several calls to _cairo_fixed_from_double and
> + * presents a somewhat realistic use-case (although the RADIALS_COUNT
> + * isn't very realistic).
> + */
> +#include <time.h>
> +#include "cairo-perf.h"
> +
> +#define RADIALS_COUNT (100000)

As I mentioned in my other mail, the test case really doesn't have to
be realistic overall as long as it is testing something that is
significant for some case that is realistic. So I accepted this test
as is.

I did reduce the count by an order of magnitude to make it run a bit
quicker. I did that in a sneaky way by actually modifying the code
before committing something that has you as the author.

Normally I would have made that kind of change in a subsequent commit,
but I didn't want the performance results to claim that that
subsequent commit suddenly made this test run 10x faster.

Since then, I've eliminated that problem, so I won't be tempted to be
so sneaky in the future. I fixed cairo-perf-diff so that it always
runs the latest version of the performance suite even when running
against old versions, and I also fixed its caching to depend on the
state of the performance suite, (conveniently captured by git as the
sha1 of the cairo/perf tree object).

So in the future, we can feel free to change the performance tests and
cairo-perf-diff will simply ignore the old results that have become
stale as a result. (Thanks to Josh Triplett for suggesting this
approach.)

> +    srand (time (0));

I really don't like seeing non-determinism in the performance test
suite. In this case, we really shouldn't care what the actual double
values are, but it still looks creepy to me. I switched to a constant
seed for srand in the commit I made. (I considered doing a
pre-generated table as in tessellate case, but again we really don't
care about the values so it wouldn't be worth the bloat I don't
think.)

-Carl
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20061106/f719497e/attachment.pgp