[Pixman] PDF radial gradients

Fri Sep 10 09:26:40 PDT 2010

On Fri, Sep 10, 2010 at 9:51 AM, Soeren Sandmann <sandmann at daimi.au.dk> wrote:
> ...
> I'd guess on most systems, arithmetic on floating point values will be
> much faster than arithmetic on int128s, simply because an int128 would
> have to be built up from two 64 bit, or four 32 bit registers.

I did some benchmarking on a 64-bit CPU and pushed four branches with
different mixes of fixed-point and floating-point math:

- wip/radial-fixsqrt does all the computations in fixed point. It is
slower because of the iterative sqrt and the integer division. I think
it is interesting as a reference, but its performance is not acceptable.

- wip/radial does all the computations before the sqrt in fixed point
(this requires some 128-bit variables, but only additions on them, which
seem to be fast on my Core i7). Most of the time is spent (wasted?) in
the sqrt and in the int128->double conversion.

The conversion could probably be made faster (currently it is a call to
a fallback function in the system library), but doing just the final
computations in double precision seems to be fast enough and should not
lose much precision in the interesting cases.

- wip/radial-float2 computes the discriminant in double precision
(instead of 128-bit fixed point) and should be both fast (almost twice
as fast as wip/radial) and accurate. Does it look good?
I'm a little worried about the int64->double conversions in this
branch. Are there architectures where they might be a problem?

- In wip/radial-float I tried to keep all the variables of the inner
loop in floating point to see how much this affects the performance.
I get small speed improvements, but I don't think they justify this
branch (the accuracy guarantees in this case are very loose, since
errors accumulate quadratically with the number of iterations).
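
To make the wip/radial-float2 idea concrete, here is a minimal sketch of
computing a discriminant in double precision from 64-bit fixed-point
coefficients. It is not the code in the branch: the function name, the
coefficient names a, b, c and the equation form a*t^2 - 2*b*t + c = 0
are assumptions made only for illustration.

```c
#include <stdint.h>

/* Hypothetical sketch, not pixman code.
 *
 * a, b and c are assumed to be the coefficients of
 *     a*t^2 - 2*b*t + c = 0
 * accumulated exactly as 64-bit fixed-point integers.
 */
static double
discriminant_double (int64_t a, int64_t b, int64_t c)
{
    /* int64 -> double is exact for magnitudes up to 2^53;
     * larger values may be rounded to the nearest double. */
    double fa = (double) a;
    double fb = (double) b;
    double fc = (double) c;

    /* In integer arithmetic b*b and a*c would each need up to
     * 128 bits; in double they are rounded to 53 significant bits. */
    return fb * fb - fa * fc;
}
```

The only integer work left is the three int64->double conversions, which
is where the architecture question above comes from.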

All the branches could be affected by problems when computing the
solutions of the 2nd degree equation, since I used the "school formula",
which is not numerically stable. I will try to find out if there are
interesting cases in which this can be a problem.

If somebody can confirm that wip/radial-float2 should work fine on most
32-bit architectures, I'll clean it up some more so that it will be
ready to be merged.

Again, thank you for your suggestions (which were in fact very insightful)
Andrea