[cairo] Reduce number of floating point operations

Daniel Amelang daniel.amelang at gmail.com
Tue Nov 21 12:41:40 PST 2006


On 11/21/06, Carl Worth <cworth at cworth.org> wrote:
> On Tue, 21 Nov 2006 15:56:52 +0200, Jorn Baayen wrote:
> >  o is-identity.diff replaces the 6 FP operations to check whether
> >    a matrix is an identity matrix with a call to memcmp().
> >
> >    This is not always faster.
>
> Shall we just store this data rather than recomputing it? It seems
> that an extra byte next to 6 doubles wouldn't be a huge cost, (unless
> alignment would make it cost more than that).
>
> We could have various flags in that byte. We definitely check for "is
> identity" and "is integer translation" a bunch right now for
> optimizations. So being able to do those two operations extremely fast
> might be helpful.

So I spent a day last week playing with this. Here's what I found:
is_identity isn't really a bottleneck (yet) as far as I can tell, so
this is sort of a moot point. But I assumed that it was, and tried
four approaches:

1) Use memcmp. Yea, you see a _slight_ speedup, but then you see that
memcmp starts to crawl up the profile. Funny thing is, we're just
performing a lot of memcmp on a lot of identity matrices over and over
again, so we can do better.
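
For reference, a minimal sketch of what approach 1 amounts to. The type
and function names here (sketch_matrix_t and friends) are made up for
illustration and are not cairo's; the struct only assumes six doubles
laid out like cairo_matrix_t. One thing worth keeping in mind is that a
bytewise compare is stricter than ==, since -0.0 and 0.0 have different
bit patterns.

```c
#include <string.h>

/* Hypothetical stand-in for cairo_matrix_t: six doubles, no padding. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
} sketch_matrix_t;

static const sketch_matrix_t sketch_identity = {
    1.0, 0.0,
    0.0, 1.0,
    0.0, 0.0
};

/* The six floating-point compares. */
static int
sketch_is_identity_fp (const sketch_matrix_t *m)
{
    return m->xx == 1.0 && m->yx == 0.0 &&
           m->xy == 0.0 && m->yy == 1.0 &&
           m->x0 == 0.0 && m->y0 == 0.0;
}

/* One bytewise compare against a constant identity matrix. */
static int
sketch_is_identity_memcmp (const sketch_matrix_t *m)
{
    return memcmp (m, &sketch_identity, sizeof *m) == 0;
}
```

The two versions agree on ordinary values, but a matrix holding -0.0 in
a zero slot passes the FP version and fails the memcmp version.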

2) Use an 'untouched identity matrix' flag that is only set TRUE when a
matrix is created with init_identity. Minor speedup again, but this
time no memcmp crawling up the profile. Nice. Also, this flag doesn't
clutter up the matrix code very much because it's easy to just mark it
dirty anytime a translate or scale or matrix_multiply is called on it,
as opposed to keeping an is_identity flag always up-to-date. But, in
some cases, matrices _become_ identity, or perhaps an identity matrix
goes back to being one after some transformations, or what not, and we
miss those (very few cases). Plus, the semantics are a little weird. So
this may be a little too quick and dirty.
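
A rough sketch of approach 2, with hypothetical names (flagged_matrix_t
etc. are not cairo's). It assumes that is_identity falls back to the
full compare when the flag is clear, which the description above does
not spell out; under that assumption "missing" a case only means
missing the fast path, not returning a wrong answer.

```c
/* Hypothetical stand-in for cairo_matrix_t plus the extra flag. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
    int untouched_identity;   /* 1 only straight out of init_identity */
} flagged_matrix_t;

static void
flagged_init_identity (flagged_matrix_t *m)
{
    m->xx = 1.0; m->yx = 0.0;
    m->xy = 0.0; m->yy = 1.0;
    m->x0 = 0.0; m->y0 = 0.0;
    m->untouched_identity = 1;
}

/* Any modifying call just marks the matrix dirty. */
static void
flagged_translate (flagged_matrix_t *m, double tx, double ty)
{
    m->x0 += m->xx * tx + m->xy * ty;
    m->y0 += m->yx * tx + m->yy * ty;
    m->untouched_identity = 0;
}

static int
flagged_is_identity (const flagged_matrix_t *m)
{
    if (m->untouched_identity)
        return 1;             /* fast path: no FP compares at all */

    /* Assumed fallback: the usual six compares. */
    return m->xx == 1.0 && m->yx == 0.0 &&
           m->xy == 0.0 && m->yy == 1.0 &&
           m->x0 == 0.0 && m->y0 == 0.0;
}
```

Translating by (3, 4) and then by (-3, -4) gives an identity matrix
again, but the flag stays clear, so that matrix takes the slow path on
every later check.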

3) Use an 'is_identity' flag that has three values: yes, no, and
unknown. We set it to 'yes' in init_identity, 'no' in
init_translate|scale, and 'unknown' in just plain old matrix_init. In
the translate|scale|multiply etc functions that modify matrices, we
just set the flag to 'unknown' and let the recomputing be done in
is_identity, if/when it gets called. Although this clutters the code a
bit, I think it's a nice tradeoff.
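
A sketch of how the three-valued flag in approach 3 might look. All
names are hypothetical (cached_matrix_t, IDENTITY_* and so on are not
cairo's). One deliberate deviation, labeled in the code: init_translate
checks for a zero offset instead of storing 'no' unconditionally, so
that a translate by (0, 0) is not misreported.

```c
typedef enum {
    IDENTITY_NO,
    IDENTITY_YES,
    IDENTITY_UNKNOWN
} identity_state_t;

/* Hypothetical stand-in for cairo_matrix_t plus the cached state. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
    identity_state_t identity;
} cached_matrix_t;

static void
cached_init_identity (cached_matrix_t *m)
{
    m->xx = 1.0; m->yx = 0.0;
    m->xy = 0.0; m->yy = 1.0;
    m->x0 = 0.0; m->y0 = 0.0;
    m->identity = IDENTITY_YES;
}

static void
cached_init_translate (cached_matrix_t *m, double tx, double ty)
{
    cached_init_identity (m);
    m->x0 = tx;
    m->y0 = ty;
    /* Assumption of this sketch: guard against a zero offset. */
    m->identity = (tx == 0.0 && ty == 0.0) ? IDENTITY_YES : IDENTITY_NO;
}

/* Modifiers do not recompute anything, they only invalidate. */
static void
cached_translate (cached_matrix_t *m, double tx, double ty)
{
    m->x0 += m->xx * tx + m->xy * ty;
    m->y0 += m->yx * tx + m->yy * ty;
    m->identity = IDENTITY_UNKNOWN;
}

/* Recompute lazily, then remember the answer. */
static int
cached_is_identity (cached_matrix_t *m)
{
    if (m->identity == IDENTITY_UNKNOWN) {
        int is_id = m->xx == 1.0 && m->yx == 0.0 &&
                    m->xy == 0.0 && m->yy == 1.0 &&
                    m->x0 == 0.0 && m->y0 == 0.0;
        m->identity = is_id ? IDENTITY_YES : IDENTITY_NO;
    }
    return m->identity == IDENTITY_YES;
}
```

Note that cached_is_identity takes a non-const pointer because it
writes the cached result back, which is part of the code clutter
mentioned above.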

4) Use a matrix state flag as a bit flag for is_identity,
is_translation, etc. The code is now _very_ ugly, very tricky (did you
cover all possible state transitions?), and isn't any faster on any
benchmark that I have. Reason why is that is_translation isn't called
that much at all (assuming my version of the transform glyphs patch)
compared to is_identity, so this additional caching doesn't get you
very far. If is_translation really does show up on the profile one
day, we can create a memcmp version that copies the translation
components into an identity matrix and performs a matrix-wide compare
that way. Much cleaner IMO, assuming that we need to do it at all.
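
The memcmp-based is_translation idea could be sketched roughly like
this. Names are hypothetical (xlate_matrix_t is not cairo's type), and
it tests for "pure translation" only; an integer-translation check
would need more than this.

```c
#include <string.h>

/* Hypothetical stand-in for cairo_matrix_t: six doubles, no padding. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
} xlate_matrix_t;

static const xlate_matrix_t xlate_identity = {
    1.0, 0.0,
    0.0, 1.0,
    0.0, 0.0
};

/* Copy the translation components into an identity matrix, then
 * compare the whole thing bytewise against the input. */
static int
xlate_is_translation (const xlate_matrix_t *m)
{
    xlate_matrix_t ref = xlate_identity;

    ref.x0 = m->x0;
    ref.y0 = m->y0;

    return memcmp (m, &ref, sizeof *m) == 0;
}
```

Since x0 and y0 are copied from the input they always match, so the
compare effectively checks only the four scale/shear components.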

BTW, I found that once we start using flags to remember when a matrix
is identity, FP compares drop off the profile, so I'm not convinced
that going to a memcmp for is_identity is really worth it. Yea, it
probably doesn't hurt, but it probably doesn't help, and just makes
our code look a little funny.

Dan

