[cairo] path storage optimizations ( was: how to propose a change? )

Fri Aug 31 09:40:57 PDT 2007

i got three responses -- so let me just pull them together into one chain.

carl, re: submitting this a patch for people to look at.

definitely.  once i get the latest cairo compiling on my box i will post 
patches in the order you suggest.  i probably should have waited to post 
till i got it all working; sorry about that.  [ i have to run out of 
town next week - so there may be some delay on that front ]

owen and behdad, re: why.

mostly because i'm interested in how cairo works.  cairo helps fill a 
void in the game programming world, its license is amenable to usage on 
game consoles, it's cool stuff!

the path stuff is nearer to the surface of cairo - and i felt that there 
was a small bit of optimization that could be eek'd out there.

while a generalized memory management system for cairo would help out 
more -- reducing malloc frequency and overall memory usage is important 
( most consoles don't have memory page swapping so fragmentation can be 
a big issue ).

i'm sure there are much more important bottlenecks around and maybe as i 
learn more about cairo i can begin to help out with those.

behdad, re: sizes.

the macro CAIRO_PATH_BUF_SIZE controls the number of points ( and ops ) 
in each cairo_path_buf_t.  under gcc ( and your latest patch: vc ) that 
should be 464/5=51.555 elements yielding a size of 476 bytes for 
cairo_path_buf_t.  475 in my printout appears to come from a difference 
in rounding due to the precision toggling i do for mixing d3d and cairo.

at any rate, with the "stable release" code, i see:
sizeof(cairo_path_buf_t)=476;
sizeof(cairo_path_fixed_t)=500.

a math error in my reporting of the gcc totals does make the highwater 
gcc counts in my former email incorrect.  the original code does much 
better than i advertised.

the actual highwater original buffer usage is: 27,396 bytes w/ 46 
buffers alloced.  the highwater command stream usage is still: 24,176 
bytes w/ 40 streams alloced.

so the changes bring only an improvement of 3k and 6 allocs in this case.

all, re: is it worth it?

i think so.  whether things like a tad fewer allocs and a tad less 
memory used trump code maintenance is a definitely an important 
question.  in this case, i think the new code is just different than the 
old code, not less maintainable.  previously you had to understand the 
why of checking for flags like "current point" and peeks for "move tos" 
-- now you have to grok a small state machine, and a null terminated 
list of draw commands.  neither is particularly problematic, tho both 
require some thought.

once i get back from my trip - i will figure out the build issues, make 
patches against latest, and re-post so y'all can take a closer look-see.

oh. and sorry about the earlier html emails; still learning my way 
around thunderbird. ^_^;;

-io.
http://www.ionous.net

Carl Worth wrote:
> On Thu, 30 Aug 2007 14:12:56 -0500, ionous wrote:
>> I have a version of the optimizations I described up and running now -
> 
> Hello again!
> 
> First off, thank you very much for going back to non-HTML mail. That's
> really appreciated, (it's quite hard for me to not just delete HTML
> mail unread).
> 
>> I've had some problems getting the very latest source ( pixman in 
>> particular, but also some minor stuff with the tests ) compiling under 
>> windows ( both vc and cygwin/gcc ) -- I will detail and ask for help in 
>> a separate mail.
> 
> Thanks, please do.
> 
>> For the moment, tho, I've attached the source rather than git patch,
>> in case anyone wants to take a look.
> 
> The patch would actually allow me to take a look, (read it in my email
> program, read only the actual changed code, reply to actual quotes of
> the code, and if desired, apply the patch for testing). By contrast,
> complete source code files aren't of much use for any of those
> purposes, (particularly zipped).
> 
> And all of that holds even if the code is preliminary and not intended
> for being applied to cairo's tree---a patch is still greatly preferred.
> 
>> All in all looks pretty good -- checking out some stress tests with more 
>> heavy uses of primitives would be interesting.
> 
> Yes, in general I'm always in favor of improving things. Keith wrote
> the original chunky-path storage, and Behdad rewrote it later to
> reduce memory consumption. So another rewrite now to improve it more
> sounds great. Maybe Behdad will have some comment, (or maybe he too
> will wait to see a patch).
> 
>> Open Questions:
>> * Is the code managing reverse traversal of paths necessary. I didn’t 
>> (re)implement it in the attached because no one appears to be using it.
> 
> If there are no callers then please remove it. As that's a logically
> separate step, please do that as a separate patch/commit before your
> subsequent work.
> 
>> Regarding the attached code -- it includes a whole bunch of logging code 
>> that while useful does make the code less maintainable -- I will 
>> probably rip it out before making a real patch.
> 
> And that sounds like something that would work well has a third
> commit, (so that you could submit the complete patch series with three
> commits, but perhaps only the first two would be intended for/applied
> to cairo itself).
> 
>> * What size should cairo_path_fixed_t be set to? comments indicate 512 
>> bytes is desirable but I don't think gcc nor visualc achieve that. gcc 
>> -- assuming my macro expansions are right -- comes in at a very odd 499 
>> bytes.
> 
> I have no suggestion here. Behdad?
> 
> -Carl