[HarfBuzz] HarfBuzz rewrite

Wed Jan 3 00:02:00 PST 2007

Hi Behdad,

On Friday 29 December 2006 10:19, Behdad Esfahbod wrote:
> Hi,
>
> For the past few days, I've been busy working on a rewrite of HarfBuzz.
> The main motivation here has been to be able to use an mmapped file
> instead of reading all tables into memory.  So far the result has been
> very promising, and I've finished the equivalent of harfbuzz-open (that
> is, Common Table Formats).  I structured the code into the same layout
> as the current code base.  Here it is:
>
>   git clone http://freedesktop.org/~behdad/harfbuzz-ng.git

Sounds good. I'll see if I can check out the code within the next few days.
>
> There are some controversial decisions I had to make however.  A brief
> intro into the design, in no particular order:
>
>   - I decided to go with C++ internally, for the main reason of getting
> "free" (as in developer time) conversion to host endian-ness.  I had
> defined classes for all basic integer types that allow for using them
> like integers in arithmetic and depending on the overloaded operators to
> take care of the endian conversion.  I think this is a major
> optimization in sense of coding effort AND correctness.

I'm all for that. I can't understand why more projects don't use C++, at
least internally.
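
For illustration, here is a minimal sketch of what such an endian-converting
type could look like. The names (BEUInt16, BEUInt32, ExampleHeader,
directory_size) are made up for this example and are not taken from the
harfbuzz-ng code; the sketch only assumes that font data is stored big-endian
and that a conversion operator hands back a host-order integer.

```cpp
#include <cstdint>

// Hypothetical big-endian 16-bit integer: stores the bytes exactly as they
// appear in the file and converts to host order only when it is read.
struct BEUInt16 {
    unsigned char b[2];
    constexpr operator std::uint16_t() const {
        return static_cast<std::uint16_t>((b[0] << 8) | b[1]);
    }
};

// Hypothetical big-endian 32-bit integer, same idea.
struct BEUInt32 {
    unsigned char b[4];
    constexpr operator std::uint32_t() const {
        return (static_cast<std::uint32_t>(b[0]) << 24) |
               (static_cast<std::uint32_t>(b[1]) << 16) |
               (static_cast<std::uint32_t>(b[2]) << 8) |
               static_cast<std::uint32_t>(b[3]);
    }
};

// Illustrative struct holding only the first two fields of an sfnt header.
struct ExampleHeader {
    BEUInt32 version;
    BEUInt16 numTables;
};

// Plain arithmetic on the wrapper: the conversion operator does the byte
// swap, so the caller never writes one by hand.
// 12-byte header plus 16 bytes per table record.
constexpr std::uint32_t directory_size(const ExampleHeader& h) {
    return 12u + 16u * h.numTables;
}

// Sample bytes as they would sit in a file: version 0x00010000, 12 tables.
constexpr ExampleHeader kSample = {{{0x00, 0x01, 0x00, 0x00}}, {{0x00, 0x0C}}};
static_assert(kSample.numTables == 12, "bytes 00 0C read as 12 on any host");
```

Because the wrappers are made of bytes only, the same code gives the same
answer on little-endian and big-endian hosts.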

>   - Another benefit of using C++ is that it's now possible to enforce a
> lot of constraints, like const correctness and privacy.
>
>   - Writing the code has so far mostly turned out to be a direct
> translation of the spec tables into structs.  I even chose to use struct
> names directly from the spec, with no prefix, for internal use.  So I
> have types named Fixed, USHORT, and OffsetTable.  The idea is to have a
> separate public (C) API on top of this.

This is a little problematic, as the compiler is free to add padding inside
the structs (how much depends on the ABI of the platform). It should work
fine on x86, but you might run into problems on RISC architectures. gcc has
a workaround for this (the packed attribute), but we'll need to look into
what to do for the other compilers we support with Qt.
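
To make the concern concrete, here is a small sketch. The struct names are
invented for the example. The first struct shows where padding typically
appears, the second uses the gcc packed attribute, and the third is one
possible compiler-independent alternative (my assumption, not something taken
from your code): build every field from unsigned char arrays so that nothing
needs more than byte alignment. The static_asserts turn a wrong layout into a
compile error instead of a silent misread.

```cpp
#include <cstddef>
#include <cstdint>

// Natural-alignment version: on common ABIs the 32-bit field is aligned to
// 4 bytes, so 2 padding bytes follow 'count' and sizeof is typically 8,
// not the 6 bytes the file format uses.
struct NaiveRecord {
    std::uint16_t count;
    std::uint32_t offset;
};

#if defined(__GNUC__)
// gcc-specific workaround: the packed attribute removes the padding.
struct __attribute__((packed)) PackedRecord {
    std::uint16_t count;
    std::uint32_t offset;
};
static_assert(sizeof(PackedRecord) == 6, "packed layout must match the file");
#endif

// Byte-array alternative: fields made only of unsigned char need no
// alignment on common ABIs, so no compiler extension is involved.
struct U16 { unsigned char b[2]; };
struct U32 { unsigned char b[4]; };

struct ByteRecord {
    U16 count;
    U32 offset;
};

// If some ABI does pad this struct, the build fails here.
static_assert(sizeof(ByteRecord) == 6, "unexpected padding in ByteRecord");
static_assert(offsetof(ByteRecord, offset) == 2, "offset must follow count");
```

Checks like these are cheap to keep next to every table struct, whichever
approach is used.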

>   - Out of the old harfbuzz code, more than one third is reading the
> tables, less than one third is to free them, and about one third is to
> actually use them for any operation.  In the new code base, thanks to
> the mmapped tables, only the last category is needed.

Good :)

>   - I'm writing a script to generate struct code from the tables off the
> standard.  After that, I plan to finish the gdef code, including
> "synthesized gdef tables", and after that, I plan to adapt the
> harfbuzz-buffer code and adjust the gpos/gsub code to use the new API.
> We will see how far that goes.
>
> Comments more than welcome.  I know that some of the choices I made
> (operator overloading, etc) are crazy.  I need to hear about them :),

I'll look at the code :)

> Cheers, and Happy New Year,

To you as well :)

Lars