[HarfBuzz] HarfBuzz API design

Martin Hosken martin_hosken at sil.org
Tue Aug 18 23:57:33 PDT 2009

Dear Behdad,

I feel that this is the core of the API, since it specifies what inputs and outputs HarfBuzz works with (particularly the outputs).

> typedef struct _hb_glyph_info_t {
>    hb_codepoint_t codepoint;
>    hb_mask_t      mask;
>    uint32_t       cluster;
>    uint16_t       component;
>    uint16_t       lig_id;
>    uint32_t       internal;
> } hb_glyph_info_t;

I may have misinterpreted, but mask, lig_id, and probably component seem to be OT-specific, in that a consumer of the output is unlikely ever to need them.

The disadvantage I see with having a single buffer that changes its contents from chars to glyphs is that you then lose the association map between the underlying chars and the glyphs. I suppose it can be recreated using the component information, but it's going to be problematic when it comes to cursor hit testing.
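
To make the concern concrete, here is a minimal sketch of rebuilding a char-to-glyph lookup from the output alone. It assumes, and this is only my reading of the struct, that cluster holds the index of the first char of the cluster a glyph belongs to, and that glyphs come out in ascending cluster order (a left-to-right run). The typedefs and glyph_info_t are local stand-ins for the quoted struct, and glyph_for_char is a hypothetical helper, not part of any HarfBuzz API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the quoted types; not the real HarfBuzz headers. */
typedef uint32_t hb_codepoint_t;
typedef uint32_t hb_mask_t;

typedef struct {
    hb_codepoint_t codepoint;
    hb_mask_t      mask;
    uint32_t       cluster;    /* assumed: index of the cluster's first char */
    uint16_t       component;
    uint16_t       lig_id;
    uint32_t       internal;
} glyph_info_t;

/*
 * Hypothetical helper: return the index of the first glyph of the cluster
 * that contains char_index, or n_glyphs if no cluster starts at or before it.
 * Assumes cluster values are in ascending order.
 */
size_t glyph_for_char(const glyph_info_t *glyphs, size_t n_glyphs,
                      uint32_t char_index)
{
    size_t best = n_glyphs;

    assert(glyphs != NULL || n_glyphs == 0);

    for (size_t i = 0; i < n_glyphs; i++) {
        if (glyphs[i].cluster > char_index)
            break;                      /* past the cluster we want */
        if (best == n_glyphs || glyphs[i].cluster != glyphs[best].cluster)
            best = i;                   /* first glyph of a later cluster */
    }
    return best;
}
```

With cluster values 0, 3, 4, 4 (say, a three-char ligature, a single glyph, then one char that became two glyphs), chars 0, 1 and 2 all resolve to glyph 0. The lookup can say which glyph a char ended up in, but not where inside the ligature the char sits, which is exactly what cursor hit testing needs.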

> For script and language, it's a bit more delicate.  I'm also convinced that 
> they belong to the buffer.  With script it's fine, but with language it 
> introduces a small implementation hassle: that I would have to deal with 
> copying/interning language tags, something I was trying to avoid.  The other 
> options are:
>    - Extra parameters to hb_shape().  I'd rather not do this.  Keeping details 
> like this out of the main API and adding setters where appropriate makes the 
> API cleaner and more extensible.
>    - Use the feature dict for them too.  I'm strictly against this one.  The 
> feature dict is already too high-level for my taste.

Why do you say the feature dict is too high-level? It seems just the right place to me. Alternatively, the language could be stored in the buffer, since it is buffer-specific.

One question: does a buffer represent a single run in which the language doesn't change, or potentially multiple runs that are yet to be segmented?

