[HarfBuzz] A couple of clarifications regarding HarfBuzz
Behdad Esfahbod
behdad at behdad.org
Thu Oct 21 11:55:51 PDT 2010
On 10/21/10 04:10, Tom Hacohen wrote:
>> Language is used to do language-specific adjustments when appropriate. You
>> typically just pass the locale or whatever your higher-level tells you (think
>> of lang attribute in html) to hb_language_from_string.
>
> As I thought, thanks. I wasn't thinking about languages that share a
> script, like many of the Latin-script languages and their ligatures.
It's more than just Latin.
>> HarfBuzz does the right thing no matter what you pass in. So you can safely
>> pass 0. String length in characters would be most appropriate if you have it.
>
> I assumed HarfBuzz does well anyway, but I want the fastest way
> possible. Ok then, I have the string's length (as it's needed for
> buffer_add anyway).
If you have UTF-32 or UTF-16, just pass the length. For UTF-8, passing the
byte length can overshoot by a factor of 2 or 3 for non-ASCII text. You need
the number of characters, not the number of bytes.
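As a rough illustration of the difference, here is a minimal Python sketch (not HarfBuzz API; `utf8_char_count` is a hypothetical helper name) that derives the character count from UTF-8 bytes by skipping continuation bytes. It assumes the input is valid UTF-8.

```python
def utf8_char_count(data: bytes) -> int:
    """Count code points in valid UTF-8 data without decoding it."""
    # Continuation bytes look like 10xxxxxx; every other byte starts
    # a new code point, so counting those gives the character count.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


if __name__ == "__main__":
    for text in ("hello", "שלום"):
        encoded = text.encode("utf-8")
        # Print byte length next to character count for comparison.
        print(repr(text), len(encoded), utf8_char_count(encoded))
```

For pure ASCII the two numbers match; for the Hebrew sample the byte length is twice the character count, which is the overshoot described above.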
>> The low-level API to fetch that information from GDEF is available through
>> hb_ot_layout_get_lig_carets(), however, very few fonts provide such
>> information. It's common to just divide the width by the number of graphemes.
>
> Graphemes being non-diacritic glyphs?
Graphemes are what a user (of a language) considers to be one entity. Unicode
defines them:
http://www.unicode.org/reports/tr29/
We may add code for that to HarfBuzz in the future. A cheap heuristic is to
count the characters whose canonical combining class is 0.
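A minimal Python sketch of that heuristic, combined with the "divide the width by the number of graphemes" fallback, might look like the following. The function names are hypothetical and nothing here is HarfBuzz API; it only uses `unicodedata.combining` from the standard library. Being a heuristic, it can miscount clusters that contain class-0 marks (spacing vowel signs, for example); UAX #29 remains the real definition.

```python
import unicodedata
from typing import List


def approx_grapheme_count(text: str) -> int:
    """Approximate the grapheme count as the number of code points
    whose canonical combining class is 0."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


def approx_caret_offsets(advance: float, text: str) -> List[float]:
    """Place carets inside a ligature by splitting its advance evenly
    among the approximate graphemes of the text it covers."""
    n = max(approx_grapheme_count(text), 1)
    # n graphemes need n - 1 interior caret positions.
    return [advance * i / n for i in range(1, n)]


if __name__ == "__main__":
    print(approx_grapheme_count("e\u0301"))   # base letter + combining acute
    print(approx_caret_offsets(30.0, "ffi"))  # three-letter ligature
```

When a font does provide caret positions in GDEF, those should take precedence over the even split.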
behdad
> Thanks a lot,
> Tom.