[HarfBuzz] A couple of clarifications regarding HarfBuzz

Tom Hacohen tom.hacohen at partner.samsung.com
Sun Oct 24 01:50:49 PDT 2010


Dear Behdad,

My replies are below.

On Thu, 2010-10-21 at 14:55 -0400, Behdad Esfahbod wrote:
> It's more than just Latin.
Yeah, also Yiddish and Hebrew, Arabic and Persian. I know them, they just
didn't come to mind.
> 

> If you have UTF-32 or UTF-16, just pass the length indeed.  For UTF-8, passing
> the byte length will overshoot by a factor of 2 or 3 for anything but ASCII.
> You need the # of characters, not # of bytes, etc.

I'm working with UTF-32 anyway. But yes, of course: the number of
characters, not the byte length.
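
As a rough illustration of the length point above, here is a minimal
sketch in Python (not HarfBuzz code) that compares the UTF-8 byte count
with the number of code points for the same string. The helper name
count_units is hypothetical and only serves this example.

```python
def count_units(text):
    """Return (utf8_bytes, utf16_units, utf32_units, code_points) for text."""
    return (
        len(text.encode("utf-8")),           # bytes in UTF-8
        len(text.encode("utf-16-le")) // 2,  # 16-bit code units in UTF-16
        len(text.encode("utf-32-le")) // 4,  # 32-bit code units in UTF-32
        len(text),                           # Unicode code points
    )


# Four Hebrew letters (shin, lamed, vav, final mem), written as escapes.
hebrew = "\u05e9\u05dc\u05d5\u05dd"
utf8_bytes, utf16_units, utf32_units, code_points = count_units(hebrew)
```

For this string the UTF-8 byte count is twice the number of code points,
while the UTF-16 and UTF-32 unit counts match it. For characters outside
the Basic Multilingual Plane the UTF-16 unit count would differ as well.
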
> Graphemes are what a user (of a language) considers to be one entity.  Unicode
> defines them:
> 
>   http://www.unicode.org/reports/tr29/
> 
> We may  add code in harfbuzz for that in the future.  A cheap heuristic is to
> check for combining-class=0.

Thank you very much. But does HarfBuzz expose such info (the combining
class)? I didn't see anything about it.
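
For reference, the cheap heuristic quoted above could be sketched as
follows. This is only an assumption about how one might apply it: it
uses Python's standard unicodedata module rather than anything in
HarfBuzz, and the function name split_clusters is made up for this
example. It starts a new cluster at every code point whose canonical
combining class is 0 and attaches everything else to the previous one.

```python
import unicodedata


def split_clusters(text):
    """Split text into approximate clusters using the combining class.

    A code point with canonical combining class 0 starts a new cluster;
    any other code point is appended to the cluster before it.
    """
    clusters = []
    for ch in text:
        if unicodedata.combining(ch) == 0 or not clusters:
            clusters.append(ch)
        else:
            clusters[-1] += ch
    return clusters


# "e" + COMBINING ACUTE ACCENT stays together; "a" starts a new cluster.
example = split_clusters("e\u0301a")
```

This is not a replacement for the rules in UAX #29. Some characters that
belong to the preceding grapheme also have combining class 0 (for
example, many Indic vowel signs), so the heuristic would split those
incorrectly.
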

Thanks a lot,
Tom.



