[HarfBuzz] A couple of clarifications regarding HarfBuzz
Tom Hacohen
tom.hacohen at partner.samsung.com
Sun Oct 24 01:50:49 PDT 2010
Dear Behdad,
My replies are below.
On Thu, 2010-10-21 at 14:55 -0400, Behdad Esfahbod wrote:
> It's more than just Latin.
Yeah, also Yiddish and Hebrew, Arabic and Persian. I know them, they just
didn't come to mind.
>
> If you have UTF-32 or UTF-16, just pass the length indeed. For UTF-8, passing
> the byte length will overshoot by a factor of 2 or 3 for anything but ASCII.
> You need the # of characters, not # of bytes, etc.
I'm working with UTF-32 anyway. But yes, of course # of chars and not
byte len.
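
To make the difference concrete, here is a minimal Python sketch (not
HarfBuzz code; the function name utf8_lengths is made up for illustration)
that reports both the number of characters (code points) and the number of
UTF-8 bytes. In UTF-8 a code point takes 1 to 4 bytes, so the two numbers
only agree for pure ASCII.

```python
def utf8_lengths(text):
    """Return (code point count, UTF-8 byte count) for text."""
    encoded = text.encode("utf-8")
    # Every byte that is NOT a continuation byte (bit pattern 10xxxxxx)
    # starts a new code point, so counting those bytes counts code points.
    code_points = sum(1 for b in encoded if (b & 0xC0) != 0x80)
    return code_points, len(encoded)


if __name__ == "__main__":
    for sample in ("abc", "\u05e9\u05dc\u05d5\u05dd", "\u20ac"):
        print(repr(sample), utf8_lengths(sample))
```

For the Hebrew sample the byte length is twice the character count, which is
the kind of overshoot mentioned above.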
> Graphemes are what a user (of a language) considers to be one entity. Unicode
> defines them:
>
> http://www.unicode.org/reports/tr29/
>
> We may add code in harfbuzz for that in the future. A cheap heuristic is to
> check for combining-class=0.
Thank you very much. But does HarfBuzz expose such info (the combining
class)? I didn't see anything about it.
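
Just to check that I understand the heuristic, here is a rough sketch of it
in Python. It uses the standard unicodedata module rather than anything from
HarfBuzz, and split_clusters is a name I made up. It starts a new cluster at
every code point whose canonical combining class is 0 and attaches everything
else to the previous cluster. This is only an approximation of TR29: spacing
marks such as the Devanagari vowel signs also have combining class 0, so they
would be split off from their base.

```python
import unicodedata


def split_clusters(text):
    """Approximate grapheme clusters: break before each combining-class-0 code point."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns the canonical combining class;
        # 0 means the code point does not reorder and is treated here as
        # the start of a new cluster.
        if unicodedata.combining(ch) == 0 or not clusters:
            clusters.append(ch)
        else:
            clusters[-1] += ch
    return clusters


if __name__ == "__main__":
    # "e" + COMBINING ACUTE ACCENT, then "a"
    print(split_clusters("e\u0301a"))
    # SHIN + QAMATS + SHIN DOT, then LAMED
    print(split_clusters("\u05e9\u05b8\u05c1\u05dc"))
```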
Thanks a lot,
Tom.