[HarfBuzz] Thai below-base normalization

Martin Hosken mhosken at gmail.com
Mon Feb 3 05:11:49 CET 2014

Dear Richard,

> Note that U+0E3A does occur following upper vowels (U+0E34-7).


> Does this denote a rendering interaction, or is <U+0E3A, U+0E34> just
> the obvious way of entering what someone (who?) says should be <U+0E34,
> U+0E3A>?  <U+0E34, U+0E3A> breaches the rule of marks below and then
> marks above.

Firstly, there is no such rule. It's just a convention.

Secondly, U+0E34 has CCC=0, and therefore U+0E3A can occur before or after it and the <sarcasm>all wonderful</sarcasm> normalization algorithm will sort it out (that is, it will leave both orders exactly as they were entered). Given that in this context the phinthu is modifying the vowel, it seems natural to store it after the vowel (hence U+0E34 U+0E3A). But there can be other language contexts in which the phinthu is modifying the consonant, and so the opposite order would be used. Perhaps this is a case where normalization might have helped, and a less-than-optimal linguistic order would have resulted in clearer data storage. Either way, implementations should handle both orders, since given the current CCC values both orders will survive normalization.
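
The CCC point is easy to check. What follows is only a sketch using Python's standard unicodedata module; the constant and function names are my own for illustration and are not taken from HarfBuzz or any shaper. It shows that both orders of U+0E34 and U+0E3A pass through NFC and NFD unchanged, whereas a below vowel such as U+0E38, which has a non-zero CCC, does get reordered relative to U+0E3A.

```python
import unicodedata

# Illustrative names only; not from any HarfBuzz source.
KO_KAI = "\u0e01"   # THAI CHARACTER KO KAI, used here as a base consonant
SARA_I = "\u0e34"   # upper vowel
SARA_U = "\u0e38"   # below vowel
PHINTHU = "\u0e3a"  # the below-base mark under discussion


def ccc(ch):
    """Canonical combining class of a single character."""
    return unicodedata.combining(ch)


def survives_normalization(text):
    """True if neither NFC nor NFD changes the text."""
    return all(unicodedata.normalize(form, text) == text
               for form in ("NFC", "NFD"))


# Both orders of upper vowel and phinthu.
vowel_first = KO_KAI + SARA_I + PHINTHU
phinthu_first = KO_KAI + PHINTHU + SARA_I

# Contrast case: below vowel followed by phinthu.
below_vowel_first = KO_KAI + SARA_U + PHINTHU
below_normalized = unicodedata.normalize("NFC", below_vowel_first)

if __name__ == "__main__":
    for name, ch in (("U+0E34", SARA_I), ("U+0E38", SARA_U), ("U+0E3A", PHINTHU)):
        print(name, "ccc =", ccc(ch))
    print("vowel first unchanged:", survives_normalization(vowel_first))
    print("phinthu first unchanged:", survives_normalization(phinthu_first))
    print("below vowel case reordered:", below_normalized != below_vowel_first)
```

Because U+0E34 is a starter (CCC=0), canonical reordering never moves U+0E3A across it, so a renderer or a search routine that wants to treat the two sequences alike has to do so itself.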


