[HarfBuzz] Thai below-base normalization

Jonathan Kew jfkthame at googlemail.com
Thu Jan 23 08:13:50 PST 2014


On 23/1/14 15:41, Theppitak Karoonboonyanan wrote:
> On Thu, Jan 23, 2014 at 9:41 PM, Jonathan Kew <jfkthame at googlemail.com> wrote:
>> On 21/1/14 03:56, Theppitak Karoonboonyanan wrote:
>>>
>>> I'm trying to typeset Patani Malay text using Thai script as guided by
>>> the Royal Institute, and I have some problems with Phinthu-modified
>>> consonants combined with a below-base vowel.
>>>
>>> See the sample text captured from the book here:
>>>
>>> http://linux.thai.net/~thep/shots/20140121-patani-sample-marked.jpg
>>
>> Interesting. Are there any cases (in Thai or other languages) where the
>> Phinthu *is* written below the Sara-U or Sara-UU vowel?
>
> AFAIK, no. But my knowledge is limited in this area.
>
>> The comment in hb-unicode-private.hh suggests that Uniscribe behaves the
>> same as (current) HarfBuzz here. There appear to be only two examples of
>> this in our thai-wikipedia word list, but I just checked, and Uniscribe does
>> indeed render the Phinthu dot *below* the vowel in both cases:
>>    <U+0E1B,U+0E3A,U+0E38,U+0E04,U+0E04,U+0E42,U+0E25>
>
> This is a misspelt Pali word. Correction:
>
>   <U+0E1B,U+0E38,U+0E04,U+0E3A,U+0E04,U+0E42,U+0E25>
>
>>    <U+0E28,U+0E23,U+0E32,U+0E27,U+0E3A,U+0E38,U+0E12,U+0E34>
>
> This is also a Sanskrit typo. U+0E3A should be removed to correct it.

OK, thanks - I wondered whether these might be errors, given how rare 
the combination seemed to be.

It's not clear to me, then, why Uniscribe orders these marks the way it
does. (Perhaps there was no good reason, and it was merely an arbitrary
choice of ordering in the absence of any clear requirement?)
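
For reference, the standard Unicode canonical ordering of these marks can
be checked with a short Python sketch that uses only the standard
unicodedata module. It assumes nothing about Uniscribe or HarfBuzz
internals; it only shows that U+0E3A (Phinthu, combining class 9) sorts
before U+0E38 and U+0E39 (Sara U and Sara UU, class 103) under
normalization, so a Phinthu typed after the vowel is moved ahead of it.
The sample string (PO PLA + SARA U + PHINTHU) is an illustrative
assumption, not a word from the list above.

```python
import unicodedata

PHINTHU = "\u0e3a"   # THAI CHARACTER PHINTHU
SARA_U = "\u0e38"    # THAI CHARACTER SARA U
SARA_UU = "\u0e39"   # THAI CHARACTER SARA UU


def describe(text):
    """Return (code point, canonical combining class) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


# Hypothetical sample: base consonant, then vowel, then Phinthu.
typed = "\u0e1b" + SARA_U + PHINTHU

# Canonical reordering sorts adjacent marks by ascending combining class,
# so Phinthu (9) ends up before Sara U (103).
normalized = unicodedata.normalize("NFC", typed)

print(describe(typed))
print(describe(normalized))
```

Whether a shaper should keep that order or put the vowel first again
before positioning the marks is the question this thread raises.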

JK


