[HarfBuzz] Tai Tham / Lanna (iso15924="lana") shaping question

Behdad Esfahbod behdad at behdad.org
Wed May 23 16:15:48 PDT 2012


On 05/23/2012 06:48 PM, Andrew Cunningham wrote:
> I think what Ed is saying is that Tai Tham follows a similar model to Myanmar
> rather than a pure Indic model: medials are distinct from subjoined
> consonants, where subjoined consonants require a virama and medials don't.

I see.  Thanks for the clarification.
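
A rough sketch of that distinction, to make sure we mean the same thing. This
is not HarfBuzz code: the function names are hypothetical, the consonant range
U+1A20..U+1A4C is a simplifying assumption, and the medial set is just the
three code points named in this thread (U+1A55, U+1A56, U+1A57).

```python
SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT, the virama-like joiner

# Assumption: only the medials named in this thread attach without SAKOT.
MEDIALS = {"\u1A55", "\u1A56", "\u1A57"}


def is_consonant(ch):
    # Assumption: treat U+1A20..U+1A4C as the consonant letters.
    return "\u1A20" <= ch <= "\u1A4C"


def cluster_length(text, start=0):
    """Count code points from `start` that form a base consonant plus
    attached medials and SAKOT-subjoined consonants."""
    if start >= len(text) or not is_consonant(text[start]):
        return 0
    i = start + 1
    while i < len(text):
        if text[i] in MEDIALS:
            # Myanmar-style medial: joins the cluster with no SAKOT.
            i += 1
        elif (text[i] == SAKOT and i + 1 < len(text)
              and is_consonant(text[i + 1])):
            # Indic-style subjoined consonant: needs SAKOT before it.
            i += 2
        else:
            break
    return i - start
```

Under these assumptions, U+1A38 U+1A56 (PA, MEDIAL LA) is one cluster of two
code points, while U+1A38 U+1A43 (PA, LA) with no SAKOT splits after PA.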

> Part of a fundamental change between Myanmar in Unicode 4.1 and 5.1

Good to know.  I'll give HB a run on my Myanmar corpus and see if I can fix a
few high-impact issues.

> Will look at my sources to confirm for Tai Tham.

Thanks,
b

> A.
> 
> On Thursday, 24 May 2012, Behdad Esfahbod <behdad at behdad.org> wrote:
>> Hi Thep,
>>
>> Hmm, the message from Ed that you are replying to never made it to me or to
>> the list.  Replies inline.
>>
>>
>> On 05/23/2012 06:53 AM, Theppitak Karoonboonyanan wrote:
>>> Hi, Ed, Behdad,
>>>
>>> On Sun, May 20, 2012 at 3:45 AM, Ed Trager <ed.trager at gmail.com> wrote:
>>>> On Fri, May 18, 2012 at 5:48 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
>>>>> On 05/18/2012 04:02 PM, Ed Trager wrote:
>>>>>>
>>>>>> In Tai Tham, U+1A6E VOWEL SIGN E needs to be shifted all the way to
>>>>>> the left so that the final visual appearance would be:
>>>>>
>>>>> Are you sure?  Without U+1A60 TAI THAM SIGN SAKOT before the subjoined
>>>>> consonant?  Reading Unicode suggests that you need that sign between PA
>>>>> and LA.
>>>>
>>>> For most subjoined consonants, yes, that's true.  But note in
>>>> particular that U+1A56 MEDIAL LA and U+1A57 MEDIAL LA TANG LAI were
>>>> encoded separately.  In the case of these two "LA" signs, I believe
>>>> there are two reasons justifying the separate encoding:
>>>>
>>>> (1) These are variant forms of the same subjoined letter LA:
>>>> apparently, there is no other good way to do it other than encoding
>>>> both.
>>>>
>>>> (2) Both of these LA signs can be part of triple consonant clusters,
>>>> e.g. "KLW" appears in the common Thai / Tai word for banana,
>>>> กล้วย, "klwy".  In Tai Tham, both the L and the W appear as
>>>> below-base stacked forms (and actually the "y" is also a subjoined
>>>> form, but it's kind of hanging off the right side of the whole stack).
>>
>> I'm not questioning the separate encoding.  I don't care :-).  What I'm saying
>> is that you need a SAKOT before them for them to be considered part of the
>> same syllable according to the Indic OpenType spec and my implementation.
>> Now, if you think Unicode intended these to subjoin without a SAKOT, then I'd
>> like you to point me to documentation about that.
>>
>> If that is the case, we would need changes to the Indic machine.  Not
>> impossible, but I first want to make sure that it is indeed the case.
>>
>> behdad
>>
>>
>>
>>>> There are some other separately-encoded subjoining consonant signs:
>>>> U+1A5B, U+1A5C, U+1A5D, U+1A5E.
>>>
>>> Please also count U+1A55 (MEDIAL RA) in the rule, although it's not a
>>> subjoined form.
>>>
>>> Regards,
>>> -Thep.
>> _______________________________________________
>> HarfBuzz mailing list
>> HarfBuzz at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/harfbuzz
>>
> 
> -- 
> Andrew Cunningham
> Senior Project Manager, Research and Development
> Vicnet
> State Library of Victoria
> Australia
> 
> andrewc at vicnet.net.au <mailto:andrewc at vicnet.net.au>
> lang.support at gmail.com <mailto:lang.support at gmail.com>


