[HarfBuzz] Tai Tham / Lanna (iso15924="lana") shaping question

Behdad Esfahbod behdad at behdad.org
Wed May 23 10:06:15 PDT 2012

Hi Thep,

Humm, the message from Ed that you are replying to never made it to me or to
the list.  Replies inline.

On 05/23/2012 06:53 AM, Theppitak Karoonboonyanan wrote:
> Hi, Ed, Behdad,
> On Sun, May 20, 2012 at 3:45 AM, Ed Trager <ed.trager at gmail.com> wrote:
>> On Fri, May 18, 2012 at 5:48 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
>>> On 05/18/2012 04:02 PM, Ed Trager wrote:
>>>> In Tai Tham, U+1A6E VOWEL SIGN E needs to be shifted all the way to
>>>> the left so that the final visual appearance would be:
>>> Are you sure?  Without U+1A60 TAI THAM SIGN SAKOT before the subjoined
>>> consonant?  Reading Unicode suggests that you need that sign between PA and LA.
>> For most subjoined consonants, yes, that's true.  But note in
>> particular that U+1A56 MEDIAL LA and U+1A57 MEDIAL LA TANG LAI were
>> encoded separately.  In the case of these two "LA" signs, I believe
>> there are two reasons justifying the separate encoding:
>> (1) These are variant forms of the same subjoined letter LA:
>> apparently, there is no other good way to do it other than encoding
>> both.
>> (2) Both of these LA signs can be part of triple consonant clusters,
>> e.g. "KLW" appears in the common Thai / Tai word for banana,
>> กล้วย, "klwy" .  In Tai Tham, both the L and the W appear as
>> below-base stacked forms (and actually the "y" is also a subjoined
>> form, but it's kind of hanging off the right side of the whole stack).

I'm not questioning the separate encoding.  I don't care :-).  What I'm saying
is that you need a SAKOT before them for them to be considered part of the
same syllable according to the Indic OpenType spec and my implementation.
Now, if you think Unicode intended these to subjoin without a SAKOT, then I'd
like you to point me to documentation about that.

If that is the case, we would need changes to the Indic machine.  Not
impossible, but I first want to make sure that it is indeed the case.
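
To make the two readings concrete, here is a rough Python sketch of the
difference.  It is only a toy and is not the actual Indic machine: the function
names, the allow_bare_medials flag, and the code point ranges used for
"consonant" and "vowel sign" are assumptions made for illustration.  The set of
medial signs is the one listed in this thread.

```python
SAKOT = 0x1A60  # TAI THAM SIGN SAKOT

# Separately encoded consonant signs named in this thread.
MEDIAL_SIGNS = {0x1A55, 0x1A56, 0x1A57, 0x1A5B, 0x1A5C, 0x1A5D, 0x1A5E}


def is_consonant(cp):
    # Assumption: treat U+1A20..U+1A4C as base consonant letters.
    return 0x1A20 <= cp <= 0x1A4C


def is_vowel_sign(cp):
    # Assumption: treat U+1A61..U+1A73 as dependent vowel signs.
    return 0x1A61 <= cp <= 0x1A73


def split_syllables(text, allow_bare_medials=False):
    """Toy segmentation: a consonant joins the current cluster only after
    SAKOT, unless allow_bare_medials lets the medial signs join directly."""
    cps = [ord(ch) for ch in text]
    syllables, current = [], []
    i = 0
    while i < len(cps):
        cp = cps[i]
        # SAKOT + consonant continues the current cluster.
        if (cp == SAKOT and current
                and i + 1 < len(cps) and is_consonant(cps[i + 1])):
            current += [cp, cps[i + 1]]
            i += 2
            continue
        # Vowel signs (and, optionally, bare medial signs) also continue it.
        if current and (is_vowel_sign(cp)
                        or (allow_bare_medials and cp in MEDIAL_SIGNS)):
            current.append(cp)
            i += 1
            continue
        # Anything else starts a new cluster.
        if current:
            syllables.append(current)
        current = [cp]
        i += 1
    if current:
        syllables.append(current)
    return ["".join(chr(c) for c in s) for s in syllables]


# HIGH PA + MEDIAL LA + VOWEL SIGN E, with no SAKOT in between.
bare = "\u1A38\u1A56\u1A6E"
# HIGH PA + SAKOT + LA + VOWEL SIGN E.
with_sakot = "\u1A38\u1A60\u1A43\u1A6E"

strict_bare = split_syllables(bare)
relaxed_bare = split_syllables(bare, allow_bare_medials=True)
strict_sakot = split_syllables(with_sakot)
```

Under the strict rule the bare sequence breaks into two clusters, so the E has
no chance of moving in front of PA.  With the flag set, or with a SAKOT in the
text, everything stays in one cluster.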


>> There are some other separately-encoded subjoining consonant signs:
>> U+1A5B, U+1A5C, U+1A5D, U+1A5E.
> Please also count U+1A55 (MEDIAL RA) in the rule, although it's not a
> subjoined form.
> Regards,
> -Thep.
