[HarfBuzz] Tai Tham / Lanna (iso15924="lana") shaping question

Behdad Esfahbod behdad at behdad.org
Tue May 22 18:55:01 PDT 2012

On 05/19/2012 04:45 PM, Ed Trager wrote:
> Hi, Behdad et al.!

Hi Ed,

> I'm getting different results depending on how I set up GSUB lookup
> features in my font, so let me first check whether I'm doing things in
> a reasonable way:

Right.  That's entirely expected.

> (1) First, I'm running hb-view specifying "lana" as the script:
> hb-view --script=lana --text-file=test_prefix_vowels.utf8
> Hariphunchai.otf > result.png
> Is this enough, or am I also supposed to specify certain OpenType
> features to make things work correctly?

Enough.  You shouldn't even need to set the script as we autodetect that.
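
For illustration only, the idea behind autodetection can be sketched by
checking whether any character falls in the Tai Tham Unicode block
(U+1A20..U+1AAF).  This is not HarfBuzz code: the helper name guess_script
is hypothetical, and real detection is based on the Unicode Script property
rather than a single block range.

```python
# Hypothetical helper, NOT part of HarfBuzz: guess a script tag from the text.
# Assumption: a plain block-range test stands in for a real Script-property lookup.
TAI_THAM_BLOCK = range(0x1A20, 0x1AB0)  # Unicode Tai Tham block, U+1A20..U+1AAF


def guess_script(text):
    """Return 'lana' if any character is in the Tai Tham block, else None."""
    for ch in text:
        if ord(ch) in TAI_THAM_BLOCK:
            return "lana"
    return None  # undetermined; the caller would have to pass --script itself
```

With a helper like this, a string containing U+1A37 followed by U+1A60 would
be classified as 'lana' without any --script option.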

> (2) Secondly, in my font I currently have lookups for U+1A60 SAKOT +
> <some consonant> which map to the subjoined consonant forms.  For
> these, I currently am using the "ccmp" tag with script set to
> "lana{dflt}".

Since you are relying on the Indic shaper, I suggest you do that in another
feature.  I *think* the 'blwf' feature is right in this case.

However, I can imagine why that wouldn't work with current HarfBuzz.  Currently,
if the OpenType script tag does not end with '2', we reverse the order of the
halant and the consonant.  That needs to be fixed, since I think we want the
"new Indic" behavior for scripts without an old/new OpenType spec distinction.
I can fix that.
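
A minimal sketch of the rule described above, to show why a lookup written as
SAKOT + consonant might never match under the 'lana' tag.  The function names
is_old_spec and order_for_lookups are hypothetical and only mirror the
description in this message; they are not the actual HarfBuzz implementation.

```python
# Sketch only: mirrors the behavior described in this message, not HarfBuzz source.
SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT, which plays the halant role here


def is_old_spec(script_tag):
    """Assumption: any script tag not ending in '2' gets old-spec handling."""
    return not script_tag.endswith("2")


def order_for_lookups(halant, consonant, script_tag):
    """Return the pair in the order the font's GSUB lookups would see it."""
    if is_old_spec(script_tag):
        # Old-spec behavior: the halant is moved after the consonant.
        return (consonant, halant)
    # New-spec behavior: the typed order (halant, consonant) is preserved.
    return (halant, consonant)
```

Under these assumptions, order_for_lookups(SAKOT, "\u1A37", "lana") yields the
consonant first, so a rule expecting SAKOT followed by the consonant would not
apply, while a tag ending in '2' such as "dev2" keeps the typed order.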

I assume Uniscribe does not support Tai Tham.

> 2.1. NOTE: If I change the script from "lana{dflt}" back to
> "DFLT{dflt}", then HarfBuzz gives different results.

Expected, since it wouldn't go through the Indic shaper.  Or do you mean if
you change the font this happens?  If that's the case, it's NOT expected.  At
any rate, you should be using 'lana' not 'DFLT'.

> 2.2. I also tried changing the tag from "ccmp" to "clig" but HarfBuzz
> still does not give expected results.  Is "ccmp" the best tag to use?
> Perhaps someone with more experience can give me some advice on this
> ... ?

See above.

I can debug this if you send me the font (preferably using the 'blwf' feature
mentioned above).


> On Fri, May 18, 2012 at 5:48 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
>> On 05/18/2012 04:02 PM, Ed Trager wrote:
>   ... ... ...
>>> In Tai Tham, U+1A6E VOWEL SIGN E needs to be shifted all the way to
>>> the left so that the final visual appearance would be:
>> Are you sure?  Without U+1A60 TAI THAM SIGN SAKOT before the subjoined
>> consonant?  Reading Unicode suggests that you need that sign between PA and LA.
> For most subjoined consonants, yes, that's true.  But note in
> particular that U+1A56 MEDIAL LA and U+1A57 MEDIAL LA TANG LAI were
> encoded separately.  In the case of these two "LA" signs, I believe
> there are two reasons justifying the separate encoding:
> (1) These are variant forms of the same subjoined letter LA:
> apparently, there is no other good way to do it other than encoding
> both.
> (2) Both of these LA signs can be part of triple consonant clusters,
> e.g. "KLW" appears in the common Thai / Tai word for banana,
> กล้วย, "klwy" .  In Tai Tham, both the L and the W appear as
> below-base stacked forms (and actually the "y" is also a subjoined
> form, but it's kind of hanging off the right side of the whole stack).
> There are some other separately-encoded subjoining consonant signs:
> U+1A5B, U+1A5C, U+1A5D, U+1A5E.
>>  In which case, HarfBuzz will recognize the entire thing as one syllable and
>> you get the vowel sign correctly shifted all the way to the left.
> OK, but I'm still not getting expected results.  For now I've just
> attached a single simple example where there is a single base
> consonant followed directly by U+1A6F VOWEL SIGN AE ; and then a
> subjoined consonant after that.  If Tai Tham is really supposed to be
> entered in phonetic order, then this should be the correct setup.
> However, you can see in the attached image that the subjoined
> consonant U+1A37 remains attached to the transposed vowel sign
> U+1A6F, which is not what is supposed to occur.
> ( Image was generated using hb-view as shown at the top of this email
> and from the font using "ccmp" tag with script "lana{dflt}" )
> Best Wishes -- Ed
>> behdad
>>> "EPL"
>>> (in this email we will ignore the fact that in reality the "L" needs
>>> to be subjoined and hang below the "P")
>>> But what I get from HarfBuzz is only this:
>>> "PEL"
>>> ... which is of course wrong.
>>> Can someone please confirm that, based on the current code, HarfBuzz
>>> at this point in time cannot do anything other than what I have just
>>> described?
>>> _______________________________________________
>>> HarfBuzz mailing list
>>> HarfBuzz at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/harfbuzz
