[HarfBuzz] Tai Tham / Lanna (iso15924="lana") shaping question

Ed Trager ed.trager at gmail.com
Sat May 19 13:45:55 PDT 2012

Hi, Behdad et al.!

I'm getting different results depending on how I set up GSUB lookup
features in my font, so let me first check whether I'm doing things in
a reasonable way:

(1) First, I'm running hb-view specifying "lana" as the script:

hb-view --script=lana --text-file=test_prefix_vowels.utf8 \
    Hariphunchai.otf > result.png

Is this enough, or am I also supposed to specify certain OpenType
features to make things work correctly?

(2) Secondly, in my font I currently have lookups for U+1A60 SAKOT +
<some consonant> which map to the subjoined consonant forms.  For
these, I currently am using the "ccmp" tag with script set to

2.1. NOTE: If I change the script from "lana{dflt}" back to
"DFLT{dflt}", then HarfBuzz gives different results.

2.2. I also tried changing the tag from "ccmp" to "clig" but HarfBuzz
still does not give expected results.  Is "ccmp" the best tag to use?
Perhaps someone with more experience can give me some advice on this
... ?

On Fri, May 18, 2012 at 5:48 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
> On 05/18/2012 04:02 PM, Ed Trager wrote:
  ... ... ...
>> In Tai Tham, U+1A6E VOWEL SIGN E needs to be shifted all the way to
>> the left so that the final visual appearance would be:
> Are you sure?  Without U+1A60 TAI THAM SIGN SAKOT before the subjoined
> consonant?  Reading Unicode suggests that you need that sign between PA and LA.

For most subjoined consonants, yes, that's true.  But note in
particular that U+1A56 MEDIAL LA and U+1A57 MEDIAL LA TANG LAI were
encoded separately.  In the case of these two "LA" signs, I believe
there are two reasons justifying the separate encoding:

(1) These are variant forms of the same subjoined letter LA:
apparently, there is no other good way to do it other than encoding

(2) Both of these LA signs can be part of triple consonant clusters,
e.g. "KLW" appears in the common Thai / Tai word for banana,
กล้วย, "klwy".  In Tai Tham, both the L and the W appear as
below-base stacked forms (and actually the "y" is also a subjoined
form, but it's kind of hanging off the right side of the whole stack).

There are some other separately-encoded subjoining consonant signs:
U+1A5B, U+1A5C, U+1A5D, U+1A5E.
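
For anyone following along without a code chart at hand, here is a
minimal Python sketch that looks up the formal Unicode names of the
code points mentioned above.  It relies only on the standard
"unicodedata" module; the helper name describe() is just something made
up for this illustration, and the names printed depend on the Unicode
version bundled with your Python.

```python
import unicodedata

# Code points mentioned in this email (hex values copied from the text).
CODE_POINTS = [0x1A60, 0x1A56, 0x1A57, 0x1A5B, 0x1A5C, 0x1A5D, 0x1A5E]


def describe(cp):
    """Return 'U+XXXX NAME' using the Unicode database bundled with Python."""
    return "U+%04X %s" % (cp, unicodedata.name(chr(cp), "<unnamed>"))


for cp in CODE_POINTS:
    print(describe(cp))
```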

>  In which case, HarfBuzz will recognize the entire thing as one syllable and
> you get the vowel sign correctly shifted all the way to the left.

OK, but I'm still not getting expected results.  For now I've just
attached a single simple example where there is a single base
consonant followed directly by U+1A6F VOWEL SIGN AE ; and then a
subjoined consonant after that.  If Tai Tham is really supposed to be
entered in phonetic order, then this should be the correct setup.
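
To make the expectation concrete, here is a rough Python sketch of the
reordering I would expect within a single syllable.  This is NOT
HarfBuzz's code or algorithm, just an illustration under my own
assumptions: the set of pre-base vowels is limited to the two signs
named in this thread (U+1A6E and U+1A6F), the function name
visual_order() is made up, and the ASCII letters "P" and "L" are
placeholders for the consonants, as in the "PEL" / "EPL" notation.

```python
# Assumption: only the two pre-base vowel signs named in this thread.
PREBASE_VOWELS = {"\u1A6E", "\u1A6F"}  # VOWEL SIGN E, VOWEL SIGN AE


def visual_order(syllable):
    """Move any pre-base vowel sign to the front of one syllable.

    The input is assumed to be exactly one syllable in phonetic order;
    finding syllable boundaries is out of scope for this sketch.
    """
    vowels = [c for c in syllable if c in PREBASE_VOWELS]
    rest = [c for c in syllable if c not in PREBASE_VOWELS]
    return "".join(vowels + rest)


# Base consonant, then the vowel sign, then the subjoined consonant.
logical = "P" + "\u1A6E" + "L"
print(["U+%04X" % ord(c) if ord(c) > 0x7F else c for c in visual_order(logical)])
```

If the whole cluster is treated as one syllable, the vowel sign ends up
first, ahead of both consonants.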

However you can see (in the attached image) that the subjoined
consonant U+1A37 remains attached to the trans-positioned vowel sign
U+1A6F but this is not what is supposed to occur -- see attached

( Image was generated using hb-view as shown at the top of this email
and from the font using "ccmp" tag with script "lana{dflt}" )

Best Wishes -- Ed

> behdad
>> "EPL"
>> (in this email we will ignore the fact that in reality the "L" needs
>> to be subjoined and hang below the "P")
>> But what I get from HarfBuzz is only this:
>> "PEL"
>> ... which is of course wrong.
>> Can someone please confirm that, based on the current code, the
>> expected behavior of HarfBuzz at this point in time cannot do anything
>> other than what I have just described?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2012.05.19_test_prefix_vowel_001.png
Type: image/png
Size: 57525 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/harfbuzz/attachments/20120519/aa59549d/attachment.png>
