[HarfBuzz] Tai Tham NGA, SAKOT is not Kinzi

Theppitak Karoonboonyanan thep at linux.thai.net
Thu Apr 25 05:22:17 PDT 2013


On Thu, Apr 25, 2013 at 8:55 AM, Richard Wordingham
<richard.wordingham at ntlworld.com> wrote:
> On Sun, 21 Apr 2013 22:07:53 +0700
> Theppitak Karoonboonyanan <thep at linux.thai.net> wrote:
>
>> This could be the solution we're seeking. But how should the font do
>> the signalling?
>
> The method used before was the presence or absence of a certain
> substitution within some feature.
>
> [in another mail]:
>
> According to
> http://lists.freedesktop.org/archives/harfbuzz/2013-January/002842.html ,
> the need for MEDIAL RA to be reordered is signalled by having a lookup
> for it in the 'pref' feature.  (I am not sure of the point of
> implementation of this.)  The same could be done for MAI KANG LAI in
> the 'abvf' feature.

In fact, the signal for that case is the presence of the 'lana' script tag.
HarfBuzz ignores all tables under the 'DFLT' script tag if the 'lana' tag is
present, to avoid conflicts with the fallback rules.

So, what should the contents of 'abvf' be for signalling? Full reordering rules,
or just left empty? The latter is a rather awkward design, IMHO. For the former,
I think it could instead be a single 'ccmp' fallback under both the 'DFLT' and
'lana' script tags, so that HarfBuzz doesn't need to handle Mai Kang Lai at all,
*except* for allowing GSUB rules to be applied across text clusters.

>> This makes me go back and read your thread-starting post more
>> carefully. Yeah, you said Mai Kang Lai is shifted right to the midway
>> between the first and the second consonant. I read that as "shifting
>> school". Probably, we should check how "sangkho" is written in Khuen,
>> then.
>
> I couldn't find any examples.  But I did find an example of อัญเชิญ
> (first line of my Khuen sample in
> http://homepage.ntlworld.com/richard.wordingham/lanna/maikanglai.pdf ),
> with the first nasal spelt with MAI KANG LAI. I think that should be
> good enough.

OK. We can consider Khuen settled, then, unless we find counter-examples
like those in Lanna.

>> I withdraw my claim that it would be less problematic to let Lao shift
>> Mai Kang Lai in the font. I've experimented with the SAKOT-less
>> encoding scheme and I've run into a boundary problem with some words
>> like <SA, MAI KANG LAI, LOW KHA, RA, HIGH TA, NA, HIGH PA,
>> NA, AA, MA> (สงฺฆรตนปณาม). With SAKOT-less encoding scheme,
>> Mai Kang Lai continues being shifted over following consonants.
>> But as the rule comprises multiple stages, the shift is incomplete
>> and causes duplicates of Mai Kang Lai along the rendered text.
>> Getting over this would be tricky.
>
> I assume you're using the ligature substitution (look-up type 4)
> followed by the multiple substitution (look-up type 2).  One solution
> is to use different glyphs for swapped and unswapped MAI KANG LAI.

OK. I've tried it and it works. The problem is that it only works in the
FontForge metrics window, not in HarfBuzz. It looks like HarfBuzz currently
does not allow GSUB rules to be applied across text cluster boundaries (while
old Pango does).
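
For illustration only, here is a toy Python model of why a distinct glyph for
the swapped MAI KANG LAI stops the shift. It is not HarfBuzz's or FontForge's
code and not my actual font rules: the glyph names (MKL, MKL.swapped, KHA_MKL),
the CONSONANTS set and the "rounds" parameter are all hypothetical, and it
assumes the ligature + multiple substitution pair gets more than one chance to
match. It models only the continued shifting, not the duplicated glyphs.

```python
from typing import List

# Hypothetical glyph names for consonants that the mark may be swapped past.
CONSONANTS = {"KHA", "RA", "TA"}


def apply_ligature(glyphs: List[str], mark: str) -> List[str]:
    """Type-4-like pass: <mark, consonant> becomes one ligature glyph."""
    out, i = [], 0
    while i < len(glyphs):
        if (glyphs[i] == mark and i + 1 < len(glyphs)
                and glyphs[i + 1] in CONSONANTS):
            out.append(glyphs[i + 1] + "_MKL")
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def apply_multiple(glyphs: List[str], swapped_mark: str) -> List[str]:
    """Type-2-like pass: ligature glyph becomes <consonant, swapped_mark>."""
    out = []
    for g in glyphs:
        if g.endswith("_MKL"):
            out.extend([g[:-len("_MKL")], swapped_mark])
        else:
            out.append(g)
    return out


def shift(glyphs: List[str], swapped_mark: str, rounds: int) -> List[str]:
    """Apply the swap pair `rounds` times (assumption: it can re-match)."""
    for _ in range(rounds):
        glyphs = apply_multiple(apply_ligature(glyphs, "MKL"), swapped_mark)
    return glyphs


run = ["SA", "MKL", "KHA", "RA", "TA"]
same = shift(run, "MKL", rounds=3)              # output glyph can match again
distinct = shift(run, "MKL.swapped", rounds=3)  # output glyph never re-matches
print(same)
print(distinct)
```

With the same glyph on both sides, the mark keeps matching the input of the
ligature rule and moves one consonant further each round; with a separate
swapped glyph it moves exactly once.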

>> Regarding the question about multiple forms of the same word,
>> it's already true. For example, "sangkho" can be written either:
>> - <HIGH SA, MAI KANG, LOW KHA, E, AA>
>> - <HIGH SA, NGA, SAKOT, LOW KHA, E, AA>
>> - <HIGH SA, MAI KANG LAI, [SAKOT,] LOW KHA, E, AA>
>> What if we accept that the last one can be split into 2 different
>> forms? Just like the multiple forms of "tanglai", it should not be a
>> surprise if there exists a book that explains the several ways to
>> write "sangkho" in Lanna by considering the shifting and non-shifting
>> Mai Kang Lai as different forms.
>
> I think the appropriate character to distinguish the last two forms is
> ZWNJ.

This is a good idea. Use ZWNJ to prevent the shift.
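
To make the candidate encodings concrete, here is a small Python sketch that
builds the "sangkho" sequences listed above from their Unicode character names
and prints the code points. The position of ZWNJ (directly after MAI KANG LAI)
is only my assumption for illustration; it has not been decided in this thread.
The helper name seq() and the dictionary keys are likewise made up.

```python
import unicodedata


def seq(*names: str) -> str:
    """Build a string from Unicode character names."""
    return "".join(unicodedata.lookup(n) for n in names)


HIGH_SA = "TAI THAM LETTER HIGH SA"
LOW_KHA = "TAI THAM LETTER LOW KHA"
NGA = "TAI THAM LETTER NGA"
SAKOT = "TAI THAM SIGN SAKOT"
MAI_KANG = "TAI THAM SIGN MAI KANG"
MAI_KANG_LAI = "TAI THAM SIGN MAI KANG LAI"
E = "TAI THAM VOWEL SIGN E"
AA = "TAI THAM VOWEL SIGN AA"
ZWNJ = "ZERO WIDTH NON-JOINER"

spellings = {
    "mai_kang": seq(HIGH_SA, MAI_KANG, LOW_KHA, E, AA),
    "nga_sakot": seq(HIGH_SA, NGA, SAKOT, LOW_KHA, E, AA),
    "mai_kang_lai": seq(HIGH_SA, MAI_KANG_LAI, LOW_KHA, E, AA),
    "mai_kang_lai_sakot": seq(HIGH_SA, MAI_KANG_LAI, SAKOT, LOW_KHA, E, AA),
    # Assumption: ZWNJ right after MAI KANG LAI marks the non-shifting form.
    "mai_kang_lai_zwnj": seq(HIGH_SA, MAI_KANG_LAI, ZWNJ, LOW_KHA, E, AA),
}

for label, text in spellings.items():
    print(label, " ".join(f"U+{ord(c):04X}" for c in text))
```

The shifting and non-shifting forms then differ only by the ZWNJ, so plain-text
search can still treat them as the same word if it ignores that character.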

> I'm pleased you've found the spelling with plain MAI KANG; I was
> wondering what had happened to it.

I have to say I haven't found any real evidence. I just list it here as a
possibility in principle. At least, I've found "สํโฆ" in Thai script, in
addition to "สงฺโฆ".

Regards,
--
Theppitak Karoonboonyanan
http://linux.thai.net/~thep/


