[HarfBuzz] Editing with TAI THAM SIGN MAI KANG LAI

Richard Wordingham richard.wordingham at ntlworld.com
Sun Jun 9 04:36:48 PDT 2013

Dear List,

I am wrestling with how to render U+1A58 TAI THAM SIGN MAI KANG LAI in a
fashion that does not reduce users' ability to manually edit text
containing it.  The character represents /ŋ/ at the start of a
phonetic consonant sequence, and is generally written above the line of
base consonants.  (Signs above consonants sometimes descend into the
line of base consonants.)

Solutions I have considered seem to affect the text editing interface
with users, and this is not an area I fully understand.  I am therefore
seeking advice.


The basic problem is that the association of the sign with the base
consonants is a matter of style.  Phonetically, it represents a final
consonant, and this has led to some writing it as a mark associated
with the preceding syllable-initial consonant - what I will term the
*unshifted* position. This style is dominant in Burma.  Historically, it
behaved like Burmese kinzi and Devanagari repha, moving to what I
will term the *shifted* position, and this style is dominant in Thailand
and Laos. I lack data on modern usage in China.

Conceptually, the simplest solution would be to have two different
shaping engines, depending on where mai kang lai is to be rendered.
This enables mai kang lai to be positioned as a mark either on the
preceding consonant or on the following consonant.  There are two
complications:

1) The syllable following /ŋ/ may be chained, in which case the two
styles are the same.  I believe this exception can be handled
straightforwardly by the rearrangement code, though I haven't
experimented with it yet.

2) There appears to be an intermediate style in which mai kang lai may
appear on either base consonant.  It occurs in the major Northern
Thai dictionary (MFL).  The selection rule is complex. It is also
possible that the layout rule is simply that mai kang lai is displaced
to the right when there are marks on the following consonant.  The
crucial test would be words with preposed vowels in the second
syllable, but the MFL has no examples.  I have no confidence that this
style is uniform, and therefore the rules cannot be embedded in the
shaping engine.  They therefore need to go in the font.

One possible solution I've been toying with is to map MAI KANG LAI to a
sequence of two glyphs before Indic rearrangement so that the first
glyph is in the unshifted position and the second is in the shifted
position.  The font can then choose which is to have zero extent and
which is to be displayed.
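
As an illustration only, the two-glyph idea can be modelled in a few
lines of Python.  This is a sketch under stated assumptions, not
HarfBuzz code: the glyph names (mkl.unshifted, mkl.shifted, blank), the
rule that the shifted glyph simply moves past one following glyph, and
the use of a blank glyph for "zero extent" are all hypothetical.  The
sketch does not model cluster merging, so the cluster values it reports
are the naive per-character ones.

```python
# Toy model of the two-glyph scheme.  All glyph names are hypothetical.
MKL = "\u1A58"  # TAI THAM SIGN MAI KANG LAI


def expand(text):
    """Map characters to glyph records; MAI KANG LAI becomes two glyphs."""
    glyphs = []
    for i, ch in enumerate(text):
        if ch == MKL:
            glyphs.append({"name": "mkl.unshifted", "cluster": i})
            glyphs.append({"name": "mkl.shifted", "cluster": i})
        else:
            glyphs.append({"name": f"u{ord(ch):04X}", "cluster": i})
    return glyphs


def reorder(glyphs):
    """Move each 'mkl.shifted' glyph past the single glyph that follows it.

    Assumption: the next glyph is the base of the following syllable.
    Real rearrangement would have to skip preposed vowels and the like.
    """
    out = list(glyphs)
    i = 0
    while i < len(out) - 1:
        if out[i]["name"] == "mkl.shifted":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out


def choose(glyphs, style):
    """The font's part: blank out whichever of the pair is not wanted."""
    hidden = "mkl.shifted" if style == "unshifted" else "mkl.unshifted"
    return [dict(g, name="blank") if g["name"] == hidden else g
            for g in glyphs]


if __name__ == "__main__":
    # Two Tai Tham consonants with MAI KANG LAI between them.
    run = reorder(expand("\u1A20\u1A58\u1A23"))
    for style in ("unshifted", "shifted"):
        print(style, [g["name"] for g in choose(run, style)])
```

In either style one glyph of the pair survives as a blank, which is the
"nominal presence of two glyphs" that the question below is about.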

Will there be any problems with editing text because of the
nominal presence of two glyphs?

It may be possible to eliminate the zero extent glyph by ligating it
with another glyph.  What problems would this cause to editing text?


A second possible solution is for rearrangement to put the mai kang lai
glyph in the shifted position and to use a context glyph substitution
(ideally in the abvs feature) to move it to the unshifted position when
appropriate. The mai kang lai glyph would be associated with the
character corresponding to the leftmost base-level glyph of the second
cluster, so possibly part of the base consonant, a preposed vowel, or
MEDIAL RA.  The only serious issue I can see is that one could not cut
and paste the mai kang lai character, but many applications already
make it impossible to cut and paste combining marks from their clusters.
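
The cut-and-paste consequence can be pictured with a small Python
sketch.  It assumes, as a simplification, that an editor only places
the caret at cluster boundaries; the two character-to-cluster mappings
are hypothetical examples for a four-character string (consonant, MAI
KANG LAI, consonant, vowel sign), not output from any real shaper.

```python
def caret_stops(char_to_cluster):
    """Character offsets where a cluster starts, plus the end of text."""
    stops = sorted(set(char_to_cluster))
    stops.append(len(char_to_cluster))
    return stops


def selectable_units(char_to_cluster):
    """Smallest (start, end) character ranges a user could select."""
    stops = caret_stops(char_to_cluster)
    return list(zip(stops, stops[1:]))


# Index 0 = first consonant, 1 = MAI KANG LAI,
# 2 = second consonant, 3 = a vowel sign on the second consonant.
WITH_FIRST_CLUSTER = [0, 0, 2, 2]   # mark clustered with preceding consonant
WITH_SECOND_CLUSTER = [0, 1, 1, 1]  # mark clustered with following syllable

if __name__ == "__main__":
    print(selectable_units(WITH_FIRST_CLUSTER))
    print(selectable_units(WITH_SECOND_CLUSTER))
```

Under this assumption the range (1, 2), which would select MAI KANG LAI
alone, is not available with either mapping; what changes is only which
neighbour the character is selected together with.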

Are there any other problems for editing text?

There may be portability issues with the GSUB lookups.  Complex
context substitutions might be ambiguous.  See my post (made about
the same time as this one) about subtle deviations from the literal
interpretation of the Microsoft GSUB & GPOS specifications, entitled
'Skipping Control for Attaching Marks using OpenType'.

A third possible solution has the tempting consequence of not requiring
any action by the SEA shaper to support MAI KANG LAI.  The shifting,
when required, would be done by a context glyph substitution (ideally in
the abvs feature) to move the glyph from the unshifted position to the
shifted position.  The mai kang lai glyph would then be associated
with the base consonant of the second cluster.  The only serious issue
I can see is that one could not cut and paste the mai kang lai
character, but many applications already make such actions impossible.

Are there any other problems for editing text? 

It does seem necessary to use sequences of lookups for this
rearrangement, as I explain in my post about subtle deviations
referenced above.
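
For readers unfamiliar with why a sequence of lookups is involved: GSUB
has no lookup type that moves a glyph, so a move is usually emulated by
inserting a copy at the destination and then removing the original.
The Python sketch below models that general trick on a list of glyph
names.  It is an assumption-laden illustration, not the scheme from the
referenced post: the glyph names, the BASES set and the two pass
functions are hypothetical, and the chained-syllable exception is not
modelled.

```python
# Toy model of "move by copy, then remove" using two ordered passes.
BASES = {"ka", "kha", "nga"}  # hypothetical base-consonant glyph names


def pass_insert_copy(glyphs):
    """Pass 1: after a base that directly follows 'mkl', add 'mkl.shifted'.

    Stands in for a contextual multiple substitution (base -> base + copy).
    """
    out = []
    for i, g in enumerate(glyphs):
        out.append(g)
        if g in BASES and i > 0 and glyphs[i - 1] == "mkl":
            out.append("mkl.shifted")
    return out


def pass_remove_original(glyphs):
    """Pass 2: drop an 'mkl' whose shifted copy sits two glyphs later.

    Stands in for whatever removes the original, for example ligating it
    with a neighbouring glyph.
    """
    out = []
    for i, g in enumerate(glyphs):
        has_copy = i + 2 < len(glyphs) and glyphs[i + 2] == "mkl.shifted"
        if g == "mkl" and has_copy:
            continue
        out.append(g)
    return out


def shift(glyphs):
    """Apply the two passes in order, as a lookup list would."""
    return pass_remove_original(pass_insert_copy(glyphs))


if __name__ == "__main__":
    print(shift(["ka", "mkl", "kha"]))
```

The order of the passes matters: the second pass keys on the copy that
the first pass inserted, which is the sense in which the lookups form a
sequence rather than a single substitution.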

