[HarfBuzz] Fwd: Properly Rendering (Tamil-)Brahmi Short E/O

Richard Wordingham richard.wordingham at ntlworld.com
Thu Jul 19 23:37:10 UTC 2018


On Fri, 20 Jul 2018 00:04:07 +0200
Vinodh Rajan <vinodh at virtualvinodh.com> wrote:

> I would like to know how to properly render/encode ligatures that
> result from the following sequences in Brahmi.
> 
> CONSONANT + VOWEL-SIGN-E + VIRAMA
> CONSONANT + VOWEL-SIGN-O + VIRAMA
> 
> From §14.1 in the UCS:
> 
> << Tamil Brahmi pulli (virama) had two functions: to cancel the
> inherent vowel of consonants; and to indicate the short vowels [e]
> and [o] in contrast to the long vowels [e:] and [o:] [...]  As a
> consequence, in Tamil Brahmi text, the virama is used not only after
> consonants, but also after the vowels e (U+1100F, U+11042) and o
> (U+11011, U+11044). This pulli is represented using U+11046 brahmi
> virama >> (Pulli := Dot)
> 
> I tried using the 'rlig' feature to represent the sequence and it only
> works for standalone syllables. If they are followed by another
> consonant, HB recognizes the sequence as illegal and keeps inserting
> a dotted circle before the Virama.
> 
> [image: 2018-07-19.png]
> (𑀓𑁄𑁆    𑀓𑁄𑁆𑀓𑀼 :: KO+Virama and KO+Virama+KU)
> 
> As you can see above, while the standalone KO forms the proper
> ligature, HB breaks the syllable before the Virama in the second
> sequence.
> 
> I am not sure if this is due to HB doing incorrect clustering of
> Brahmi syllables or just incorrect OT features in my font.  In any
> case, CONSONANT + VOWEL-SIGN-E/O + VIRAMA must be recognized as a
> valid syllable cluster in Brahmi (at least with the 'oty' - Old Tamil
> locale).

Brahmi is one of the scripts shaped by the Universal Shaping Engine
(USE) and its reimplementations, such as the one in HarfBuzz.  Like Tai
Tham, Brahmi may now get an asterisk in the Microsoft description, for
the script does not follow the USE grammar: in that grammar, a syllable
ending in virama may not include a dependent vowel!  If my memory
serves me right, this is ironic, as Andrew Glass had a lot to do with
the original Brahmi proposal!

The quick solution at the font level is to use a feature that is applied
after syllable boundaries have been dissolved, such as psts, to convert
glyph sequences such as <gSIGN_O, g25CC, gVirama> to <gSIGN_O, gVirama>.
This isn't quite Unicode compliant, as it will also affect text in which
the dotted circle was actually typed, such as <SIGN_O, U+25CC, VIRAMA>.
(One can work round this in a HarfBuzz renderer, but the only solution I
know fails abysmally with Uniscribe/DirectWrite.)  You may have to move
other substitutions from xxxf features (e.g. pstf) to xxxs features
(e.g. psts).
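
As a rough illustration of what such a late substitution does to the
glyph run, here is a minimal Python sketch.  It only models the effect
on a list of glyph names; in a real font this would be a GSUB lookup in
psts, not Python.  The glyph names gSIGN_O, g25CC and gVirama are the
ones used above; gSIGN_E, gKA, gSIGN_U and the function name are
hypothetical, introduced for the example.

```python
# Glyph names follow the message above; gSIGN_E is an assumed name
# for the vowel sign E glyph.
DOTTED_CIRCLE = "g25CC"
VIRAMA = "gVirama"
SHORT_VOWEL_SIGNS = {"gSIGN_E", "gSIGN_O"}


def drop_dotted_circle(glyphs):
    """Rewrite <vowel sign E/O, dotted circle, virama> to
    <vowel sign E/O, virama>, leaving all other glyphs untouched."""
    out = []
    i = 0
    while i < len(glyphs):
        is_unwanted_circle = (
            glyphs[i] == DOTTED_CIRCLE
            and out and out[-1] in SHORT_VOWEL_SIGNS
            and i + 1 < len(glyphs) and glyphs[i + 1] == VIRAMA
        )
        if not is_unwanted_circle:
            out.append(glyphs[i])
        i += 1
    return out


# KO + virama + KU as the shaper is reported to deliver it, with a
# dotted circle inserted before the virama (gKA, gSIGN_U hypothetical).
shaped = ["gKA", "gSIGN_O", "g25CC", "gVirama", "gKA", "gSIGN_U"]
fixed = drop_dotted_circle(shaped)
```

Note that the function cannot tell a dotted circle inserted by the
shaper from one present in the text, which is exactly the compliance
problem mentioned above.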

Richard.

