[HarfBuzz] Dotted Circles in Tai Tham

Roozbeh Pournader roozbeh at google.com
Tue Feb 24 09:26:41 PST 2015


On Tue, Feb 24, 2015 at 5:03 AM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:

> Are we still left with IndicSyllabicCategory.txt as the only
> functional definition of the properties?


Not necessarily. USE seems to use a combination of Indic syllabic, Indic
positional, and general categories, with some codepoints as exceptions.
HarfBuzz has been using some very similar techniques too, with tables
automatically derived from the Unicode data files and then some exceptions
in code.
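
To make the "derived table plus exceptions" idea concrete, here is a minimal Python sketch. It is not HarfBuzz's actual generator. The function names, the sample lines, and the override entry are all assumptions made up for illustration. Only the "code point(s) ; value # comment" line layout follows the Unicode data files.

```python
import re

# Sample lines in the UCD "code point(s) ; value # comment" layout.
# Illustrative excerpt only; consult the real data file for actual values.
SAMPLE = """\
# illustrative excerpt
0902          ; Bindu
0903          ; Visarga
0915..0939    ; Consonant
093E..094C    ; Vowel_Dependent
094D          ; Virama
"""

LINE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_property_lines(text):
    """Expand each 'XXXX' or 'XXXX..YYYY' line into per-code-point entries."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        m = LINE_RE.match(line)
        if m is None:
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2) or m.group(1), 16)
        for cp in range(first, last + 1):
            table[cp] = m.group(3)
    return table


# Hypothetical hand-written exception layered on top of the derived data.
OVERRIDES = {0x0903: "Example_Override"}


def build_table(text, overrides):
    """Derived table first, then exceptions win."""
    table = parse_property_lines(text)
    table.update(overrides)
    return table


TABLE = build_table(SAMPLE, OVERRIDES)
```

Keeping the exceptions in a separate mapping means the derived part can be regenerated from a new Unicode release without losing the manual decisions.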


> 1. Is <consonant><dependent_vowel>_<dependent_vowel> an allowed context
> for a 'Consonant_Medial' if it is allowed for an invisible stacker plus
> consonant?
>
> 2. Is <consonant><dependent_vowel>_# an allowed context for a
> 'Consonant_Medial' if it is allowed for an invisible stacker plus
> consonant?
>
> 3. Are they allowed contexts for 'Consonant_Subjoined' if they are
> allowed for an invisible stacker plus consonant?
>

They could be, as soon as we have evidence that there is a need for allowing
them (if we don't allow them at the moment). Generally, give us the
character sequence that should work and doesn't, and explain why your
sequence is correct according to the Unicode encoding of the script, and
HarfBuzz will get the patterns fixed to allow the character sequence.


> Correction: I checked as I wrote and see that the USE specification was
> released yesterday.  If the blog page is correct, the Universal Shaping
> Engine rejects the phonetic ordering of the Tai Tham encoding model.
> The word /pɛːt/ 'eight' must be encoded <PA, SAKOT, DA, SIGN AE>!  I
> shall be studying the specification today.  At first sight the USE
> appears to reject the current encoding system.
>

USE is really in draft mode IMO. There are several small details that it
doesn't consider properly when it comes to Unicode. With HarfBuzz, we have
been trying to be both closer to Unicode's definition of things, and more
accepting of different sequences (i.e. show fewer dotted circles).

I specifically don't like USE's hard requirements on character ordering,
which may sometimes go against Unicode recommendations. If you find
examples where the Unicode Standard recommends a different order than USE
does, please tell both the HarfBuzz community and the USE authors (contact
Andrew Glass).

> An important question for U+1A7A, U+1A7B and U+1A7C is:
>
> 4. May a 'syllable modifier' be followed by something other than a
> syllable modifier?  The description implies not, which reduces the
> usefulness of what could have been a useful waste bin taxon, sweeping up
> all the pure killers.
>

My expectation in defining the new Syllable_Modifier was that they would
typically occur at the end of syllables, but not necessarily at the very
end. For example, I wouldn't be surprised if a Visarga or Bindu character
follows them. (USE may be more restrictive, but that's their problem.)
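
As a rough illustration of the difference between "must be last" and "near the end", here is a small Python sketch. The one-letter class codes and both patterns are assumptions invented for this example. They are not the patterns used by HarfBuzz or USE.

```python
import re

# Hypothetical one-letter codes for a handful of syllabic categories.
CLASS_CODE = {
    "Consonant": "C",
    "Vowel_Dependent": "V",
    "Syllable_Modifier": "M",
    "Visarga": "F",
    "Bindu": "F",
}

# Strict: nothing may follow a syllable modifier.
STRICT = re.compile(r"CV*M*")
# Lenient: a Visarga or Bindu may still follow the modifiers.
LENIENT = re.compile(r"CV*M*F*")


def is_valid_cluster(categories, pattern):
    """Return True if the whole category sequence matches the pattern."""
    codes = "".join(CLASS_CODE.get(c, "?") for c in categories)
    return pattern.fullmatch(codes) is not None


seq = ["Consonant", "Vowel_Dependent", "Syllable_Modifier", "Visarga"]
strict_ok = is_valid_cluster(seq, STRICT)
lenient_ok = is_valid_cluster(seq, LENIENT)
```

Under these assumed patterns, the sample sequence is rejected by the strict form and accepted by the lenient one. A shaper following the strict form would show a dotted circle for it.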

Still, I may have miscategorized the characters discussed here. They are
quite under-documented in the standard and the proposals anyway. Any help
in understanding them better would be appreciated, especially including
real-world interesting cases that may not fit in the current model. I can
help with getting the clarification into Unicode and fixes into HarfBuzz.

Also, the whole Syllable_Modifier category is sometimes just a catch-all
for some of the weird or underdocumented things that don't easily fall into
any other class. Feel free to suggest splits of the category.

> Please take a look and send me or UTC your suggestions (or file bugs
> > at https://github.com/roozbehp/unicode-data/issues). If there was
> > still a need to change something in HarfBuzz, we can do that too.
>
> By 'sending to the UTC', are you suggesting anything more than a bug
> report or document submission?


No. A bug report or a document submission are the best ways forward.