[HarfBuzz] script segmentation

Richard Wordingham richard.wordingham at ntlworld.com
Thu Feb 15 20:13:43 UTC 2018


On Wed, 14 Feb 2018 11:01:55 +0700
Martin Hosken <mhosken at gmail.com> wrote:

> 1. Do we have a standard algorithm for this?
Well, the obvious fix is a per-block default script, just as some
unassigned characters have a default property of AL or R.  The problem
comes with Indic scripts, though a default of consonant will often work.
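
As a rough sketch of what a per-block default could look like (this is
not HarfBuzz code; the table is a small illustrative subset of blocks,
and the names block_default_script and resolve_script are made up for
the example), a segmenter could fall back to the block's script only
when the character database reports no script for the code point:

```python
from bisect import bisect_right

# Illustrative subset only: (block start, block end, ISO 15924 tag).
# A real table would be generated from the Unicode Blocks.txt file.
BLOCK_DEFAULTS = [
    (0x0590, 0x05FF, "Hebr"),
    (0x0600, 0x06FF, "Arab"),
    (0x0900, 0x097F, "Deva"),
    (0x0E00, 0x0E7F, "Thai"),
    (0x1000, 0x109F, "Mymr"),
]
_STARTS = [start for start, _, _ in BLOCK_DEFAULTS]


def block_default_script(cp):
    """Return the default script for the block containing cp, or None."""
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        _, end, script = BLOCK_DEFAULTS[i]
        if cp <= end:
            return script
    return None


def resolve_script(cp, known_script):
    """Prefer the script the database knows; otherwise use the block default.

    known_script is whatever the caller's Unicode data reports; None or
    "Zzzz" (Unknown) is treated as "no information".
    """
    if known_script not in (None, "Zzzz"):
        return known_script
    return block_default_script(cp) or "Zzzz"
```

With this table, resolve_script(0x05FF, "Zzzz") falls back to "Hebr",
while a code point outside every listed block stays "Zzzz".  It says
nothing about the Indic syllable category, which is the harder part.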

> 2. Do we want one?
I suspect you're the expert.  How well does MultiScribe work on
Windows?  On Apple systems, the answer for ordinary users is to use
AAT, and I suspect that will soon extend to Linux applications courtesy
of HarfBuzz.  I don't know if that would work on ChromeOS.

On the other hand, in the free world it would be nice to test out
OpenType fonts.  Several applications already load HarfBuzz as a Linux
shared object, and one could in principle replace that object with a
build that already includes the new characters.  LibreOffice is one
such application.

> 3. How can we make it more future resilient?

A mechanism that ascribes properties to PUA points could be extended to
unassigned characters in general.
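
To make that concrete, here is a minimal sketch of such a mechanism,
assuming a simple override table consulted before the built-in
character data.  The table contents and the function name
general_category are hypothetical; no existing HarfBuzz API is implied.

```python
import unicodedata

# Hypothetical overrides, e.g. supplied by a font or a user.
# U+E000 is a PUA code point; U+0378 is currently unassigned.
OVERRIDES = {
    0xE000: "Mn",  # treat this PUA point as a nonspacing mark
    0x0378: "Lo",  # treat this unassigned point as a letter
}


def general_category(cp, overrides=OVERRIDES):
    """Return the overridden General_Category if any, else the built-in one."""
    if cp in overrides:
        return overrides[cp]
    return unicodedata.category(chr(cp))
```

The same lookup order would apply to any other property (script,
syllabic category), with the built-in data used only when no override
exists.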

In principle, the USE grammar policeman is a problem.  Combining marks
can usually be identified by an OpenType glyph category of 'mark', but
unassigned combining marks are unlikely to get a security clearance, so
the obvious relaxation will not work.

Richard.

