[HarfBuzz] Why mark glyphs are skipped when MarkBasePos matching?

Richard Wordingham richard.wordingham at ntlworld.com
Mon Mar 25 10:31:02 UTC 2019


On Mon, 25 Mar 2019 17:29:33 +0900
Shusaku KIMURA <skimura0 at gmail.com> wrote:

> Dear developer team.
> 
> I have a question about MarkBasePos matching behavior.  It seems that
> all glyphs classified as mark by GDEF Glyph Class
> Definition Table are skipped when MarkBasePos matching.
> 
> For example, the following glyph sequence is input.  B is a glyph in
> Base Coverage Table, M is a glyph in Mark Coverage Table
> and m is a mark glyph defined by GDEF.
> 
>     B m M
> 
> The implementation of HarfBuzz finds B as base of M even though there
> is m between B and M. If m is not a mark glyph, B and M
> are not matched. I tried to find out the reason of the behavior in the
> specification, but could not.  Can anyone teach me why
> such behavior is needed?

Multiple, non-interacting marks may be applied to a base.  For example,
the Vietnamese vowel plus tone mark U+1EAD LATIN SMALL LETTER A WITH
CIRCUMFLEX AND DOT BELOW could be processed as the sequence of glyphs
for 'a', U+0323 COMBINING DOT BELOW and U+0302 COMBINING CIRCUMFLEX
ACCENT, and the corresponding capital letter can be handled similarly.
Each mark attaches to the base independently, so when the second mark
is positioned the lookup has to look past the first.
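
The decomposition of that letter can be checked with Python's standard
unicodedata module.  This is only a sketch of the character-level
sequence; how a given font maps those characters to glyphs is up to the
font, and the helper name decompose is my own.

```python
import unicodedata

def decompose(ch):
    """Return the canonical decomposition (NFD) of ch as (code point, name) pairs."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c))
            for c in unicodedata.normalize("NFD", ch)]

small = decompose("\u1ead")    # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
capital = decompose("\u1eac")  # the corresponding capital letter

for cp, name in small + capital:
    print(cp, name)
```

In both cases the base letter is followed by two combining marks, one
below and one above.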

One could have two lookups, one ignoring marks above, and one ignoring
marks below, but that requires GDEF to categorise marks as marks above
or as marks below.

Now, one could position the mark above relative to U+0323, but the
relative position would differ depending on whether the base was 'a' or
'A', which increases the complexity of the lookups - one would need a
context lookup to select between a mark-to-mark positioning appropriate
for 'a' and a mark-to-mark positioning appropriate for 'A'.
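
To make the dependence on the base concrete, here is a small numeric
sketch.  Every coordinate in it is invented, the glyph and anchor names
are hypothetical, and it ignores advance widths; it only shows that if
each mark is attached to the base, the position of the mark above
measured from the dot below is not the same for 'a' and 'A'.

```python
# All coordinates are invented font units, purely for illustration.
BASE_ANCHORS = {
    "a": {"above": (250, 480), "below": (250, -20)},
    "A": {"above": (320, 700), "below": (320, -20)},
}
# Each mark has one anchor, tagged with the base anchor class it attaches to.
MARK_ANCHORS = {
    "dot_below": ("below", (0, -50)),
    "mark_above": ("above", (0, 500)),
}

def mark_to_base_offset(base, mark):
    """Offset that moves the mark's anchor onto the base's anchor of the same class."""
    anchor_class, (mx, my) = MARK_ANCHORS[mark]
    bx, by = BASE_ANCHORS[base][anchor_class]
    return (bx - mx, by - my)

def above_relative_to_below(base):
    """Position of the mark above measured from the dot below, on this base."""
    ax, ay = mark_to_base_offset(base, "mark_above")
    dx, dy = mark_to_base_offset(base, "dot_below")
    return (ax - dx, ay - dy)

for base in ("a", "A"):
    print(base, above_relative_to_below(base))
```

A single mark-to-mark lookup can only encode one such offset per pair
of marks, which is why a context lookup would be needed to choose
between them.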

All in all, it is simplest just to ignore intervening marks in
mark-to-base positioning.  The downside is that one then needs to know
which of two contradictory mark positioning lookups applies - that may
have been intended to be a trade secret.  A major complaint about
OpenType is that the syntax of font files is defined, but not their
semantics.
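
As a rough model of the behaviour asked about, the search for the base
can be sketched as below.  This is not HarfBuzz's code: the function
and variable names are hypothetical, and a real shaper also has to
honour lookup flags and other details that are left out here.  The
class numbers are the GDEF glyph class values.

```python
# GDEF glyph class values.
BASE, LIGATURE, MARK = 1, 2, 3

def find_base(glyphs, gdef_class, base_coverage, mark_index):
    """Walk backwards from the mark at mark_index, skipping every glyph
    that GDEF classes as a mark.  Return the index of the first non-mark
    glyph if it is in the base coverage, otherwise None."""
    i = mark_index - 1
    while i >= 0 and gdef_class.get(glyphs[i]) == MARK:
        i -= 1
    if i >= 0 and glyphs[i] in base_coverage:
        return i
    return None

glyphs = ["B", "m", "M"]
base_coverage = {"B"}

# m is a mark according to GDEF: it is skipped and B is found.
with_mark = find_base(glyphs, {"B": BASE, "m": MARK, "M": MARK}, base_coverage, 2)

# m is not a mark: the search stops at m, which is not in the base coverage.
without_mark = find_base(glyphs, {"B": BASE, "m": BASE, "M": MARK}, base_coverage, 2)

print(with_mark, without_mark)
```

With the sequence B m M from the question, B is found as the base of M
only when m is classed as a mark.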

Richard.

