[HarfBuzz] MS/Symbol cmap subtables
Eric Muller
emuller at amazon.com
Mon Jan 15 02:25:15 UTC 2018
It seems that for a font that has only a 3,0 cmap subtable (and maybe
some Macintosh subtables), HB will automatically do the shift by
F000 (in the function get_glyph_from_symbol) for code points below
U+00FF that are not mapped by the subtable.
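To make the behavior concrete, here is a rough Python model of the
fallback as I understand it. It is only a sketch, not HarfBuzz's actual
code: the cmap subtable is stood in for by a plain dict, the font data
(symbol_cmap) is made up, and treating U+00FF itself as inside the
shifted range is an assumption on my part.

```python
SYMBOL_SHIFT = 0xF000  # start of the PUA range used by 3,0 (MS/Symbol) subtables


def get_glyph_from_symbol(cmap, codepoint):
    """Return the glyph ID for codepoint, or None if nothing maps it.

    cmap is a plain dict standing in for a 3,0 cmap subtable.
    """
    # First try the code point exactly as given.
    glyph = cmap.get(codepoint)
    if glyph is not None:
        return glyph
    # Fallback: retry low code points in the U+F000..U+F0FF range.
    # Including U+00FF in this range is an assumption.
    if codepoint <= 0x00FF:
        return cmap.get(SYMBOL_SHIFT + codepoint)
    return None


# Hypothetical symbol font that maps only U+F041.
symbol_cmap = {0xF041: 7}
```

With this model, both U+0041 and U+F041 resolve to the same glyph, while
a code point above the low range that is not in the subtable stays
unmapped.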
It is clear that when U+0041 A is set with a symbol font, that
U+0041 actually has the semantics of a PUA code point, and certainly
should not be treated as an "A". That's the whole point of a 3,0 cmap
subtable.
Consider an HTML page. The font-family is only a request and there is no
guarantee that the actual font will or will not be a symbol font. Thus
the semantics of the HTML page can change depending on the browser
environment. Outside a browser, it seems that the safe treatment is
therefore to consider all code points below U+00FF as PUA, which is
clearly not tenable. So in that environment, I think that the shift
should not be done. Of course, U+F041 should work.
Note that the behavior of Word 2016 on Windows is actually more elaborate:
enter U+0041, and set it with a non-symbol font; copy/paste or save to a
text file, and the result is U+0041; but set this A in a symbol font,
and copy/paste or save to a text file, and the result is U+F041.
I think that the shift should be controllable by the client, rather than
systematically applied. I don't have a strong opinion about the default
behavior (i.e. when HB's client does not specify whether the shift
should be done or not).
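As a sketch of what a client-controlled shift could look like, here is a
small Python model. Everything in it is hypothetical: lookup_glyph and
its symbol_shift parameter are illustrative names, not an existing or
proposed HarfBuzz API, the cmap subtable is a plain dict, and the default
of True is an arbitrary choice made only to mirror the current behavior.

```python
def lookup_glyph(cmap, codepoint, symbol_shift=True):
    """Look up codepoint in cmap, a dict standing in for a 3,0 subtable.

    symbol_shift is a hypothetical client-side switch: when it is True,
    unmapped code points up to U+00FF are retried at U+F000 + codepoint;
    when it is False, only the direct mapping is used.
    """
    glyph = cmap.get(codepoint)
    if glyph is None and symbol_shift and codepoint <= 0x00FF:
        glyph = cmap.get(0xF000 + codepoint)
    return glyph


# Hypothetical symbol font that maps only U+F041.
cmap = {0xF041: 7}

shifted = lookup_glyph(cmap, 0x0041)                        # shift applied
unshifted = lookup_glyph(cmap, 0x0041, symbol_shift=False)  # shift disabled
direct = lookup_glyph(cmap, 0xF041, symbol_shift=False)     # PUA code point
```

A client that cannot know whether the selected font is a symbol font
(the HTML case above) would pass symbol_shift=False, and U+F041 would
still work either way.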
Thoughts?
Thanks,
Eric.