[HarfBuzz] MS/Symbol cmap subtables

Eric Muller emuller at amazon.com
Mon Jan 15 02:25:15 UTC 2018

It seems that with a font that has only a 3,0 cmap subtable (and maybe 
some Macintosh subtables), HB will automatically do the shift by 
F000 (in the function get_glyph_from_symbol) for code points below 
U+00FF that are not mapped by the subtable.
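
To make the behavior concrete, here is a minimal sketch of that fallback, 
in Python rather than HB's own C++. The dict standing in for the cmap 
subtable and the name lookup_symbol_glyph are illustrative assumptions, 
not HB API; treating U+00FF itself as inside the shifted range is also an 
assumption.

```python
from typing import Dict, Optional

# Start of the PUA range that 3,0 (MS/Symbol) cmap subtables conventionally use.
SYMBOL_PUA_BASE = 0xF000


def lookup_symbol_glyph(cmap: Dict[int, int], cp: int) -> Optional[int]:
    """Look up cp directly, then retry at 0xF000 + cp for low code points.

    `cmap` is a hypothetical stand-in for a 3,0 subtable: code point -> glyph id.
    """
    glyph = cmap.get(cp)
    if glyph is not None:
        return glyph
    # Assumed upper bound: code points up to and including U+00FF are shifted.
    if cp <= 0x00FF:
        return cmap.get(SYMBOL_PUA_BASE + cp)
    return None


# A toy symbol font that maps only U+F041 (glyph id 36 is arbitrary).
symbol_cmap = {0xF041: 36}
```

With this toy font, both U+0041 and U+F041 resolve to the same glyph, which 
is the behavior in question.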

It is clear that when U+0041 A is set with a symbol font, then that 
U+0041 actually has the semantics of a PUA code point, and certainly 
should not be treated as an "A". That's the whole point of a 3,0 cmap.

Consider an HTML page. The font-family is only a request and there is no 
guarantee that the actual font will or will not be a symbol font. Thus 
the semantics of the HTML page can change depending on the browser 
environment. Outside a browser, it seems that the safe treatment is 
therefore to consider all code points below U+00FF as PUA, which is 
clearly not tenable. So in that environment, I think that the shift 
should not be done. Of course, U+F041 should work.

Note that the behavior of Word 2016 on Windows is actually more elaborate: 
enter U+0041, and set it with a non-symbol font; copy/paste or save to a 
text file, and the result is U+0041; but set this A in a symbol font, 
and copy/paste or save to a text file, and the result is U+F041.

I think that the shift should be controllable by the client, rather than 
systematically applied. I don't have a strong opinion about the default 
behavior (i.e. when HB's client does not specify whether the shift 
should be done or not).
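
As a rough sketch of what client control could look like (again in Python, 
with a dict standing in for the cmap subtable): the symbol_shift parameter 
and the function name get_glyph are hypothetical, not an existing HB option, 
and the default chosen here is arbitrary.

```python
from typing import Dict, Optional

SYMBOL_PUA_BASE = 0xF000


def get_glyph(cmap: Dict[int, int], cp: int,
              symbol_shift: bool = True) -> Optional[int]:
    """Look up cp; apply the F000 shift only if the client asked for it.

    `symbol_shift` is a hypothetical client-facing switch. Its default
    (True, i.e. always shift) is an arbitrary choice for this sketch.
    """
    glyph = cmap.get(cp)
    if glyph is not None:
        return glyph
    # Assumed bound: code points up to and including U+00FF are candidates.
    if symbol_shift and cp <= 0x00FF:
        return cmap.get(SYMBOL_PUA_BASE + cp)
    return None


# A toy symbol font that maps only U+F041 (glyph id 36 is arbitrary).
symbol_cmap = {0xF041: 36}

shifted = get_glyph(symbol_cmap, 0x0041)                        # shift applied
unshifted = get_glyph(symbol_cmap, 0x0041, symbol_shift=False)  # no shift
direct = get_glyph(symbol_cmap, 0xF041, symbol_shift=False)     # PUA still works
```

A client that cannot know whether the resolved font is a symbol font would 
pass symbol_shift=False, so U+0041 finds no glyph while U+F041 still does.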



