[HarfBuzz] Hebrew composition to presentation forms

Jonathan Kew jfkthame at googlemail.com
Sun Feb 2 01:45:16 PST 2014

On 2/2/14 01:13, Khaled Hosny wrote:
> Hi,
> Someone reported an issue with the hireq placement under the yodh with
> Ezra SIL font[1]. When I checked this, it seemed to be because HarfBuzz
> is composing U+05D9 + U+05B4 to U+FB1D and the font has a glyph for
> U+FB1D that has a not so good placement for hireq.
> I thought this composition is the result of Unicode normalisation, so
> HarfBuzz is doing the right thing, but the comment in
> hb-ot-shape-complex-hebrew.cc:75 indicates otherwise. I’m not very sure,
> but I feel this kind of composition fits better into fallback
> shaping, as is done with Arabic, and is not something to be done
> unconditionally. WDYT?

This is a difficult call. Note that U+FB1D does have a canonical 
decomposition to <U+05D9, U+05B4>; the comment in 
hb-ot-shape-complex-hebrew.cc relates only to the fact that these Hebrew 
presentation forms are excluded from the composition rules for NFC; 
thus, both NFC and NFD representations use the decomposed sequence.
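
To make the normalisation point concrete, here is a small sketch using
Python's standard unicodedata module (nothing here is HarfBuzz code; the
variable names are just for illustration). It shows that U+FB1D has a
canonical decomposition, and that NFC does not compose the pair back
into U+FB1D:

```python
import unicodedata

YOD_HIRIQ = "\uFB1D"             # HEBREW LETTER YOD WITH HIRIQ (presentation form)
YOD, HIRIQ = "\u05D9", "\u05B4"  # HEBREW LETTER YOD, HEBREW POINT HIRIQ

# Canonical decomposition mapping from the Unicode Character Database.
decomp = unicodedata.decomposition(YOD_HIRIQ)

# Both normalisation forms of the precomposed character.
nfd = unicodedata.normalize("NFD", YOD_HIRIQ)
nfc = unicodedata.normalize("NFC", YOD_HIRIQ)

# NFC of the decomposed pair: stays decomposed, because U+FB1D is
# excluded from composition.
nfc_of_pair = unicodedata.normalize("NFC", YOD + HIRIQ)

assert decomp == "05D9 05B4"
assert nfd == nfc == nfc_of_pair == YOD + HIRIQ
```

So a normaliser never produces U+FB1D; any composition to it is a
shaper-level decision, not a consequence of NFC.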

Nevertheless, the two representations *are* canonically equivalent, and 
therefore it's appropriate that they should be rendered the same.

IMO, this is a font bug in Ezra SIL; if a font has positioning rules for 
yod + hireq, and also has a precomposed yod-hireq glyph, the two should 
look identical. A font that gives the impression that entering U+FB1D 
will result in one appearance, while <U+05D9, U+05B4> will result in 
something different, is misleading its users.

As for what HarfBuzz should do: currently, it deliberately uses the 
precomposed Hebrew presentation-form glyphs, because there are many 
(generally older) fonts out there that lack good (or any) mark 
positioning rules, and so decomposed sequences look terrible. Using the 
presentation forms gives a much better result.

However, perhaps we should try to be more sophisticated, and do 
something like "compose to the presentation forms if the font doesn't 
have GPOS mark positioning; otherwise prefer decomposed sequences".
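
As a rough sketch of that idea (this is not how HarfBuzz is implemented;
the table, the function name and the font_has_gpos_mark flag are all
hypothetical, and a real shaper would have to derive that flag from the
font's GPOS table), the decision could look something like:

```python
# Hypothetical lookup: (base, mark) -> presentation-form codepoint.
# Only the pair discussed in this thread is listed.
PRESENTATION_FORMS = {
    (0x05D9, 0x05B4): 0xFB1D,  # yod + hiriq -> yod with hiriq
}


def choose_codepoints(base, mark, font_cmap, font_has_gpos_mark):
    """Return the codepoints to shape for a base + mark pair.

    font_cmap: set of codepoints the font maps to glyphs (assumed input).
    font_has_gpos_mark: True if the font provides GPOS mark positioning
    (assumed to be determined elsewhere).
    """
    composed = PRESENTATION_FORMS.get((base, mark))
    # No presentation form defined, or the font lacks a glyph for it.
    if composed is None or composed not in font_cmap:
        return [base, mark]
    # The font can position marks itself: prefer the decomposed sequence.
    if font_has_gpos_mark:
        return [base, mark]
    # Older font without mark positioning: use the precomposed glyph.
    return [composed]
```

With a font that covers U+FB1D, the pair <U+05D9, U+05B4> would then be
left decomposed when font_has_gpos_mark is true and composed to U+FB1D
when it is false.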

