[HarfBuzz] Decomposition and soft dotted characters

Khaled Hosny khaledhosny at eglug.org
Sun Aug 4 19:33:08 PDT 2013


On Sun, Aug 04, 2013 at 05:54:46PM -0400, Behdad Esfahbod wrote:
> On 13-08-04 03:15 PM, Khaled Hosny wrote:
> > Hi All,
> > 
> > This TeX.SE question[1] shows an “interesting” effect of decomposition
> > and soft dotted characters; Gentium lacks ї (U+0457, CYRILLIC SMALL
> > LETTER YI) so HarfBuzz decomposes it into і (U+0456) and U+0308 so we
> > end up with three dots.
> > 
> > Not sure what to do here though, but IMO unless the font has lookups to
> > remove the soft dot,
> 
> If the font has i and U+0308 but not rules to compose them, it's a font bug.
> Avoiding the sequence is just working around a font bug that cannot be
> detected (easily).

Well, if the ‘dtls’ feature were part of the official OpenType registry
and fonts implemented it, it would be a nice way to support soft dotting
more systematically…

> > such decomposition is wrong and should be avoided
> > as it would just fool applications depending on HarfBuzz output to
> > determine if font fallback is needed or not since, IMO, using a fallback
> > font is better here.
> 
> But then if the user has the i,U+0308 sequence they would hit the bug still
> and there's no way to avoid that.

Good point.

> At any rate, something to reconsider if / when we add itemization support.
> Right now we don't tell apps whether the font supports the sequence or not.
> The app chooses the font and we just do our best at shaping with it.

One could check the cmap to see whether the font supports the code point
(as Firefox does, AFAIK), but then one loses HarfBuzz’s fine-grained
composition/decomposition, which I’m trying to avoid. But for this
particular case I’m convinced it is a font bug.
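
To make the difference between the two checks concrete, here is a rough
Python sketch. It is not HarfBuzz code: `plan`, `risks_extra_dot` and the
toy cmap (a plain set of code points) are made up for illustration, and
SOFT_DOTTED is only a small sample of the Unicode Soft_Dotted property.

```python
import unicodedata

# Small, incomplete sample of code points with the Soft_Dotted property.
SOFT_DOTTED = {0x0069, 0x006A, 0x0456, 0x0458}


def canonical_parts(cp):
    """Return the one-level canonical decomposition of cp, or [] if none."""
    decomp = unicodedata.decomposition(chr(cp))
    # Compatibility decompositions start with a tag such as "<compat>".
    if not decomp or decomp.startswith("<"):
        return []
    return [int(part, 16) for part in decomp.split()]


def plan(cp, cmap):
    """Hypothetical helper: how could cp be rendered with this font?

    cmap is a toy stand-in for the font's cmap: a set of code points.
    """
    if cp in cmap:
        return "direct", [cp]
    parts = canonical_parts(cp)
    if parts and all(p in cmap for p in parts):
        return "decomposed", parts
    return "missing", [cp]


def risks_extra_dot(parts):
    """True if a soft-dotted base is followed by a mark above (ccc 230)."""
    return (
        len(parts) >= 2
        and parts[0] in SOFT_DOTTED
        and unicodedata.combining(chr(parts[1])) == 230
    )


# A font like the one in the question: it has U+0456 and U+0308, not U+0457.
kind, parts = plan(0x0457, {0x0456, 0x0308})
```

A cmap-only check reports U+0457 as unsupported for that font, while the
decomposing path succeeds, and `risks_extra_dot(parts)` flags the sequence
that shows three dots unless the font removes the soft dot.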

Regards,
Khaled


