[HarfBuzz] how to detect missing glyphs e.g. for font substitution

Konstantin Ritt ritt.ks at gmail.com
Mon May 11 08:06:02 PDT 2015


2015-05-11 18:24 GMT+04:00 Louis Semprini <lsemprini at hotmail.com>:

>
> > Hi Louis,
> >
> > In my font engine I start by doing font selection, depending on the
> > presence of glyphs and the encoding, before calling HarfBuzz to shape
> > the string. The process is tedious but simple: break the text into runs
> > by looking for changes in the properties of the text stream:
> > 1. Split text into paragraphs (LayoutText)
> > 2. Split text into fonts
> > 3. Split paragraphs into ranges (LayoutParagraph)
> > 4. Split ranges into possible fonts (I try to keep the number of fonts
> >    to a minimum)
> > 5. Split ranges into lines / words if needed
> >
> > Then I shape each run with HarfBuzz. Each run can have a different font.
> > I'm not saying it's the perfect solution to the problem, but it has
> > worked fine for me, and so far I don't think I have encountered cases
> > where HarfBuzz was missing a glyph in the end. I think that a "missing
> > glyph" callback would not work for me, as it would already be too late
> > and I would have to restart the text layout and font selection from the
> > beginning.
>
> Thanks Sebastian.
>
> I assume when you say "depending on the presence of glyphs" you mean that,
> at some point, you are making an individual call for each code point of the
> input text in order to check whether the proposed font has that code point
> or not, correct?
>
> I can definitely understand the idea that a "missing glyph callback" would
> come too late for some layout engines. However, I still think it's useful
> in some cases to know from the output of HarfBuzz whether there were any
> missing glyphs, especially if speed is important, the missing glyph case
> is rare, and one wants to avoid the extra, expensive per-code-point cmap
> lookup in the common case of no missing glyphs.
>
> So I am hoping some of the HarfBuzz folks here can address the original
> question: does glyph index 0 always mean "missing", and is glyph index 0
> the only way that HarfBuzz indicates "missing"? Or is there some other way
> to interpret the output of hb_shape() to look for missing glyphs?
>

A short answer: glyph index 0 is assumed for glyphs not supported by the
font.

The OT shaper returns just what you gave it via the font_get_glyph(..)
callback. FreeType (FT) basically returns 0 for a missing glyph; however,
some implementations might use special values to indicate input/encoding
errors (see also Pango).
Other shapers (ones that do not use the font_get_glyph(..) callback) could
potentially return anything (e.g. CoreText (CT) could use the index's high
byte for its own internal purposes; Uniscribe has an API to override the
.notdef glyph's value), though HB expects them to return 0 for missing
glyph(s) anyway.
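
To make the check concrete, here is a minimal sketch in C of scanning
shaped output for glyph index 0. The glyph_info struct and
has_missing_glyph() are hypothetical names used only for illustration; the
struct mirrors the two hb_glyph_info_t fields that matter here, and with
HarfBuzz itself the array would come from hb_buffer_get_glyph_infos() after
hb_shape(). It assumes the shaper and font callbacks really do use 0 for
missing glyphs, which, as noted above, is not guaranteed for every backend.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hb_glyph_info_t fields used here.
 * With HarfBuzz, the array comes from hb_buffer_get_glyph_infos(). */
typedef struct {
    uint32_t codepoint; /* after shaping: glyph index (0 assumed = missing) */
    uint32_t cluster;   /* index into the original text */
} glyph_info;

/* Returns true if any shaped glyph has index 0. */
static bool has_missing_glyph(const glyph_info *infos, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (infos[i].codepoint == 0)
            return true;
    }
    return false;
}
```

This is a single linear pass over the output, so in the common case of no
missing glyphs it costs far less than a per-code-point lookup before
shaping.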

Of course, it would be nice if hb_shape() could report additional
status/info about the process somehow, so one could avoid unnecessary
post-shaping checks (for the presence of missing glyph(s), cluster
re-mappings, broken VS or RI sequence(s), etc.).
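
Until something like that exists, one such post-shaping check could look
roughly like the sketch below: it maps glyphs with index 0 back to the text
they came from, so that only those stretches need to be re-shaped with a
substitute font. shaped_glyph and mark_fallback_text() are hypothetical
names, not HarfBuzz API. The sketch assumes glyph index 0 means "missing"
and that cluster values are indices into the input text in whatever units
the text was added to the buffer; a cluster is taken to cover the text from
its own value up to the next larger cluster value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the shaper output (glyph index + cluster). */
typedef struct {
    uint32_t glyph;   /* glyph index; 0 is assumed to mean "missing" */
    uint32_t cluster; /* index of the first text unit of the cluster */
} shaped_glyph;

/* Sets flags[pos] to 1 for every text unit whose cluster contains a
 * glyph 0, and to 0 otherwise. Returns the number of flagged units.
 * Works for either glyph order (LTR or RTL output). */
static size_t mark_fallback_text(const shaped_glyph *glyphs, size_t count,
                                 uint8_t *flags, size_t text_len)
{
    enum { NOT_A_START = 0, START_OK = 1, START_MISSING = 2 };
    size_t flagged = 0;
    uint8_t current = START_OK;

    memset(flags, NOT_A_START, text_len);

    /* Pass 1: record where each cluster starts and whether it has glyph 0. */
    for (size_t i = 0; i < count; i++) {
        uint32_t c = glyphs[i].cluster;
        if (c >= text_len)
            continue; /* defensive: ignore out-of-range cluster values */
        if (glyphs[i].glyph == 0)
            flags[c] = START_MISSING;
        else if (flags[c] == NOT_A_START)
            flags[c] = START_OK;
    }

    /* Pass 2: spread each cluster's state over the text units it covers. */
    for (size_t pos = 0; pos < text_len; pos++) {
        if (flags[pos] != NOT_A_START)
            current = flags[pos];
        flags[pos] = (current == START_MISSING);
        if (flags[pos])
            flagged++;
    }
    return flagged;
}
```

Consecutive flagged units can then be merged into runs and passed through
font selection again; a return value of 0 means no re-shaping is needed.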

Regards,
Konstantin