[HarfBuzz] how to detect missing glyphs e.g. for font substitution

Konstantin Ritt ritt.ks at gmail.com
Mon May 11 14:13:27 PDT 2015


2015-05-12 0:58 GMT+04:00 Louis Semprini <lsemprini at hotmail.com>:

>
>
> > Date: Mon, 11 May 2015 19:44:52 +0100
> > From: richard.wordingham at ntlworld.com
> > To: harfbuzz at lists.freedesktop.org
> > Subject: Re: [HarfBuzz] how to detect missing glyphs e.g. for font
> substitution
> >
> > On Mon, 11 May 2015 07:56:19 +0000
> > Louis Semprini <lsemprini at hotmail.com> wrote:
> >
> > > What is the most reliable and non-font-dependent way to detect
> > > whether a string being shaped by hb_shape() has led to any missing
> > > glyphs, and to identify where those glyphs occur?
> > >
> > > When I use the word "missing glyph" here I mean a glyph that is not
> > > what the user intended for that code point in that context, whether
> > > that be a little tofu box, a magical hex box, a space glyph (with or
> > > without zero advance), a diamond, or anything else that has
> > > substituted for the glyph that the user really wanted.
> >
> > In so far as the glyph is not just a function of the Unicode scalar
> > value, there need not be any indicator. There are a number of
> > fallbacks that may occur even in a well-constructed font:
> >
> > 1) Optional ligatures may be missing from the font.
> >
> > 2) Indic conjuncts may have a fallback form - Devanagari has two levels
> > of fallback.
> >
> > 3) As Konstantin mentioned, the default glyph for the base codepoint may
> > be returned if the requested variation sequence is not supported by the
> > font.
>
> Interesting, but I think I can safely say that 1, 2, and 3 would not fall
> in the category of "missing glyph" for my definition since at least
> something intelligible that relates to the original code point is presented
> (not tofu).  Yes it would be better to substitute a font with the
> capabilities described in 1-3 but that's not the main concern for the
> current question because at least the user still has a chance of figuring
> out the meaning.
>
> The main concern would be cases where what the font presents has no
> relation to the meaning of the original code point.  Can you think of any
> cases like that which would generate glyph indexes other than 0?
>
> > Also, how many glyphs a pair of regional indicators should yield is
> > quite undefined - it may be a choice of the font designer to render
> > them as letters.
>
> Interesting again, but that would be another case where what the user does
> see (either flags or letters) is still meaningful and not tofu, so it won't
> apply to the current question.
>

Actually, it does.
Variation selectors have the Default_Ignorable_Code_Point property, but
regional indicators do not (a bug in Unicode, or was it done
intentionally?). As a result, an invalid or unsupported VS sequence will be
replaced with a non-advancing whitespace, whereas an invalid or unsupported
RI sequence will result in a run of .notdef glyphs (no flags, no letters).
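
For the .notdef case, a minimal sketch of how the affected text ranges could
be located after shaping. It assumes the shaped output has already been
copied into plain (glyph_id, cluster) pairs, e.g. from the codepoint and
cluster fields returned by hb_buffer_get_glyph_infos(), and that cluster
values are indices into the original text. The function name and the sample
glyph ids are made up for illustration. It will not catch the VS case, since
there the fallback is not .notdef.

```python
from typing import Iterable, List, Tuple

NOTDEF = 0  # glyph index 0 is .notdef


def missing_ranges(glyphs: Iterable[Tuple[int, int]],
                   text_len: int) -> List[Tuple[int, int]]:
    """Return [start, end) text ranges whose clusters contain a .notdef glyph.

    glyphs   -- (glyph_id, cluster) pairs in any order (LTR or RTL output)
    text_len -- length of the shaped text, in the same units as the clusters
    """
    glyphs = list(glyphs)
    # Every distinct cluster start, in text order.
    starts = sorted({cluster for _, cluster in glyphs})
    # Clusters that produced at least one .notdef glyph.
    bad = {cluster for glyph_id, cluster in glyphs if glyph_id == NOTDEF}

    ranges: List[Tuple[int, int]] = []
    for i, start in enumerate(starts):
        if start not in bad:
            continue
        # A cluster extends up to the next cluster start, or to end of text.
        end = starts[i + 1] if i + 1 < len(starts) else text_len
        if ranges and ranges[-1][1] == start:
            # Merge with the previous range when the two are adjacent.
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return ranges


# Hypothetical shaped output: an unsupported RI pair at text indices 1 and 2
# came back as two .notdef glyphs between two ordinary glyphs.
shaped = [(36, 0), (0, 1), (0, 2), (51, 3)]
print(missing_ranges(shaped, text_len=4))
```

The returned ranges are the pieces of text that could be re-shaped with a
fallback font.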

Konstantin