[HarfBuzz] Unicode vs glyphs

Tue Jun 14 07:54:10 PDT 2011

On Tue, 2011-06-14 at 16:37 +0200, Eduardo Castiñeyra wrote:
> That's a good question. We have a render engine that expects the string 
> to be passed in Unicode format, as all the drawing methods are based on 
> Unicode indices. Changing it into a glyph based engine is not cheap and 
> I have to provide a good reason to do so.
I see, that sucks :P
> 
> We never needed to work with glyph indices before, for instance, we were 
> using FriBidi to convert from logic to visual, and it returned Unicode 
> indices (fribidi didn't even need to know the font).
That's because the BiDi algorithm has nothing to do with the font.
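To illustrate: the bidi classes the algorithm works from are properties of
the code points themselves, so they can be looked up with no font loaded at
all. A minimal sketch using Python's standard unicodedata module (the sample
string is only an example I made up, not anything from your engine):

```python
import unicodedata

# Sample text: Latin 'a', Hebrew alef, Arabic alef, and a European digit.
sample = "a\u05d0\u06271"

# Bidi class of each character, read straight from the Unicode database.
# No font is involved anywhere in this lookup.
bidi_classes = [unicodedata.bidirectional(ch) for ch in sample]

for ch, cls in zip(sample, bidi_classes):
    print(f"U+{ord(ch):04X} -> {cls}")
```

Reordering is decided from these classes (plus embedding levels), which is
why FriBidi never had to ask which font you were using.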
> But now, with HB or ICU we have to deal with them. The only explanation
> is that those glyphs which have no Unicode index are not present in Arabic
> nor other BiDi scripts, and maybe that's why we can take it that the
> glyphs FriBidi returned are the same numbers as the Unicode indices.
Glyph indexes are completely different from Unicode code points: a code
point identifies the character in Unicode, while a glyph index is a
position in one specific font file that refers to a glyph.

FriBidi does *not* work with glyphs, only with Unicode code points.

As for "indexes that don't have a matching code point": first and
foremost, these appear in most (if not all) of the Indic scripts, and
also in special ligatures, character variants, etc. You really can't
avoid them in a complete implementation.
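
A rough sketch of why such glyphs exist, using a made-up toy font rather
than any real font or the HarfBuzz API. GLYPH_NAMES, CMAP, LIGATURES and
shape() are all hypothetical names for illustration; a real shaper reads
the font's cmap and GSUB tables and does far more than this:

```python
# Hypothetical toy font: a glyph ID is just a position in the glyph table.
GLYPH_NAMES = [".notdef", "f", "i", "f_i"]

# Stand-in for the font's cmap table: Unicode code point -> glyph ID.
CMAP = {0x66: 1, 0x69: 2}

# Stand-in for a ligature substitution rule: glyphs (f, i) -> glyph f_i.
LIGATURES = {(1, 2): 3}


def shape(text):
    """Map code points to glyph IDs, then apply ligature substitution."""
    glyphs = [CMAP.get(ord(ch), 0) for ch in text]  # 0 is .notdef
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])  # two glyphs collapse into one
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def glyphs_without_code_point():
    """Glyph IDs (other than .notdef) that no code point maps to."""
    mapped = set(CMAP.values())
    return [gid for gid in range(1, len(GLYPH_NAMES)) if gid not in mapped]


print(shape("fif"))
print(glyphs_without_code_point())
```

The f_i glyph can only be reached through substitution, so there is no
Unicode value you could hand to a code-point-based drawing call to get it.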
> 
> > I mostly use the Microsoft pages, but if you can find something else,
> > please let me know.
> 
> Could you give me those MS links?
OpenType stuff: http://www.microsoft.com/typography/otspec/
Typography in general: http://www.microsoft.com/typography/default.mspx

--
Tom.