[HarfBuzz] Unicode vs glyphs

Behdad Esfahbod behdad at behdad.org
Tue Jun 14 08:29:29 PDT 2011


Hi Eduardo,

For a general understanding of characters and glyphs, I found this document
from 1998 helpful:

  http://std.dkuug.dk/jtc1/SC2/wg2/docs/TR%2015285%20-%20C027163e.pdf

In general, the mapping between characters and glyphs is not one-to-one.  Even
for Latin and similar scripts, once you get into fine typography (ligatures,
for example), the mapping is not trivial.
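
To make the point above concrete, here is a minimal sketch in Python.  It is a
toy model only: the glyph IDs, the CMAP table and the LIGATURES table are all
made up for illustration and do not come from any real font or from HarfBuzz.

```python
# Toy model: every glyph ID below is hypothetical, not taken from a real font.
CMAP = {"f": 10, "i": 11, "n": 12, "e": 13}   # character -> glyph ID
LIGATURES = {(10, 11): 42}                    # glyphs (f, i) -> one "fi" glyph


def toy_shape(text):
    """Map characters to glyph IDs, then merge known ligature pairs."""
    glyphs = [CMAP[ch] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])   # two characters become one glyph
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


shaped = toy_shape("fine")
print(len("fine"), "characters ->", len(shaped), "glyphs:", shaped)

# Glyph 42 exists only in this font; no single character maps to it.
glyph_to_char = {gid: ch for ch, gid in CMAP.items()}
print(42 in glyph_to_char)
```

Four characters go in and three glyphs come out, and the ligature glyph has no
character of its own to be converted back to.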

What FriBidi does is called "shaping to presentation forms".  It works for
Arabic only because, for historical reasons, Unicode had to encode the
presentation forms of most of the Arabic alphabet.  That approach doesn't work
in general, because most of the glyphs a font needs have no Unicode code point
of their own.
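
As a rough illustration of what "presentation forms" means, the sketch below
uses only Python's standard unicodedata module (it is not FriBidi or HarfBuzz
code).  It takes the letter BEH as an example and shows that each of its four
encoded contextual forms is a separate Unicode character that decomposes back
to the same base letter.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the character stored in the text

# The four presentation forms of BEH in Arabic Presentation Forms-B.
FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

for ch in FORMS:
    # The decomposition looks like "<initial> 0628": a form tag plus the base.
    tag, base = unicodedata.decomposition(ch).split()
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  {tag} -> U+{base}")

# Compatibility normalization folds every form back to the base letter.
print(all(unicodedata.normalize("NFKC", ch) == BEH for ch in FORMS))
```

A shaper that outputs presentation forms can only pick among characters like
these, which is why the trick is limited to the cases Unicode happens to encode.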

behdad

On 06/14/11 07:46, Eduardo Castiñeyra wrote:
> Hi,
> 
> I do not understand why the function hb_shape returns glyphs instead of
> unicode characters. I understand that different fonts could have different
> glyph indices for a given symbol, but Unicode does not depend on the fonts, so
> it would be more useful to return Unicode indices.
> 
> For instance, I used fribidi in the past and it returned the visual string in
> Unicode format. I think that hb_shape does pretty much the same thing. The only
> reason I imagine is that not all the glyphs may have an index in the Unicode
> table, so it would be impossible to represent them all as Unicode indices, but
> I don't know whether it is true.
> 
> Is there a way to convert the given glyphs into Unicode indices?
> 
> Is there any literature about layout processing, so that I don't have to ask
> these kinds of questions on the list? I'm very lost on what to expect from
> i18n libraries.
> 
> Thanks a lot.


