[HarfBuzz] Features, masks and glyph attribution.

Alexander Sabourenkov llxxnntt at gmail.com
Mon Jan 28 11:42:02 PST 2013


Hello.

I'm stuck trying to understand the more general aspects of HarfBuzz
and shaping; reading and tracing the code has reached the point of
diminishing returns.

The task I'm struggling with is this: after calling hb_shape(), map
each resulting glyph to the Unicode code point in the initial string
that caused the glyph in question to be emitted.

I'm sorry if that doesn't parse, let me explain. I have a UCS-2
string, without surrogates, where each character is associated with
some data structure. Let's say it's just an integer value, the index
of that character in the string.

hb_shape() converts that to a sequence of glyphs. How do I know which
glyph corresponds to which character [index]?

I don't think even the order of the glyphs is the same as that of the
characters for RTL scripts. I also suspect that one character may
result in an arbitrary number of glyphs.
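
To make the goal concrete, here is a minimal sketch in plain C of the
bookkeeping I have in mind. It does not call HarfBuzz at all:
attributed_glyph_t, src_index and count_glyphs_per_char() are names
made up for illustration, and the idea that each output glyph could
carry the index of its source character is exactly the assumption I am
asking about.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record, NOT a HarfBuzz type: one output glyph plus the
 * index of the source character it should be attributed to. */
typedef struct {
    unsigned glyph_id;   /* glyph index in the font (illustrative) */
    unsigned src_index;  /* index of the originating character */
} attributed_glyph_t;

/* Count how many glyphs are attributed to each source character.
 * The glyphs may arrive in any order (e.g. reversed for RTL) and one
 * character may own several glyphs. `counts` must have room for
 * n_chars entries. Returns 0 on success, -1 if any glyph refers to a
 * character index outside the string. */
int count_glyphs_per_char(const attributed_glyph_t *glyphs, size_t n_glyphs,
                          unsigned *counts, size_t n_chars)
{
    assert(counts != NULL || n_chars == 0);
    assert(glyphs != NULL || n_glyphs == 0);

    for (size_t i = 0; i < n_chars; i++)
        counts[i] = 0;

    for (size_t i = 0; i < n_glyphs; i++) {
        if (glyphs[i].src_index >= n_chars)
            return -1;  /* attribution points outside the string */
        counts[glyphs[i].src_index]++;
    }
    return 0;
}
```

For a hypothetical three-character string whose output is the four
glyphs {30,2}, {21,1}, {20,1}, {10,0} (visual order reversed, middle
character split in two), this fills counts with 1, 2, 1. What I am
missing is the way to obtain something like src_index from the shaper.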

Reading the code led me to the hypothesis that user-defined
hb_feature_t values can somehow end up in hb_glyph_info_t::mask
(though with no obvious way to extract them). However, after further
work, I'm no longer sure that's possible at all.

Can someone please enlighten me on:

 - is it possible at all in stock HarfBuzz? how?
 - if not, what would be a reasonable way to hack that in, API-wise?
 - what would be the prospects of such a patch being merged?

-- 

./lxnt


