[HarfBuzz] Clustering and Hit Detection

Behdad Esfahbod behdad at behdad.org
Thu Apr 11 10:51:42 PDT 2013


Hi Richard,

Your concerns are certainly valid ones.  We have not addressed justification
in HarfBuzz yet.  However, you are right that some of the information needed
is lost when we do reordering.  It's unfortunate, but that's how OpenType
works.  We may be able to improve the SARA AM case, but there's not much I
can think of doing for the general case.
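
To illustrate the SARA AM case, here is a minimal sketch in plain Python.  It
is not HarfBuzz code and does not call the HarfBuzz API; the function names
and the simplified tone-mark set are assumptions made for the example.  It
splits SARA AM into NIKHAHIT + SARA AA, moves the NIKHAHIT in front of any
preceding tone mark, and records which backing-store index each output
character came from.  Both halves of SARA AM carry the same source index, and
one of them ends up on the far side of the tone mark, so a single cluster
value per source character cannot describe the two-group split used for Thai
justification.

```python
# Simplified model of SARA AM handling; an illustration only, not HarfBuzz code.
SARA_AM = "\u0e33"
NIKHAHIT = "\u0e4d"
SARA_AA = "\u0e32"
# Assumption: only the four Thai tone marks are treated as "tone marks" here.
TONE_MARKS = {"\u0e48", "\u0e49", "\u0e4a", "\u0e4b"}


def decompose_sara_am(text):
    """Return (char, source_index) pairs after splitting and reordering SARA AM."""
    out = []
    for i, ch in enumerate(text):
        if ch == SARA_AM:
            # Place NIKHAHIT before any tone marks that directly precede SARA AM.
            j = len(out)
            while j > 0 and out[j - 1][0] in TONE_MARKS:
                j -= 1
            out.insert(j, (NIKHAHIT, i))
            out.append((SARA_AA, i))
        else:
            out.append((ch, i))
    return out


def split_before_sara_aa(pairs):
    """Split the reordered pairs into groups, starting a new group at SARA AA."""
    groups = [[]]
    for ch, src in pairs:
        if ch == SARA_AA and groups[-1]:
            groups.append([])
        groups[-1].append((ch, src))
    return groups


def shared_sources(groups):
    """Source indices that appear in more than one group."""
    seen, shared = set(), set()
    for group in groups:
        indices = {src for _, src in group}
        shared |= seen & indices
        seen |= indices
    return shared


if __name__ == "__main__":
    word = "\u0e19\u0e49\u0e33"  # NO NU, MAI THO, SARA AM
    pairs = decompose_sara_am(word)
    for ch, src in pairs:
        print(f"U+{ord(ch):04X} from source index {src}")
    print("source indices spanning both groups:",
          shared_sources(split_before_sara_aa(pairs)))
```

In this model the source indices come out in a non-monotonic order, which is
why reporting the whole word as one cluster is the easy answer.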

behdad

On 13-04-06 11:53 AM, Richard Wordingham wrote:
> Dear List,
> 
> I understood that one of the reasons for using a shaping engine to
> sequence glyphs rather than a sequence of substitutions was so that
> selection of glyphs at the visual level could select the appropriate
> characters in backing store.  However, it seems that default extended
> grapheme clusters, and their extensions by a subjoined consonant, are
> reported as an indivisible cluster.  This seems to make it very
> difficult to work back from glyph to character.  Is there, therefore,
> any reason not to effectively implement lower level reorderings as
> substitutions of the form a b -> b a?
> 
> I do see a related problem.  Thai has a justification mode ('Thai
> justification') in which spaces are increased between letters.  Preposed
> and postposed vowels count as letters.  An issue appears to arise with
> words like น้ำ <U+0E19 THAI CHARACTER NO NU, U+0E49 THAI CHARACTER MAI
> THO, U+0E33 THAI CHARACTER SARA AM>.  LibreOffice 4.0.2.1 justifies
> this as though it were composed of two clusters, <U+0E19, U+0E4D THAI
> CHARACTER NIKHAHIT, U+0E49> and <U+0E32 THAI CHARACTER SARA AA>.
> However, HarfBuzz declares the word to be one cluster.  How is a
> renderer using HarfBuzz expected to perform Thai justification on such
> a word?
> 
> There *may* be an even worse issue with Tai Tham.  If that is to use
> Thai justification, preposed vowels (general category Mc) and
> following vowels (also Mc) will need to have gaps inserted between them
> and the consonant, but HarfBuzz gives no clue as to where the gap
> occurs.  I don't know whether Thai justification should occur with Tai
> Tham; pre-Unicode fonts that I have seen generally use ASCII character
> codes for some of the glyphs, and that may inhibit Thai justification.
> 
> Richard.

-- 
behdad
http://behdad.org/


