[HarfBuzz] Control characters inside ligatures
Khaled Hosny
khaledhosny at eglug.org
Mon Dec 7 01:53:36 PST 2015
On Mon, Dec 07, 2015 at 09:14:19AM +0100, Behdad Esfahbod wrote:
> On 15-12-05 03:31 PM, Khaled Hosny wrote:
> > Hi,
> >
> > I just noticed that when there is a control character between characters
> > that form a ligature, there is a zero width space after the ligature
> > with a cluster value of the first character in the ligature, for
> > example:
> >
> > $ hb-unicode-encode U+0066,U+200C,U+0069 | hb-shape amiri-regular.ttf
> > [f_i=0+1064|space=0+0]
> >
> > or
> >
> > $ hb-unicode-encode U+0066,U+00AD,U+0069 | hb-shape amiri-regular.ttf
> > [f_i=0+1064|space=0+0]
> >
> > This is rather surprising as I was expecting the control character to be
> > consumed inside the ligature and only the ligature glyph would remain. I
> > think the current behaviour makes mapping glyphs to text indices harder
> > in this case. WDYT?
>
> I don't think it makes any difference. It's a zero-width glyph, so it
> contributes nothing to the cluster as a whole, so you still have to divide the
> sum of the widths of the glyphs by the number of cursor stops and that works
> the same both ways. No?
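The arithmetic described above can be sketched roughly as follows. This is plain Python working on the hb-shape text output quoted earlier, not HarfBuzz API: the function names are made up for illustration, the parser handles only the name=cluster+advance form seen in this thread, the "[f_i=0+1064]" line is the hypothetical output without the extra glyph, and two cursor stops for the ligature is an assumption.

```python
import re

# One item of hb-shape's default text output: glyphname=cluster+advance.
# Items carrying offsets or other fields are deliberately not handled here.
ITEM = re.compile(r"([^=|]+)=(\d+)\+(-?\d+)")

def parse_hb_shape(line):
    """Turn "[f_i=0+1064|space=0+0]" into (name, cluster, advance) tuples."""
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        m = ITEM.fullmatch(item)
        if m is None:
            raise ValueError(f"unsupported item: {item!r}")
        glyphs.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return glyphs

def cluster_widths(glyphs):
    """Sum the advances of all glyphs that share a cluster value."""
    widths = {}
    for _name, cluster, advance in glyphs:
        widths[cluster] = widths.get(cluster, 0) + advance
    return widths

def caret_offsets(width, stops):
    """Divide a cluster's width evenly among its cursor stops."""
    return [width * i / stops for i in range(stops + 1)]

# Actual output from the thread, and the hypothetical output with the
# control character consumed by the ligature.
with_space = cluster_widths(parse_hb_shape("[f_i=0+1064|space=0+0]"))
without_space = cluster_widths(parse_hb_shape("[f_i=0+1064]"))

print(with_space == without_space)        # True
print(caret_offsets(with_space[0], 2))    # [0.0, 532.0, 1064.0]
```

The zero-advance glyph adds nothing to the cluster width, so the caret offsets come out the same in both cases.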
I was thinking in terms of line breaks: since the soft hyphen is a break
opportunity, I need to know that the sequence <f><soft hyphen><i> became
the <fi> glyph, but I’m not sure how to do that with the extra glyph
carrying the same cluster value. But maybe I’m looking at it from the wrong
angle, and I simply need to reshape the left side (probably with a real
hyphen) and the right side and just break the line there.
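A rough sketch of that reshaping idea, in plain Python and without calling HarfBuzz at all: split the text at the chosen soft hyphen, append a visible hyphen to the left part, and shape the two parts separately. The function name is hypothetical, and using U+002D as the "real hyphen" is an assumption (U+2010 would be another candidate, depending on the font).

```python
SOFT_HYPHEN = "\u00AD"

def split_at_soft_hyphen(text, index, hyphen="-"):
    """Split text at the soft hyphen found at position `index`.

    Returns (left, right): the left part ends with a visible hyphen in
    place of the soft hyphen, the right part starts after it. Each part
    is meant to be shaped on its own, so no ligature can form across
    the line break.
    """
    if text[index] != SOFT_HYPHEN:
        raise ValueError("no soft hyphen at the given index")
    left = text[:index] + hyphen
    right = text[index + 1:]
    return left, right

# <f><soft hyphen><i> from the example in this thread
left, right = split_at_soft_hyphen("f\u00ADi", 1)
print(repr(left), repr(right))   # 'f-' 'i'
```

With this approach the cluster value of the extra glyph never has to be interpreted: the break position comes from the text, and the two runs are shaped independently.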
Regards,
Khaled