[HarfBuzz] Zero-width joiner has width
Simon Cozens
simon at simon-cozens.org
Sun Aug 2 09:45:09 PDT 2015
Here's an interesting one I came across when implementing Uyghur
hyphenation. The trick in hyphenated Uyghur is to use a ZWJ to ensure
that the last character of hyphenated Arabic morphemes remains in medial
form. However, when I send an Arabic + ZWJ + hyphen sequence to HarfBuzz,
the output contains a "space" glyph with non-zero width between the
hyphen and the Arabic:
> zwj = SU.utf8char(0x200d)
> text = "تئەۋ" .. zwj .. "-"
> SILE.shaper:shapeToken(text, SILE.font.loadDefaults({ font = "Amiri",
direction = "RTL" }))
{
  {
    codepoint = 16,
    depth = -1.943359375,
    height = 2.666015625,
    name = "hyphen",
    width = 3.681640625,
  },
  {
    codepoint = 3,
    depth = 0,
    height = 0,
    name = "space",
    width = 2.9296875,
  },
  {
    codepoint = 552,
    depth = 2.24609375,
    height = 6.2841796875,
    name = "uni06CB",
    width = 4.0087890625,
  },
  {
    codepoint = 2226,
    depth = 0.048828125,
    height = 4.580078125,
    name = "uni06D5.fina",
    width = 3.7939453125,
  },
  {
    codepoint = 3024,
    depth = 0.0048828125,
    height = 5.078125,
    name = "uni0626.medi_BaaBaaInit",
    width = 1.6845703125,
  },
  {
    codepoint = 3732,
    depth = 0.0634765625,
    height = 4.8779296875,
    name = "uni062A.init_BaaBaaIsol",
    width = 3.193359375,
  },
}
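To put a number on the gap, here is a small standalone Python sketch (not SILE or HarfBuzz code) that adds up the widths reported above, with and without the "space" glyph. The glyph names and widths are copied from the output; the helper name total_advance is made up for illustration.

```python
# Advance widths copied from the shaper output above, in output order.
glyphs = [
    ("hyphen", 3.681640625),
    ("space", 2.9296875),
    ("uni06CB", 4.0087890625),
    ("uni06D5.fina", 3.7939453125),
    ("uni0626.medi_BaaBaaInit", 1.6845703125),
    ("uni062A.init_BaaBaaIsol", 3.193359375),
]


def total_advance(records, skip=()):
    """Sum the widths of all glyph records whose name is not in `skip`.

    Hypothetical helper, only for this illustration.
    """
    return sum(width for name, width in records if name not in skip)


# Width of the run as shaped, including the glyph produced for the ZWJ.
shaped = total_advance(glyphs)

# Width of the run if the ZWJ had contributed no advance at all.
without_space = total_advance(glyphs, skip=("space",))

# The difference is the unwanted gap between the hyphen and the Arabic.
extra = shaped - without_space
print(shaped, without_space, extra)
```

The difference is exactly the width of the "space" glyph, which is the gap that shows up before the hyphen.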
Reducing the case further, to the ZWJ on its own:
> SILE.shaper:shapeToken(zwj, SILE.font.loadDefaults({ font = "Amiri",
direction = "RTL" }))
{
  {
    codepoint = 3,
    depth = 0,
    height = 0,
    name = "space",
    width = 2.9296875,
  },
}
I would have hoped that a zero-width joiner had... zero width.
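
For reference, the character in question can be inspected independently of any shaper. This is a minimal Python sketch using only the standard unicodedata module; it reconstructs the test string from the code points implied by the glyph names above (U+062A, U+0626, U+06D5, U+06CB), which is an assumption about the exact input.

```python
import unicodedata

# U+200D ZERO WIDTH JOINER, the character the shaper turned into "space".
zwj = "\u200d"

# The test string, rebuilt from the code points implied by the glyph names
# in the output above (an assumption about the exact input text).
text = "\u062A\u0626\u06D5\u06CB" + zwj + "-"

# Unicode classifies ZWJ as a format character (general category Cf),
# i.e. it is not a spacing character like U+0020.
zwj_name = unicodedata.name(zwj)
zwj_category = unicodedata.category(zwj)
space_category = unicodedata.category(" ")

print(zwj_name, zwj_category, space_category)
print([f"U+{ord(ch):04X}" for ch in text])
```

So at the character level the ZWJ is a format control, not a space separator; the visible width only appears after shaping.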