[HarfBuzz] Fwd: Harfbuzz with linebreaking

Simon Cozens simon at simon-cozens.org
Tue Jun 14 02:53:22 UTC 2016


On 14/06/2016 12:42, Kelvin Ma wrote:
> What I need is something to bridge that gap between the 1-line of
> unbroken text that harfbuzz generates, and the fragments I need to be
> able to assemble a multi-line paragraph.

Right. You need that, but it's not Harfbuzz's job. Write some code. :-)

> The only way to get these
> pieces is to find the spots in the shaped text where the whole line can
> be shaped in two pieces with an identical result.

Wrong. What you need to find is the potential line breaks. That's not a
shaping issue specifically; it's a text issue, and needs to be dealt
with at the text level.

Taking the example of a ligature, it *is* allowable to break (with
hyphenation) in the middle of a ligature like "fi". Indeed, your
justification engine might decide, for the good of the rest of the lines
in the paragraph, that this is the best place to break. If all you are
dealing with is the glyph output from Harfbuzz, you won't be able to
spot that breakpoint.
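
As a rough illustration of that point, here is a minimal sketch in Python. It does not call Harfbuzz: the cluster list is hand-written to look like what a shaper might return for "selfish" with an "fi" ligature, and the hyphenation point is supplied by hand. The function name hidden_breaks is hypothetical.

```python
def hidden_breaks(cluster_starts, text_len, break_points):
    """Return the text-level break points that fall inside a glyph cluster.

    cluster_starts: text indices at which a glyph cluster begins
                    (assumed to come from the shaper output).
    break_points:   text indices before which a break is allowed
                    (assumed to come from a hyphenator or break iterator).
    """
    starts = set(cluster_starts)
    return [b for b in break_points if 0 < b < text_len and b not in starts]


# "selfish": s=0 e=1 l=2 f=3 i=4 s=5 h=6
# Assumed shaping result: glyphs s, e, l, fi, s, h, so the "fi" ligature
# covers text indices 3 and 4 and no glyph starts at index 4.
TEXT = "selfish"
CLUSTER_STARTS = [0, 1, 2, 3, 5, 6]

# Assumed hyphenation "self-ish": a break is allowed before index 4.
HYPHENATION_POINTS = [4]

if __name__ == "__main__":
    print(hidden_breaks(CLUSTER_STARTS, len(TEXT), HYPHENATION_POINTS))
```

The break before index 4 is perfectly legal at the text level, but no glyph boundary corresponds to it. Code that only looks at glyphs never sees it; code that tracks the text can see it and reshape the two halves separately if it chooses that break.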

Once you get into non-Latin scripts, things get worse. Finding
breakpoints depends entirely on the rules of the script or language
that your text is written in. Right now I'm fighting with Javanese,
where line breaks are permissible at the end of syllables. You need to
parse the text, not the glyphs, to determine the appropriate breaks.
Like others have said: use ICU or similar.

And so you need to deal with two sets of information at the same time:
the text-level information about breaks, and the shaper-level
information about glyphs. This is why Harfbuzz returns a cluster value
with each glyph: an index into your text string, so that you can keep
those two sets of information in sync. The hard part of writing a
typesetting system is dealing with the interplay between those two
representations of a text.
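
One way to picture keeping the two in sync is the sketch below, again in Python and again without calling Harfbuzz. The glyph list is a hand-written stand-in for shaper output, where each glyph carries the text index at which its cluster starts. The break finder is a deliberately naive toy (break after spaces); real code would use ICU or similar. The sketch assumes left-to-right text with non-decreasing cluster values, and the function names are hypothetical.

```python
# Hand-written stand-in for shaper output: (glyph name, cluster) pairs.
# Assumes left-to-right text with non-decreasing cluster values.
TEXT = "a fine day"
GLYPHS = [("a", 0), ("space", 1), ("fi", 2), ("n", 4), ("e", 5),
          ("space", 6), ("d", 7), ("a", 8), ("y", 9)]


def text_breaks(text):
    """Toy break finder: allow a break after each space.

    This only stands in for a real text-level break iterator.
    """
    return [i + 1 for i, ch in enumerate(text)
            if ch == " " and i + 1 < len(text)]


def glyph_index_for_break(glyphs, text_index):
    """Return the index of the glyph whose cluster starts at text_index.

    Returns None when the break falls inside a cluster, which means the
    pieces on either side would have to be reshaped separately.
    """
    for glyph_index, (_name, cluster) in enumerate(glyphs):
        if cluster == text_index:
            return glyph_index
        if cluster > text_index:
            break
    return None


if __name__ == "__main__":
    for b in text_breaks(TEXT):
        print(b, glyph_index_for_break(GLYPHS, b))
```

The break opportunities are computed from the text alone; the cluster values are only used afterwards, to find where each opportunity lands in the glyph run.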

It took me quite a while to get my head around this, and a lot of help
from others. You can see the record of me banging my head against this
particular wall at https://github.com/simoncozens/sile/issues/179 ,
which has a nice explanation of the issues involved.

