[HarfBuzz] Interaction of shaping and line-breaking
jjc at jclark.com
Thu Feb 13 04:36:51 CET 2014
On Thu, Feb 13, 2014 at 5:48 AM, Behdad Esfahbod <behdad at behdad.org> wrote:
> In short: You shape the whole paragraph using HarfBuzz, then create a
> line-break iterator with ICU and walk the text and glyph string together
> to find the best break opportunity to fill the line. That can be
> mid-HarfBuzz-cluster. After that, you reshape the line, but can continue
> the original shaped-paragraph for the next lines.
> Can be made twice faster, yes. We're not quite there yet.

You have hinted before that you have something cooking here. I am very
curious what it is. I am interested less in the potential speed-up, and
more in the ability to do Knuth/Plass line-breaking correctly (which, of
course, requires you to be able to determine the exact lengths of lines
between each pair of potential breakpoints, before deciding which
breakpoints to use).

In a little experimental project I've been working on (in pure JS, so not
using harfbuzz at the moment), I have been playing with the following
approach. I maintain two flags on each glyph:

- affectedByPreceding - true iff the shaping of this glyph, or any glyph
following this glyph, has been affected by any glyph preceding this glyph;
- affectedByFollowing - true iff the shaping of this glyph, or any glyph
preceding this glyph, has been affected by any glyph following this glyph.

If affectedByPreceding is false on a glyph, then that glyph and the
following glyphs have been shaped just as if there were nothing before that
glyph. Similarly, if affectedByFollowing is false on a glyph, then that
glyph and the preceding glyphs have been shaped just as if there were
nothing after the glyph.
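
A minimal sketch of how the two flags might be stored and updated, written in Python rather than the JS of the project mentioned above. Everything here is an assumption made for illustration: the `Glyph` record, the `record_lookup` helper, and in particular the update rule (one glyph in the matched context is changed, and it is taken to depend on every other glyph in that context). It is not taken from HarfBuzz or from the actual experiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    name: str
    advance: int
    affected_by_preceding: bool = False  # affectedByPreceding in the text
    affected_by_following: bool = False  # affectedByFollowing in the text


def record_lookup(glyphs: List[Glyph], ctx_start: int, ctx_end: int, changed: int) -> None:
    """Note that a lookup matched glyphs[ctx_start:ctx_end] and changed glyphs[changed].

    Assumed rule (hypothetical): only glyphs[changed] was altered, and it
    depended on every other glyph in the matched context.
    """
    # Glyphs ctx_start+1 .. changed have a changed glyph at or after them
    # that depended on something before them.
    for i in range(ctx_start + 1, changed + 1):
        glyphs[i].affected_by_preceding = True
    # Glyphs changed .. ctx_end-2 have a changed glyph at or before them
    # that depended on something after them.
    for i in range(changed, ctx_end - 1):
        glyphs[i].affected_by_following = True


# Toy run for "To be": pair kerning shortens the advance of "T" because "o" follows.
run = [Glyph("T", 600), Glyph("o", 550), Glyph("space", 250),
       Glyph("b", 560), Glyph("e", 520)]
run[0].advance -= 40
record_lookup(run, ctx_start=0, ctx_end=2, changed=0)
```

In this toy run only "T" ends up with `affected_by_following` set, so a line ending after "T" would need its end reshaped, while a line starting at "o" could reuse the paragraph's glyphs unchanged.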
As lookups modify the glyph sequence, these flags are set based on the
kind of lookup and the context. I can then use these flags to determine how
much of the line, if any, needs to be reshaped. For example, if the first
character after a line-break has the affectedByPreceding flag false, then I
don't need to do any reshaping of the start of the line after the
line-break.

This approach is not perfect: I can come up with artificial examples where
it will fail (in the sense of giving a different result from reshaping the
entire line). But it seems to work OK, at least for the simple examples.
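
The reshaping check described above could be sketched roughly as follows, again in Python. The `Glyph` record and the `reshape_ranges` function are hypothetical names, and the rule used (scan inward from each end of the line until a glyph with a clear flag is found) is one plausible reading of the description, with the same limitations already noted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Glyph:
    name: str
    affected_by_preceding: bool = False
    affected_by_following: bool = False


def reshape_ranges(glyphs: List[Glyph], start: int, end: int) -> Tuple[range, range]:
    """For a line made of glyphs[start:end], return (head, tail) index ranges
    that appear to need reshaping; an empty range means nothing to redo."""
    # Head: advance to the first glyph that was shaped as if nothing preceded it.
    head_end = start
    while head_end < end and glyphs[head_end].affected_by_preceding:
        head_end += 1
    # Tail: step back to just after the last glyph shaped as if nothing followed it.
    tail_start = end
    while tail_start > head_end and glyphs[tail_start - 1].affected_by_following:
        tail_start -= 1
    return range(start, head_end), range(tail_start, end)


# Six glyphs where shaping crossed the boundary between index 2 and index 3.
glyphs = [Glyph(n) for n in "abcdef"]
glyphs[2].affected_by_following = True
glyphs[3].affected_by_preceding = True

head1, tail1 = reshape_ranges(glyphs, 0, 3)  # first line: glyphs 0..2
head2, tail2 = reshape_ranges(glyphs, 3, 6)  # second line: glyphs 3..5
```

An empty head range corresponds to the case in the text where the first glyph after the break has affectedByPreceding false, so the start of that line can be reused as shaped.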
So I was wondering whether your approach was similar to this, or whether
you are doing something completely different (such as some sort of static
analysis of the tables?).