[Fribidi-discuss] Bug in wrapping in command-line fribidi?

Beni Cherniavsky cben at techunix.technion.ac.il
Mon Mar 24 14:25:03 EST 2003


Nadav Har'El wrote on 2003-03-24:

> By the way, I hope that whatever new solution fribidi will have for
> line-breaking, it won't have to process an entire logical line at once,
> because that's inefficient with huge lines (not that's very useful... but
> still...).
>
If we *do* process entire logical lines, I think the following should
work:

1. Resolve the level of each character.
2. Break up into lines in logical order.
3. Reorder each line according to the levels.

Is it that simple or did I miss anything?
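
A minimal sketch of steps 2 and 3 above, assuming the levels from step 1
are already available as a list of integers.  The names `reorder_line`
and `wrap_and_reorder` are hypothetical (not fribidi API), and the
fixed-width break is only a stand-in for real line breaking.  The
reordering follows rule L2 of the Unicode bidi algorithm (reverse runs
from the highest level down to the lowest odd level).  It leaves out
rule L1, which resets the levels of trailing whitespace on each line, so
that is one thing the three-step list would still need.

```python
def reorder_line(chars, levels):
    """Return chars in visual order (rule L2 only, no whitespace reset)."""
    items = list(zip(chars, levels))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return list(chars)  # purely left-to-right line: nothing to reverse
    highest = max(levels)
    lowest_odd = min(odd_levels)
    # From the highest level down to the lowest odd level, reverse every
    # maximal run of characters whose level is at least that level.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(items):
            if items[i][1] >= level:
                j = i
                while j < len(items) and items[j][1] >= level:
                    j += 1
                items[i:j] = reversed(items[i:j])
                i = j
            else:
                i += 1
    return [c for c, _ in items]


def wrap_and_reorder(text, levels, width):
    """Break in logical order (naive fixed width), then reorder each line."""
    lines = []
    for start in range(0, len(text), width):
        seg_chars = text[start:start + width]
        seg_levels = levels[start:start + width]
        lines.append("".join(reorder_line(seg_chars, seg_levels)))
    return lines
```

With upper case standing in for right-to-left letters, the text
"abcDEFghi" with levels 0,0,0,1,1,1,0,0,0 and a width of 5 gives the
lines "abcED" and "Fghi".  Reordering the whole logical line first and
breaking afterwards would instead put "F" on the first line, which is
the wrong character to wrap.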

For processing huge lines (possible use case: consider an editor
working on a logical line that is so long that it doesn't fit the
screen) we only need to split step 1.  Perhaps this can be solved by
exposing a state structure (probably a stack of embedding levels?)
that can be passed from one line segment to the next.  How much
lookahead do we need to resolve the levels?  I'm afraid this can't
work at least because we need to guess the base direction.
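
To make the "state structure" idea concrete, here is a rough sketch
that covers only the explicit embedding marks (LRE, RLE, PDF).  The
class `EmbeddingState` is hypothetical, not anything fribidi exposes.
It assumes the base level is already known, and it ignores overrides,
isolates, the depth limit and all implicit (weak/neutral) resolution,
which is exactly where the lookahead question gets hard.

```python
LRE, RLE, PDF = "\u202a", "\u202b", "\u202c"


class EmbeddingState:
    """Stack of embedding levels that survives from one segment to the next."""

    def __init__(self, base_level=0):
        self.stack = [base_level]

    def feed(self, segment):
        """Return (char, explicit_level) pairs; the marks themselves are dropped."""
        out = []
        for ch in segment:
            current = self.stack[-1]
            if ch == RLE:
                # Next odd level above the current one.
                self.stack.append(current + 1 if current % 2 == 0 else current + 2)
            elif ch == LRE:
                # Next even level above the current one.
                self.stack.append(current + 2 if current % 2 == 0 else current + 1)
            elif ch == PDF:
                if len(self.stack) > 1:  # never pop the base level
                    self.stack.pop()
            else:
                out.append((ch, current))
        return out
```

Feeding a line in two pieces through one `EmbeddingState` gives the same
explicit levels as feeding it in one piece, because the only thing
carried across the split is the stack.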

Which reminds me, there should be a separate function exposed for just
auto-detecting the base direction.  This will allow the application to
intervene and supplement this auto-detection (currently I think it
sometimes involves re-running the whole log2vis process).  Custom
algorithms might benefit from a separate step that just detects the
character categories...
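
As an illustration of what such separate steps could look like, here is
a small sketch using Python's standard `unicodedata` module rather than
fribidi itself.  The function names `char_types` and `detect_base_level`
are made up for this example.  The detection uses the usual "first
strong character" rule and does not skip isolated runs, so treat it as
an approximation.

```python
import unicodedata


def char_types(text):
    """Step on its own: the bidi category of each character ('L', 'R', 'AL', 'EN', ...)."""
    return [unicodedata.bidirectional(ch) for ch in text]


def detect_base_level(text, default=0):
    """Return 0 (left-to-right) or 1 (right-to-left) from the first strong character.

    `default` is used when the text has no strong character at all, which is
    where an application might want to supply its own answer.
    """
    for bidi_type in char_types(text):
        if bidi_type == "L":
            return 0
        if bidi_type in ("R", "AL"):
            return 1
    return default
```

An application could call `detect_base_level` first, override the result
if it knows better (for example from the surrounding paragraph), and
only then run the level resolution with the chosen base direction.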

In general I think as many separate processing stages should be
exposed as possible.  For example, a program might understand some
"higher-level protocol" and want to express this knowledge to
fribidi.  I think that can be done by supplying initial levels for the
characters (and fribidi can increase them according to implicit rules
and explicit marks...).


Wait a moment!  The fribidi API was derived from a document describing
Mozilla's requirements, wasn't it?  Now Mozilla does line splitting,
with varying fonts and perhaps kerning complications, higher-level
protocol (span dir=...), Arabic shaping, etc. - a complete nightmare.
How do they do it?

[On varying fonts: I think that in some corner cases the points of
line-wrapping can't be determined until you know the visual order,
because of font/char-dependent spacing at segment boundaries;
TeX-style global line breaking with hyphenation probably introduces
even more complications...  I wonder whether anybody ever tackled this
and whether fribidi should take this into consideration.  I guess any
programmer would give up such precision in return for his sanity. :-]

-- 
Beni Cherniavsky <cben at tx.technion.ac.il>,
whose 12x CD burner works at 24x with cdrecord in linux - sheer magic!



