[HarfBuzz] vertical text for RTL scripts?
Phil M Perry
philperry at hvc.rr.com
Sun Jul 12 14:15:31 UTC 2020
(A week ago I sent an earlier version of this query, as an unregistered
user, and it doesn't seem to have shown up. I apologize if this is a
duplicate. I'm registered to receive the digest now.)
I have been using the Perl library HarfBuzz::Shaper to provide
HarfBuzz-based complex script support for PDF::Builder (a PDF creation
library). LTR Latin script (English) and CJK script (Chinese) work as
expected when rendered vertically (TTB or BTT). I'm not sure, however,
how RTL scripts such as Hebrew and Arabic are supposed to be rendered
vertically. As I'm not sure that this mailing list will handle non-ASCII
text, just pretend that "lowercase text" is Hebrew and "UPPERCASE TEXT"
is English (Latin script, LTR). Let's say I have some text in input
(logical) order: "abc defg HOT L BALTIMORE hijkl". It would of course be
shaped in three separate calls to HarfBuzz::Shaper, one per
directional run. For horizontal rendering, it comes
out "lkjih HOT L BALTIMORE gfed cba" (bidirectional), as expected,
starting at the right and moving leftwards, and the next text will be to
the left of "l".
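
The horizontal result above can be reproduced with a small sketch. It is only a toy model of the post's convention (lowercase = RTL "Hebrew", uppercase = LTR English, RTL paragraph, a single embedding level), not the full Unicode Bidirectional Algorithm and not anything HarfBuzz does itself. The function name `visual_order` and the rule used for spaces are assumptions made for illustration.

```python
from itertools import groupby


def visual_order(logical, base="R"):
    """Reorder logical-order text into left-to-right visual order.

    Toy convention from the post: lowercase letters stand in for RTL
    text, uppercase letters for LTR text. Everything else (spaces,
    digits, punctuation) is treated as neutral. This is a simplified
    single-level model, not UAX #9.
    """
    kinds = ["R" if c.islower() else "L" if c.isupper() else None
             for c in logical]

    # Resolve neutrals: take the direction of the surrounding strong
    # characters when they agree, otherwise the paragraph direction.
    resolved = []
    for i, kind in enumerate(kinds):
        if kind is None:
            before = next((k for k in reversed(kinds[:i]) if k), base)
            after = next((k for k in kinds[i + 1:] if k), base)
            kind = before if before == after else base
        resolved.append(kind)

    # Split the text into directional runs (the pieces that would be
    # shaped separately).
    runs = []
    pos = 0
    for direction, group in groupby(resolved):
        length = len(list(group))
        runs.append((direction, logical[pos:pos + length]))
        pos += length

    # In an RTL paragraph the first run ends up rightmost.
    if base == "R":
        runs.reverse()

    # Characters inside an RTL run are mirrored; LTR runs are kept.
    return "".join(text[::-1] if direction == "R" else text
                   for direction, text in runs)


print(visual_order("abc defg HOT L BALTIMORE hijkl"))
```

The three runs this produces ("abc defg ", "HOT L BALTIMORE", " hijkl") correspond to the three shaping calls mentioned above.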
Now, if I specify TTB direction, what should I see? Likewise, what
should BTT direction show? I know very little about RTL/bidi scripts,
and googling for examples gives ambiguous and conflicting information. I
realize that most scripts and languages are rarely written vertically,
except for East Asian (CJK) languages, but it would be nice to know that
the code is handling them correctly.
If you want to write Hebrew vertically, would you choose TTB or BTT?
That is, would you start (x,y) at the top and grow downwards, or at the
bottom and grow upwards? The examples I've seen suggest that the
rendering would be "lkjih HOT L BALTIMORE gfed cba" from top to bottom,
but would the "next write" position be at the top, and would you start
at the bottom of the page? From the experimenting I've done, it looks
like TTB and BTT directions may simply override whatever the script
wants to do naturally, and for TTB I only accidentally get "lkjih HOT L
BALTIMORE gfed cba" (with "l" at the top, and growing downwards!!).
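
To make the two candidate layouts concrete, here is a minimal geometric sketch. It assumes the glyphs arrive in drawing order with one uniform advance each, and that y grows upward as in PDF user space. `column_positions` is a hypothetical helper, and nothing here describes what HarfBuzz actually returns for either direction.

```python
def column_positions(glyphs, direction, start_y=0.0, advance=1.0):
    """Place glyphs along a vertical line and report the next pen position.

    Assumes y grows upward (as in PDF user space) and that every glyph
    has the same advance. TTB moves the pen down after each glyph,
    BTT moves it up.
    """
    if direction not in ("TTB", "BTT"):
        raise ValueError("direction must be 'TTB' or 'BTT'")
    step = -advance if direction == "TTB" else advance

    placed = []
    y = start_y
    for glyph in glyphs:
        placed.append((glyph, y))  # the glyph is drawn at the current pen
        y += step                  # then the pen advances
    return placed, y               # y is where the next write would start


# Same glyph sequence, two directions: only the starting point and the
# sign of the step differ.
ttb, next_ttb = column_positions("lkjih", "TTB", start_y=10.0)
btt, next_btt = column_positions("lkjih", "BTT", start_y=0.0)
print(ttb, next_ttb)
print(btt, next_btt)
```

Under these assumptions, TTB started at the top puts "l" uppermost with the next write position below the text, while BTT started at the bottom puts "l" lowest with the next write position above it. Which of these a reader of Hebrew would expect is exactly the open question.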
I'm even less familiar with Arabic family scripts, but my assumption
would be that they follow the same rules regarding direction as Hebrew
would, and would have only the standalone form of character glyphs
(unconnected and no ligatures). I have no idea what a strongly connected
script like Devanagari should do when written vertically.
A related question: when writing vertically, do I need to explicitly
turn off ligatures, kerning, and anything else (-liga, -kern) or does
HarfBuzz know to do that automatically?