[HarfBuzz] vertical text for RTL scripts?

Phil M Perry philperry at hvc.rr.com
Sun Jul 12 14:15:31 UTC 2020


(A week ago I sent an earlier version of this query, as an unregistered 
user, and it doesn't seem to have shown up. I apologize if this is a 
duplicate. I'm registered to receive the digest now.)

I have been using the Perl library HarfBuzz::Shaper to provide 
HarfBuzz-based complex script support for PDF::Builder (PDF creation 
library). LTR Latin script (English) and CJK script (Chinese) work as 
expected when rendered vertically (TTB or BTT). I'm not sure, however, 
how RTL scripts such as Hebrew and Arabic are supposed to be rendered 
vertically. As I'm not sure that this mailing list will handle non-ASCII 
text, just pretend that "lowercase text" is Hebrew and "UPPERCASE TEXT" 
is English (Latin script, LTR). Let's say I have some INPUT order text 
"abc defg HOT L BALTIMORE hijkl". It would of course be handled as three 
separate calls to HarfBuzz::Shaper. For horizontal rendering, it comes 
out "lkjih HOT L BALTIMORE gfed cba" (bidirectional), as expected, 
starting at the right and moving leftwards, and the next text will be to 
the left of "l".
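
The horizontal result described above can be reproduced with a small sketch. This is plain Python, not HarfBuzz::Shaper or HarfBuzz itself, and it is a deliberately simplified stand-in for the Unicode bidi algorithm: it assumes lowercase means RTL, uppercase means LTR, the paragraph direction is RTL, and a space takes the direction of its neighbours when they agree and the paragraph direction otherwise. The function names split_runs and visual_order are hypothetical.

```python
from itertools import groupby


def split_runs(text):
    """Split pseudo-bidi text into (direction, substring) runs."""
    # Lowercase stands in for Hebrew (RTL), uppercase for Latin (LTR);
    # anything else (spaces here) is neutral and resolved below.
    strong = ["R" if c.islower() else "L" if c.isupper() else None for c in text]
    resolved = list(strong)
    for i, kind in enumerate(strong):
        if kind is None:
            # Nearest strong character on each side; default to the RTL paragraph.
            before = next((k for k in reversed(strong[:i]) if k), "R")
            after = next((k for k in strong[i + 1:] if k), "R")
            resolved[i] = before if before == after else "R"
    runs = []
    pos = 0
    for direction, group in groupby(resolved):
        length = len(list(group))
        runs.append((direction, text[pos:pos + length]))
        pos += length
    return runs


def visual_order(text):
    """Left-to-right visual string for an RTL paragraph."""
    # Runs are laid out right to left, and each RTL run is itself reversed.
    return "".join(s[::-1] if d == "R" else s for d, s in reversed(split_runs(text)))


print(visual_order("abc defg HOT L BALTIMORE hijkl"))
```

Under these assumptions the input splits into three runs, which matches the three separate shaper calls mentioned above.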

Now, if I specify TTB direction, what should I see? Likewise, what 
should BTT direction show? I know very little about RTL/bidi scripts, 
and googling for examples gives ambiguous and conflicting information. I 
realize that most scripts and languages are rarely written vertically, 
except for East Asian (CJK) languages, but it would be nice to know that 
the code is handling them correctly.

If you want to write Hebrew vertically, would you choose TTB or BTT? 
That is, would you start (x,y) at the top and grow downwards, or at the 
bottom and grow upwards? The examples I've seen suggest that the 
rendering would be "lkjih HOT L BALTIMORE gfed cba" from top to bottom, 
but would the "next write" position be at the top, and would you start 
at the bottom of the page? From what experimenting I've done, it looks 
like TTB and BTT directions may simply override whatever the script 
wants to do naturally, and for TTB I only accidentally get "lkjih HOT L 
BALTIMORE gfed cba" (with "l" at the top, and growing downwards!!).
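
To make the "next write" question concrete, here is a minimal Python sketch of how a pen position moves when glyphs are placed by accumulating per-glyph advances. It does not call HarfBuzz; the function name place_glyphs and the advance values are hypothetical, and it assumes a y-up coordinate system (as in PDF), so a downward step is a negative vertical advance.

```python
def place_glyphs(start, advances):
    """Return each glyph's origin and the pen position after the last glyph."""
    x, y = start
    origins = []
    for dx, dy in advances:
        origins.append((x, y))  # the glyph is drawn at the current pen position
        x += dx                 # then the pen moves by that glyph's advance
        y += dy
    return origins, (x, y)


# Hypothetical data: three glyphs, each stepping 12 units down the page.
origins, next_write = place_glyphs((100, 700), [(0, -12)] * 3)
print(origins)
print(next_write)
```

In this model the "next write" position is simply wherever the accumulated advances leave the pen, so whether text grows up or down depends on the sign of the vertical advances and not on where the starting point was chosen.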

I'm even less familiar with Arabic family scripts, but my assumption 
would be that they follow the same rules regarding direction as Hebrew 
would, and would have only the standalone form of character glyphs 
(unconnected and no ligatures). I have no idea what a strongly connected 
script like Devanagari should do when written vertically.

A related question: when writing vertically, do I need to explicitly 
turn off ligatures, kerning, and anything else (-liga, -kern) or does 
HarfBuzz know to do that automatically?


