[HarfBuzz] vertical text for RTL scripts?

Eli Zaretskii eliz at gnu.org
Mon Jul 13 16:45:48 UTC 2020


> Cc: harfbuzz at lists.freedesktop.org
> From: Phil M Perry <philperry at hvc.rr.com>
> Date: Mon, 13 Jul 2020 11:11:51 -0400
> 
> Eli, I realize that (except for Chinese, Japanese, and possibly Korean), 
> text is normally written horizontally (LTR or RTL). Vertical text is for 
> special uses such as signage and advertising.

Yes, I understand that, and was replying with that in mind.

> Anyway, I'm still not sure what the convention is for writing vertical 
> text in RTL languages. There's not much discussion of this online, 
> except for "I want to get a Hebrew tattoo down my spine saying 'daughter 
> of Jehovah' -- which way will read correctly?" The convention for LTR 
> scripts is to start at the top and grow downwards, which is like taking 
> the original LTR coordinate system and rotating it 90 degrees clockwise 
> (with individual letters rotated back). The next line (column) is to the 
> LEFT.

Yes, agreed.

> For RTL, my sources suggest that the last letter input (first one 
> read)

This is fundamentally incorrect: both input from keyboard and reading
are done in the same order, even for RTL languages.  The only order
which is reversed for RTL languages is the left-to-right order on
display: the first RTL letter read is generally the rightmost, unlike
with LTR scripts.

I think the above observation is important, because I'm guessing it is
the basis of your confusion regarding the vertical layout.  In the
vertical layout, the left vs right issue no longer exists (at least as
long as we are talking about a single column), so the distinction
between LTR and RTL scripts also disappears.

Therefore:

> should be at the TOP of the text column, which means rotating the 
> original horizontal coordinate system 90 degrees COUNTERclockwise. For 
> TTB of a RTL script, it is like a clockwise rotation, with the first 
> input letter at the top, but reading from the bottom/original right. 

No, the first input letter is at the top, and the first one you read
is also at the top.

> Embedded LTR text is read TTB. For BTT, it is like a COUNTERclockwise 
> rotation, with the first input letter at the bottom, reading from 
> top/original right. Unfortunately, this leaves embedded LTR text 
> backwards from what would be expected

No.  Embedded LTR text will also be laid out TTB, i.e. without
reordering it.

In short, in vertical layout there's no bidi reordering at all: both
LTR and RTL characters are displayed in the logical order, top to
bottom.  Technically, I think this happens because bidi reordering per
UAX#9 is applied to each line separately, so when each character sits
on its own line, reordering has no effect.
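
To make the point above concrete, here is a minimal Python sketch
under a deliberately simplified model.  It is not UAX#9 and not
HarfBuzz code; the helper names vertical_column and
horizontal_rtl_line are hypothetical, and the horizontal helper only
handles a purely RTL string.  It shows that a single vertical column
keeps the logical order, while a horizontal RTL line puts the first
letter read at the right.

```python
import unicodedata


def is_rtl(ch):
    # Strong right-to-left bidi classes: R (e.g. Hebrew) and AL (Arabic letters).
    return unicodedata.bidirectional(ch) in ("R", "AL")


def vertical_column(text):
    """Return (row, char) pairs for one top-to-bottom column.

    Row 0 is the top.  Characters stay in logical (typed and read) order
    whatever their bidi class: no reordering is applied.
    """
    return list(enumerate(text))


def horizontal_rtl_line(text):
    """Toy horizontal layout of a purely RTL string.

    Returns the characters in left-to-right screen order, which for a
    pure RTL string is simply the logical order reversed.
    """
    if not all(is_rtl(ch) for ch in text):
        raise ValueError("toy model: pure RTL input only")
    return list(reversed(text))


# Hebrew letters shin, lamed, vav, final mem, in logical order.
hebrew = "\u05e9\u05dc\u05d5\u05dd"
# Same word followed by embedded LTR (Latin) letters.
mixed = hebrew + "ab"

column = vertical_column(mixed)      # shin at row 0, "b" at the bottom row
line = horizontal_rtl_line(hebrew)   # shin ends up rightmost on screen
```

In this model the embedded Latin letters simply follow the Hebrew ones
down the column, in the order they were typed.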

> Also, for BTT, is it correct that the next line (column) is to the
> RIGHT?

Yes, I believe the columns should progress from right to left for
RTL text.  This leaves aside the base paragraph direction, which your
description does not mention; I am assuming you are talking about RTL
text in a right-to-left paragraph and LTR text in a left-to-right
paragraph, not the other way around.
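
As a rough illustration of column progression, the following Python
sketch computes the left edge of each column on a page.  The function
name column_x_positions, its parameters, and the page-coordinate model
(x grows to the right, starting at 0) are all assumptions made for
this example; it says nothing about how HarfBuzz or any layout engine
actually places columns.

```python
def column_x_positions(n_columns, column_width, page_width, right_to_left=True):
    """Return the x coordinate of the left edge of each column, in
    reading order (first column read comes first in the list).

    With right_to_left=True the first column sits at the right edge of
    the page and each following column is placed to its left.
    """
    if n_columns * column_width > page_width:
        raise ValueError("columns do not fit on the page")
    if right_to_left:
        return [page_width - (i + 1) * column_width for i in range(n_columns)]
    return [i * column_width for i in range(n_columns)]


rtl_columns = column_x_positions(3, 10, 100)
ltr_columns = column_x_positions(3, 10, 100, right_to_left=False)
```

Here rtl_columns starts at the right edge of the page and moves left,
while ltr_columns starts at the left edge and moves right.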

> Finally, I tried some English (LTR Latin) text vertically with "field" 
> in it, WITHOUT explicitly turning off ligatures (-liga), and it kept the 
> "f" and "i" separate (good)... does this mean that HarfBuzz officially 
> knows not to do ligatures with vertical text? Kerning doesn't appear to 
> be a problem, either.

That's something for the HarfBuzz experts here to answer; I'm not such
an expert.

HTH

