[hackers-il] [Fwd: [bidi] Re: math in arabic -- w3c mathml group looking for contributions (fwd)]

Beni Cherniavsky cben at users.sf.net
Fri Nov 5 03:01:16 PST 2004


Omer Zak wrote:
> In the case of Hebrew, the answer, which comes to my mind, is to treat 
> all formulae like LTR text embedded into whatever text it finds itself in.
> 
> Now, are there practicing mathematicians in Hackers-IL, who can tell us 
> whether the above answer is satisfactory.  Or, whether more complicated 
> rules are needed (to cover pathological cases)?  This is the relevance 
> to Hackers-IL.
 >
If the formula contains no strong RTL chars, it should definitely be 
wholly LTR.
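
A quick way to test that condition, as a minimal sketch: it assumes "strong RTL" means Unicode bidi class R or AL, and the helper name has_strong_rtl is made up for illustration.

```python
import unicodedata

# Bidi classes counted as strong RTL here: "R" (e.g. Hebrew letters)
# and "AL" (Arabic letters).  Explicit embedding/override controls
# are deliberately not counted.
STRONG_RTL = {"R", "AL"}


def has_strong_rtl(text: str) -> bool:
    """Return True if any character in `text` has a strong RTL bidi class."""
    return any(unicodedata.bidirectional(ch) in STRONG_RTL for ch in text)


# A formula with only Latin letters, digits and operators has none.
print(has_strong_rtl("x^2 + y^2 = 1"))
# A subscript written with Hebrew letters (alef, bet) does.
print(has_strong_rtl("n_{\u05d0\u05d1}"))
```

A formula for which this returns False can be laid out wholly LTR without running any bidi logic at all.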

Sometimes you have Hebrew words as/in names of values inside formulae:

     $n_{FOO} - n_{BAR}$

     <msub><mi>n</mi> <mrow><mi>FOO</mi></mrow></msub>
     <mo>-</mo>
     <msub><mi>n</mi> <mrow><mi>BAR</mi></mrow></msub>

(itex2MML did the dirty job ;) should render as::

     n    - n
      OOF    RAB

which would be handled by the UBA just fine ("FOO" and "BAR" implicitly 
get a deeper level).

When you have no LTR chars between such names the UBA might get confused:

     $A - B$

     <mi>A</mi> <mo>-</mo> <mi>B</mi>

should render as::

     A - B

but the UBA would implicitly make it all RTL which would result in::

     B - A

if we are not careful.  Conclusion: the UBA must not be applied to 
<mrow>s, only to selected elements like <mi> and some more.  Which is 
precisely what MathML already says__:

> For MathML token elements that can contain text (mtext, mo, mi, mn
> and ms), the implicit part of the Unicode bidirectional algorithm
> [Bidi] is applied when its content is rendered visually (i.e.
> characters are reordered based on character properties). The base
> directionality is left-to-right.

__ http://www.w3.org/TR/2003/REC-MathML2-20031021/
    chapter3.html#presm.bidi
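
The difference between reordering per token and reordering a whole row can be sketched in Python.  This is a toy model, not the real UBA: uppercase letters stand in for Hebrew (the same convention as the examples above), neutrals between two RTL characters join the RTL run, and every RTL run is simply reversed.  The names reorder_ltr_base and render_row are hypothetical.

```python
import xml.etree.ElementTree as ET

# Token elements to which MathML applies the implicit bidi algorithm.
TOKENS = {"mtext", "mo", "mi", "mn", "ms"}


def reorder_ltr_base(text):
    """Toy implicit reordering with a left-to-right base direction.

    Assumption: uppercase = strong RTL, other alphanumerics = LTR,
    everything else = neutral.  This is NOT a full UBA implementation.
    """
    classes = ["R" if c.isupper() else "L" if c.isalnum() else "N"
               for c in text]
    n = len(text)
    # Resolve neutrals: a neutral run between two RTL characters becomes
    # RTL, any other neutral run takes the (LTR) base direction.
    i = 0
    while i < n:
        if classes[i] != "N":
            i += 1
            continue
        j = i
        while j < n and classes[j] == "N":
            j += 1
        before = classes[i - 1] if i > 0 else "L"
        after = classes[j] if j < n else "L"
        fill = "R" if before == after == "R" else "L"
        classes[i:j] = [fill] * (j - i)
        i = j
    # Reverse each maximal RTL run, keep LTR runs as they are.
    out = []
    i = 0
    while i < n:
        j = i
        while j < n and classes[j] == classes[i]:
            j += 1
        run = text[i:j]
        out.append(run[::-1] if classes[i] == "R" else run)
        i = j
    return "".join(out)


def render_row(elem):
    """Reorder inside token elements only; children stay in document order."""
    if elem.tag in TOKENS:
        return reorder_ltr_base(elem.text or "")
    return " ".join(render_row(child) for child in elem)


row = ET.fromstring("<mrow><mi>A</mi> <mo>-</mo> <mi>B</mi></mrow>")
print(render_row(row))                             # per-token reordering
print(reorder_ltr_base("".join(row.itertext())))   # whole row flattened
```

Flattening the row first lets the two names and the minus sign merge into a single RTL run, which is exactly the swap described above; restricting the reordering to tokens keeps the operands where the formula put them.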

Sometimes you need to embed text inside math (perhaps nested):

     $\{n^2 | \textrm{$n$ IS PRIME}\}$

     (itex2MML chokes, MathML is indecisive on how to do this but
     probably something like

     <mathml>...
       <mtext><mathml><mi>n</mi></mathml> IS PRIME</mtext>
     ...</mathml>

     would be right)

should render as::

       2
     {n  | EMIRP SI n}

which means the embedded text should be considered RTL.  It would be 
most convenient if embedded text defaulted to the direction of the text 
outside the containing formula, but this might be awkward in an XML world. 
Requiring such text to be marked explicitly with dir=RTL would be good 
enough, provided that is good enough in all other cases where we embed 
RTL text.  Note that this makes::

     <mathml>...
       <mi>n</mi> <mtext>IS PRIME</mtext>
     ...</mathml>

inappropriate (because it leaves no way to put $n$ at the right side of 
the sentence) and using <mo> for text::

     <mathml>...
       <mi>n</mi> <mo>IS PRIME</mo>
     ...</mathml>

which is sometimes recommended__ by MathML (if one agrees that "IS 
PRIME" is a sort of operator) would be even worse.

__ http://www.w3.org/TR/2003/REC-MathML2-20031021/
    chapter3.html#presm.mixtextmath

Note that nesting of text inside math inside text requires hierarchical 
application of bidi.  You can't say "the whole paragraph is reordered" 
because math can't be described by flat order.  Instead you say "the 
paragraph is reordered keeping the formula as a black box but some parts 
of the formula are recursively reordered".  Which is saner anyway.
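
The black-box idea can be sketched as a small recursive renderer.  Again a toy model under stated assumptions: uppercase stands in for Hebrew, reversing a string stands in for running the UBA on a pure-RTL run, and the Math and Text classes and render_nested are hypothetical names, not anything MathML defines.

```python
from dataclasses import dataclass


@dataclass
class Math:
    parts: list        # str tokens or nested Text; always laid out LTR


@dataclass
class Text:
    parts: list        # str runs or nested Math
    rtl: bool = True   # assumption: direction is marked explicitly


def render_nested(node):
    """Return the visual order of `node`, treating each formula as a box."""
    if isinstance(node, str):
        return node
    if isinstance(node, Math) or not node.rtl:
        # Formulas and LTR text keep their parts in logical order.
        return "".join(render_nested(p) for p in node.parts)
    # RTL text: reverse the characters of plain runs, render formulas
    # recursively as atomic units, then reverse the order of the units.
    units = [p[::-1] if isinstance(p, str) else render_nested(p)
             for p in node.parts]
    return "".join(reversed(units))


formula = Math(["{n^2 | ", Text([Math(["n"]), " IS PRIME"]), "}"])
paragraph = Text(["THE SET ", formula, " IS INFINITE"])

print(render_nested(formula))
print(render_nested(paragraph))
```

The formula keeps its own left-to-right layout wherever it lands in the paragraph, while the text nested inside it is reordered independently, which a single flat pass over the whole paragraph cannot do.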

-- 
Beni Cherniavsky (who can only read email on week-ends)

python2.4 -m this

