[Bidi] Internationalized Resource Identifiers (IRI)

Beni Cherniavsky cben at users.sf.net
Fri Aug 20 07:31:29 PDT 2004


Behnam Esfahbod wrote:
> Hi list,
> 
> You may be interested in bidi-related issues in this page: 
> http://www.w3.org/International/iri-edit/

From that page:

> Bidirectional IRIs MUST be rendered in the same way as they would be 
> rendered if they were in an left-to-right embedding
And how do they suggest handling an RTL sentence that contains an IRI?
Sure, an embedding is needed, but frequently it will not be there.  In
practice IRIs *are* going to be displayed RTL from time to time.  I hope
they are aware that the security issues resulting from this are going to
happen.
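
To make "an embedding is needed" concrete, here is a minimal sketch of
what a careful author or program would have to do in plain text: wrap
the IRI in LRE ... PDF.  The helper name ``embed_ltr`` and the example
IRI are made up for illustration; only the two control characters come
from Unicode.

```python
# Explicit bidi formatting characters defined by Unicode.
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed_ltr(iri: str) -> str:
    """Wrap an IRI so that it is laid out inside an LTR embedding.

    Hypothetical helper: nothing forces authors to call anything like
    this, which is exactly the problem.
    """
    return f"{LRE}{iri}{PDF}"


# An RTL sentence (uppercase stands in for RTL letters) with an IRI in it.
sentence = "SEE " + embed_ltr("http://example.org/foo") + " FOR DETAILS"
```

Without the wrapper the string is still perfectly valid text, so
nothing stops the IRI from being laid out as part of the RTL sentence.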

> The Unicode Bidirectional Algorithm ([UNI9]Davis, M., The
> Bidirectional Algorithm, March 2004., Section 4.3) permits
> higher-level protocols to influence bidirectional rendering. Such
> changes by higher-level protocols MUST NOT be used if they change the
> rendering of IRIs.
I don't think this clause is useful or implementable.  Use cases for 
changing IRI rendering [with higher-level protocols]:

- Texts like their Bidi Examples page, which deliberately change the
   rendering of IRIs in order to *talk about* IRI bidi.  It is useful to
   always be able to override bidi defaults, even sensible ones.  That
   page uses <span dir="..."> and <bdo dir="...">.  It could have used
   Unicode control chars if it were plain text.  In that case this
   clause wouldn't apply and the chars would be permitted to change the
   rendering -- why the inconsistency?!?

   - Conversion of a text document to a higher-level format would (if
     done properly) turn Unicode control codes into higher-level codes.
     Why is this supposed

- Forcing path components to LTR order.  The fact that
   ``foo/BAR/BAZ/quux`` is rendered as ``foo/ZAB/RAB/quux`` [1]_
   is an unfortunate consequence of the UBA.  I don't think that
   anyone should be prohibited from fixing the display (in a specific
   document, in a program for all IRIs it displays, or in a future
   version of this standard...).
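
The path example can be played with in a toy model.  This is only a
sketch, not the UBA: it assumes an LTR paragraph, treats uppercase as
RTL and lowercase as LTR, and lets non-alphanumeric neutrals such as
the slash join an RTL run when they sit between two RTL letters.  Both
function names are made up, and the per-component variant is just one
possible "fix", not anything the draft proposes.

```python
import re

# An RTL run in the toy model: uppercase letters, optionally glued
# together by neutrals (anything that is not a letter or a digit).
_RTL_RUN = re.compile(r"[A-Z]+(?:[^A-Za-z0-9]+[A-Z]+)*")


def toy_display(logical: str) -> str:
    """Reverse every RTL run, as an LTR paragraph would display it."""
    return _RTL_RUN.sub(lambda m: m.group(0)[::-1], logical)


def toy_display_per_component(logical: str, sep: str = "/") -> str:
    """Hypothetical fix: reorder each path component separately, so the
    components themselves stay in LTR order."""
    return sep.join(toy_display(part) for part in logical.split(sep))


print(toy_display("foo/BAR/BAZ/quux"))                # foo/ZAB/RAB/quux
print(toy_display_per_component("foo/BAR/BAZ/quux"))  # foo/RAB/ZAB/quux
```

In the first output the two RTL components swap places because the
slash between them is pulled into the RTL run; in the second each
component is reversed on its own and stays where the path put it.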

I don't think that disallowing overriding for IRIs (or anything) is 
useful.  I don't think that disallowing

.. [1] Things would be a lot clearer if the slash (^H^H SOLIDUS ;-))
    were a mirrored character.  It is true that the current tradition
    for Hebrew (that's what I know) does not reverse the slash, but I
    think this tradition is broken ;-).  ``foo/ZAB\RAB/quux`` would be
    much less ambiguous.  So would division: ``OITAR ESION\LANGIS``.
    The only missing part would be a way to mirror the minus sign ;-).

-- 
Beni Cherniavsky <cben at users.sf.net>
Note: I can only read email on week-ends...

