COMPOUND_TEXT versus UTF8_STRING

Jim Gettys Jim.Gettys at hp.com
Wed Sep 22 07:23:20 PDT 2004


I'm widening this discussion to include a larger audience than just
X hackers; I suspect the toolkit-related folks will have better
knowledge here.

On Wed, 2004-09-22 at 14:26 +0100, Markus Kuhn wrote:
> Sebastien wrote on 2004-09-22 09:07 UTC:
> > Where can I find a converter function or library which supports the
> > following string conversions:
> > - COMPOUND_TEXT to local encoding (defined by $LANG)
> > - local encoding (defined by $LANG) to COMPOUND_TEXT
> > - COMPOUND_TEXT to UTF8
> > - UTF8 to COMPOUND_TEXT
> 
> Which reminds me to bring up the underlying more fundamental question,
> namely the future of COMPOUND_TEXT.
> 
> COMPOUND_TEXT is an implementation of ISO 2022, a horrendously complex
> and impractical way of switching between multiple character sets within
> the same string, that clearly failed in the marketplace, and is no
> longer used today except for some CJK email. Mule Emacs used something
> similar for a while, but they are now moving to UTF-8 as the sole
> internal encoding for Emacs 23. All major web browsers have done the
> same long ago.
> 
> COMPOUND_TEXT is in my opinion obsolete, and we should start thinking
> about a way to smoothly deprecate it from the standard, and clear the
> way for universally replacing it with the much simpler and more
> practical UTF8_STRING. ISO 2022 is dead, and so should COMPOUND_TEXT be.

Seems like it would be good to start deprecating it.

> 
> At present, UTF8_STRING is allocated in the X.Org registry, but none of the
> X Standards mention it yet. Some start was made a while ago in this direction
> in XFree86, most notably
> 
>   http://www.pps.jussieu.fr/~jch/software/UTF8_STRING/
>   http://www.cl.cam.ac.uk/~mgk25/unicode.html#x11
> 
> and it would be nice to see this taken up in the X11 standards.
> 
> In particular, one question I am interested in is:
> 
> Can we simply allow the use of UTF8_STRING in properties such as
> WM_NAME, WM_ICON_NAME and WM_CLIENT_MACHINE in a future version of the
> ICCCM?
> 
> Or is it necessary to introduce separate properties along the lines of
> the _NET_WM_NAME, _NET_WM_ICON_NAME, etc. suggested in
> 
>   http://freedesktop.org/Standards/wm-spec/1.3/ar01s05.html
> 
> for reasonable backwards compatibility? What is the best practice here?
> 
> https://freedesktop.org/bugzilla/show_bug.cgi?id=271

> Markus
> 
The issue would be interoperability.  Is anything likely to break if
existing clients start seeing UTF8_STRING in these properties?  If not,
we should relax the spec; if so, I suspect we should start adding new
properties.
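
To make the interoperability question concrete, here is a rough sketch of
how a tolerant reader might pick a window title. It is not taken from any
toolkit or window manager: the helper name window_title and the dictionary
model of window properties are hypothetical, and the COMPOUND_TEXT branch
assumes that a string with no escape sequences can be read as ISO 8859-1.
Anything that needs real ISO 2022 parsing is simply skipped.

```python
# Hypothetical model: props maps a property name to (type_atom, raw_bytes).
# No X connection is involved; this only illustrates the fallback order.

def window_title(props):
    """Return a title string, preferring the UTF-8 property if present."""
    for name in ("_NET_WM_NAME", "WM_NAME"):
        if name not in props:
            continue
        type_atom, data = props[name]
        if type_atom == "UTF8_STRING":
            return data.decode("utf-8", errors="replace")
        if type_atom == "STRING":
            # ICCCM STRING is ISO Latin-1 text.
            return data.decode("latin-1")
        if type_atom == "COMPOUND_TEXT" and b"\x1b" not in data:
            # Assumption: with no escape sequences, treat as ISO 8859-1.
            return data.decode("latin-1")
        # Unknown type, or COMPOUND_TEXT needing a real decoder: try the next one.
    return None
```

A client written before UTF8_STRING existed has only the STRING and
COMPOUND_TEXT branches, so it would fall through on a WM_NAME of type
UTF8_STRING. That is the breakage the separate _NET_WM_NAME properties
avoid.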

Anyone have enough understanding to comment pretty definitively?
				- Jim




