COMPOUND_TEXT versus UTF8_STRING
Jim.Gettys at hp.com
Wed Sep 22 17:23:20 EEST 2004
I'm widening this discussion to include a larger audience than just
X hackers; I suspect the toolkit related folks will have better
On Wed, 2004-09-22 at 14:26 +0100, Markus Kuhn wrote:
> Sebastien wrote on 2004-09-22 09:07 UTC:
> > Where can I find a converter function or library which supports the
> > following string conversions:
> > - COMPOUND_TEXT to local encoding (defined by $LANG)
> > - local encoding (defined by $LANG) to COMPOUND_TEXT
> > - COMPOUND_TEXT to UTF8
> > - UTF8 to COMPOUND_TEXT
> Which reminds me to bring up the underlying more fundamental question,
> namely the future of COMPOUND_TEXT.
> COMPOUND_TEXT is an implementation of ISO 2022, a horrendously complex
> and impractical way of switching between multiple character sets within
> the same string, that clearly failed in the marketplace, and is no
> longer used today except for some CJK email. Mule Emacs used something
> similar for a while, but they are now moving to UTF-8 as the sole
> internal encoding for Emacs 23. All major web browsers have done the
> same long ago.
> COMPOUND_TEXT is in my opinion obsolete, and we should start thinking
> about a way to smoothly deprecate it from the standard, and make the way
> free for universally replacing it with the so much simpler and more
> practical UTF8_STRING. ISO 2022 is dead, and so should COMPOUND_TEXT be.
Seems like it would be good to start deprecating it.
> At present, UTF8_STRING is allocated in the X.Org registry, but none of the
> X Standards mention it yet. Some start was made a while ago in this direction
> in XFree86, most notably
> and it would be nice to see this taken up in the X11 standards.
> In particular, one question I am interested in is:
> Can we simply allow the use of UTF8_STRING in properties such as
> WM_NAME, WM_ICON_NAME and WM_CLIENT_MACHINE in a future version of the
> Or is it necessary to introduce separate properties along the lines of
> the _NET_WM_NAME, _NET_WM_ICON_NAME, etc. suggested in
> for reasonable backwards compatibility? What is the best practice here?
The issue would be interoperability. Is anything likely to break?
If not, we should relax the spec; if so, I suspect we should
start adding new properties.
Does anyone understand this well enough to comment definitively?
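
As a rough illustration of the two options, here is a toy model, not real Xlib code. A window's properties are represented as a plain dict of name -> (type atom, raw bytes); window_title() and legacy_window_title() are hypothetical helpers invented for this sketch. It assumes STRING means ISO Latin-1 and that _NET_WM_NAME carries UTF8_STRING, and it deliberately does not attempt to decode COMPOUND_TEXT.

```python
def window_title(props):
    """Newer reader: prefer _NET_WM_NAME, fall back to WM_NAME."""
    for name in ("_NET_WM_NAME", "WM_NAME"):
        if name not in props:
            continue
        type_atom, raw = props[name]
        if type_atom == "UTF8_STRING":
            return raw.decode("utf-8")
        if type_atom == "STRING":
            return raw.decode("latin-1")
        # COMPOUND_TEXT or an unknown type: not handled in this sketch.
    return None

def legacy_window_title(props):
    """Reader written before UTF8_STRING existed: understands STRING only."""
    type_atom, raw = props.get("WM_NAME", (None, b""))
    return raw.decode("latin-1") if type_atom == "STRING" else None

# Option 1: relax the spec and put UTF8_STRING directly into WM_NAME.
relaxed = {"WM_NAME": ("UTF8_STRING", "Grüße".encode("utf-8"))}

# Option 2: keep WM_NAME as before and add a separate UTF-8 property.
separate = {
    "_NET_WM_NAME": ("UTF8_STRING", "日本".encode("utf-8")),
    "WM_NAME": ("STRING", b"title"),
}

for label, props in (("relaxed", relaxed), ("separate", separate)):
    print(label, window_title(props), legacy_window_title(props))
```

In this model the legacy reader gets nothing from the relaxed WM_NAME but still gets a usable title when a separate property is added. Whether real window managers and toolkits behave like the legacy reader is exactly the open question.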