COMPOUND_TEXT versus UTF8_STRING
Markus.Kuhn at cl.cam.ac.uk
Wed Sep 22 06:26:56 PDT 2004
Sebastien wrote on 2004-09-22 09:07 UTC:
> Where can I find a converter function or library which supports the
> following string conversions:
> - COMPOUND_TEXT to local encoding (defined by $LANG)
> - local encoding (defined by $LANG) to COMPOUND_TEXT
> - COMPOUND_TEXT to UTF8
> - UTF8 to COMPOUND_TEXT
Which reminds me to bring up the more fundamental underlying question,
namely the future of COMPOUND_TEXT.
COMPOUND_TEXT is an implementation of ISO 2022, a horrendously complex
and impractical way of switching between multiple character sets within
the same string. It clearly failed in the marketplace and is no longer
used today except for some CJK email. Mule Emacs used something
similar for a while, but they are now moving to UTF-8 as the sole
internal encoding for Emacs 23. All major web browsers did the
same long ago.
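To make the "switching between character sets within the same string" point concrete, here is a minimal sketch in Python. It uses ISO-2022-JP (the CJK email encoding mentioned above) as a stand-in, because Python ships a codec for it; COMPOUND_TEXT is a different ISO 2022 profile with its own escape sequences, so this shows only the general mechanism. The sample string and the helper name escape_count are made up for the illustration.

```python
# Illustration only: ISO-2022-JP is used as a stand-in for the ISO 2022
# switching mechanism. It is NOT the COMPOUND_TEXT encoding itself.

ESC = b"\x1b"


def escape_count(data: bytes) -> int:
    """Count ESC bytes, i.e. how often the stream switches character sets."""
    return data.count(ESC)


# Hypothetical sample: an ASCII prefix followed by three Japanese characters.
text = "X11 日本語"

iso2022 = text.encode("iso2022_jp")
utf8 = text.encode("utf-8")

print("ISO-2022-JP:", iso2022)
print("UTF-8:      ", utf8)

# ISO 2022 is stateful: one escape sequence switches to JIS X 0208 and a
# second one switches back to ASCII at the end of the string.
print("escape sequences in ISO-2022-JP:", escape_count(iso2022))
print("escape sequences in UTF-8:      ", escape_count(utf8))

# Because of that state, a fragment cut out of the middle of the string
# loses its meaning. Bytes 7..8 encode the first Japanese character, but
# without the preceding escape sequence they decode as two ASCII characters.
fragment = iso2022[7:9]
print("fragment decoded out of context:", fragment.decode("iso2022_jp"))

# UTF-8 needs no such state: the bytes of one character decode the same
# way wherever they appear.
first_kanji_utf8 = "日".encode("utf-8")
print("UTF-8 bytes decoded on their own:", first_kanji_utf8.decode("utf-8"))
```

The design point is the last two steps: a decoder for an ISO 2022 stream has to track which character set is currently selected, whereas a UTF-8 decoder can start at any character boundary.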
COMPOUND_TEXT is in my opinion obsolete, and we should start thinking
about a way to smoothly deprecate it from the standard, and clear the way
for replacing it universally with the much simpler and more
practical UTF8_STRING. ISO 2022 is dead, and so should COMPOUND_TEXT be.
At present, UTF8_STRING is allocated in the X.Org registry, but none of the
X standards mentions it yet. A start in this direction was made a while ago
in XFree86, most notably
and it would be nice to see this taken up in the X11 standards.
In particular, one question I am interested in is:
Can we simply allow the use of UTF8_STRING in properties such as
WM_NAME, WM_ICON_NAME and WM_CLIENT_MACHINE in a future version of the
Or is it necessary to introduce separate properties along the lines of
the _NET_WM_NAME, _NET_WM_ICON_NAME, etc. suggested in
for reasonable backwards compatibility? What is the best practice here?
Markus Kuhn, Computer Lab, Univ of Cambridge, GB
http://www.cl.cam.ac.uk/~mgk25/ | __oo_O..O_oo__