[Xcb] [PATCH:xwininfo 0/2] Handle UTF8 window names & EWMH hints
James Cloos
cloos at jhcloos.com
Wed Jun 30 13:09:54 PDT 2010
>>>>> "JC" == James Cloos <cloos at jhcloos.com> writes:
>>>>> "AC" == Alan Coopersmith <alan.coopersmith at oracle.com> writes:
AC> Unfortunately, I couldn't find any existing XCB API for COMPOUND_TEXT
AC> decoding, and reading the spec made my head spin and I had to go lay
AC> down before I puked.
I can understand that! ISO 2022 is, er, /interesting/.
AC> The libX11 code doesn't seem to be a simple function we can copy, but
AC> tied into the whole Xlib i18n module system, so may not help much.
AC> If we did ship with this regression, would it actually cause critical
AC> problems? Hopefully nothing is trying to parse xwininfo output to get
AC> names instead of just getting the properties themselves.
I looked at the Emacs src to prepare a patch to add support for the _NET
NAME props. It will be even easier than I expected.
Given that, the difficulty of supporting COMPOUND_TEXT in xcb, and
presuming that xwininfo(1) is only used by people and not parsed,
I have to agree that we can live with the regression.
The only (possible) issue left is that xwininfo will sometimes fail to
print the WM_NAME at all.
In xterm, which sets WM_NAME(STRING) and WM_LOCALE_NAME(STRING), with
LOCALE set to en_US.UTF-8, I get this from a quick test:
:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && xprop -id 0x1800010|grep NAME;done
WM_LOCALE_NAME(STRING) = "en_US.UTF-8"
WM_ICON_NAME(STRING) = "á"
WM_NAME(STRING) = "á"
WM_LOCALE_NAME(STRING) = "en_US.UTF-8"
WM_ICON_NAME(STRING) = "È©"
WM_NAME(STRING) = "È©"
WM_LOCALE_NAME(STRING) = "en_US.UTF-8"
WM_ICON_NAME(STRING) = "é??"
WM_NAME(STRING) = "é??"
:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && ./xwininfo -id 0x1800010|grep ^x;done
xwininfo: Window id: 0x1800010 "á"
xwininfo: Window id: 0x1800010 "ȩ"
xwininfo: Window id: 0x1800010 "
In urxvt, which sets the _NET props, I get:
:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && ./xwininfo -id 0x600022|grep ^x;done
xwininfo: Window id: 0x600022 "á"
xwininfo: Window id: 0x600022 "ȩ"
xwininfo: Window id: 0x600022 "金"
as expected.
It may be that xterm sends bogus utf8 in the third case; xwininfo may
just need to detect bad utf8. And I cannot tell from the code or the
commit log how xwininfo guesses in the first two cases that the STRING
is actually UTF-8. Is it just because xwininfo’s locale is .UTF-8?
In any case, perhaps it should do something better when the UTF-8 is
not valid?
That it handles the first two cases is a welcome progression, btw.
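To make the "detect bad utf8" idea concrete, here is a minimal sketch of a
well-formedness check in C. The function name utf8_is_valid() is made up for
illustration; it is not taken from xwininfo or from any XCB library, and
whether xterm really sends malformed bytes in the third case is not
established here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of xwininfo or XCB): returns true if the
 * len bytes at s are well-formed UTF-8, rejecting truncated sequences,
 * stray continuation bytes, overlong forms, surrogates and code points
 * above U+10FFFF. */
bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes expected after c */
        unsigned long cp;   /* code point being assembled */
        unsigned long min;  /* smallest code point legal at this length */

        if (c < 0x80) {                     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* 2-byte sequence */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {    /* 3-byte sequence */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {    /* 4-byte sequence */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;   /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF)
            return false;   /* overlong encoding or out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;   /* UTF-16 surrogate, not legal in UTF-8 */

        i += need + 1;
    }
    return true;
}
```

A caller could run this over the property bytes first and fall back to some
escaped form (or to treating the bytes as Latin-1) when it returns false,
rather than printing nothing.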
AC> If someone has a good way to solve this problem or wants to sign up
AC> to write an xcb-util library for COMPOUND_TEXT encoding/decoding, great,
AC> but I don't think I'll be solving it.
JC> On a related note, we should make xprop(1) report UTF8_STRING props
JC> using code similar to what you added here, falling back to the current
JC> output when conversion to the locale cannot work.
AC> Is this sort of thing common enough to make it worthwhile to have a
AC> xcb/util library for property encoding/decoding? Certainly Xlib
AC> handled this for you and hid it from applications, though it built
AC> the huge xlibi18n infrastructure around it.
I do think a simple set of routines would be useful. I wanted to
suggest building on the <wchar.h> api, adding support for compound text.
But the wchar api is itself a royal pain. I’m left suggesting just
xcb_utf8_to_compound_text() and xcb_compound_text_to_utf8(). The
compound text side would be just an (octet?) array; the utf8 side
should be a tuple of an octet array and a token representing the
preferred script (to deal with 2022 vs 10646 (dis-)unification).
But comments on that are extremely welcome!
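As a starting point for comments, here is a rough C sketch of the data
shapes described above. Every name in it (the sketch_ prefix, the script
tokens, the struct layout) is hypothetical; nothing like this exists in XCB
today. The one function with a body covers only the trivial case: compound
text with no escape sequences, which is in its initial state and therefore
ISO 8859-1 in both halves. Anything involving ESC or a C1 control byte is
rejected rather than decoded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical script token for the utf8 side; the values are invented
 * for illustration only. */
typedef enum {
    SKETCH_SCRIPT_UNSPECIFIED = 0,
    SKETCH_SCRIPT_CHINESE,
    SKETCH_SCRIPT_JAPANESE,
    SKETCH_SCRIPT_KOREAN
} sketch_script_t;

/* Hypothetical utf8 side: an octet array plus the preferred script. */
typedef struct {
    uint8_t        *data;   /* caller-provided output buffer */
    size_t          cap;    /* capacity of data in bytes */
    size_t          len;    /* bytes written on success */
    sketch_script_t script; /* preferred script, if known */
} sketch_utf8_text_t;

/* Hypothetical partial decoder: handles only escape-free compound text
 * (ISO 8859-1). Returns 0 on success, -1 if the input would need real
 * ISO 2022 handling or if the output buffer is too small. */
int sketch_compound_text_to_utf8(const uint8_t *ct, size_t ct_len,
                                 sketch_utf8_text_t *out)
{
    size_t o = 0;

    for (size_t i = 0; i < ct_len; i++) {
        uint8_t c = ct[i];

        /* ESC and the C1 range mean escape/control sequences, which
         * this sketch does not implement. */
        if (c == 0x1B || (c >= 0x80 && c <= 0x9F))
            return -1;

        if (c < 0x80) {
            if (o + 1 > out->cap)
                return -1;
            out->data[o++] = c;
        } else {
            /* Latin-1 0xA0..0xFF maps to U+00A0..U+00FF: two bytes. */
            if (o + 2 > out->cap)
                return -1;
            out->data[o++] = (uint8_t)(0xC0 | (c >> 6));
            out->data[o++] = (uint8_t)(0x80 | (c & 0x3F));
        }
    }

    out->len = o;
    out->script = SKETCH_SCRIPT_UNSPECIFIED; /* Latin-1 has no Han ambiguity */
    return 0;
}
```

The reverse direction would take the same struct as input, and the script
token is where a caller would say which 2022 charset to prefer for a
unified Han code point. That part is left out of the sketch.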
-JimC
--
James Cloos <cloos at jhcloos.com> OpenPGP: 1024D/ED7DAEA6