[Xcb] [PATCH:xwininfo 0/2] Handle UTF8 window names & EWMH hints

James Cloos cloos at jhcloos.com
Wed Jun 30 13:09:54 PDT 2010

>>>>> "JC" == James Cloos <cloos at jhcloos.com> writes:
>>>>> "AC" == Alan Coopersmith <alan.coopersmith at oracle.com> writes:

AC> Unfortunately, I couldn't find any existing XCB API for COMPOUND_TEXT
AC> decoding, and reading the spec made my head spin and I had to go lay
AC> down before I puked.

I can understand that!  ISO 2022 is, er, /interesting/.

AC> The libX11 code doesn't seem to be a simple function we can copy, but
AC> tied into the whole Xlib i18n module system, so may not help much.

AC> If we did ship with this regression, would it actually cause critical
AC> problems?   Hopefully nothing is trying to parse xwininfo output to get
AC> names instead of just getting the properties themselves.

I looked at the Emacs src to prepare a patch to add support for the _NET
NAME props.  It will be even easier than I expected.

Given that, the difficulty of supporting COMPOUND_TEXT in xcb, and
presuming that xwininfo(1) is only used by people and not parsed,
I have to agree that we can live with the regression.

The only (possible) issue left is that xwininfo will sometimes fail to
print the WM_NAME at all.

In xterm, which sets WM_NAME(STRING) and WM_LOCALE_NAME(STRING), with
LOCALE set to en_US.UTF-8, I get this from a quick test:

:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && xprop -id 0x1800010|grep NAME;done

:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && ./xwininfo -id 0x1800010|grep ^x;done
xwininfo: Window id: 0x1800010 "á"
xwininfo: Window id: 0x1800010 "ȩ"
xwininfo: Window id: 0x1800010 "

In urxvt, which sets the _NET props, I get:

:; for ij in á ȩ 金;do printf "\033]0;${ij}\a" && ./xwininfo -id 0x600022|grep ^x;done
xwininfo: Window id: 0x600022 "á"
xwininfo: Window id: 0x600022 "ȩ"
xwininfo: Window id: 0x600022 "金"

as expected.

It may be that xterm sends bogus utf8 in the third case; xwininfo may
just need to detect bad utf8.  And I cannot tell from the code or the
commit log how xwininfo guesses in the first two cases that the STRING
is actually UTF-8.  Is it just because xwininfo’s locale is .UTF-8?

In any case, perhaps it should do something better when the UTF-8 is
not valid?
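
As a minimal sketch of what "detect bad utf8" could look like, assuming
xwininfo has the raw property bytes and their length at hand: the
function name utf8_is_valid() is hypothetical and is not part of
xwininfo or xcb.  Whether xterm really sends invalid bytes in the third
case is not established above; this only shows the kind of check meant.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an existing xwininfo or xcb function.
 * Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Rejects truncated sequences, overlong forms, UTF-16 surrogates
 * and code points above U+10FFFF. */
int utf8_is_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* number of continuation bytes expected */
        uint32_t cp, min;   /* decoded code point, smallest legal value */

        if (c < 0x80) {                 /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;   /* stray continuation byte or 0xF8..0xFF */
        }

        if (len - i - 1 < need)
            return 0;   /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;   /* overlong, out of range, or surrogate */

        i += need + 1;
    }
    return 1;
}
```

A caller could fall back to printing the bytes escaped, or to treating
the STRING as Latin-1, whenever this returns 0, rather than emitting a
truncated name.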

That it handles the first two cases is a welcome progression, btw.

AC> If someone has a good way to solve this problem or wants to sign up
AC> to write an xcb-util library for COMPOUND_TEXT encoding/decoding, great,
AC> but I don't think I'll be solving it.

JC> On a related note, we should make xprop(1) report UTF8_STRING props
JC> using code similar to what you added here, falling back to the current
JC> output when conversion to the locale cannot work.

AC> Is this sort of thing common enough to make it worthwhile to have a
AC> xcb/util library for property encoding/decoding?   Certainly Xlib
AC> handled this for you and hid it from applications, though it built
AC> the huge xlibi18n infrastructure around it.

I do think a simple set of routines would be useful.  I wanted to
suggest building on the <wchar.h> api, adding support for compound_text.

But the wchar api is itself a royal pain.  I’m left suggesting just
xcb_utf8_to_compound_text() and xcb_compound_text_to_utf8().  The
compound text side would be just an (octet?) array; the utf8 side
should be a tuple of an octet array and a token representing the
preferred script (to deal with 2022 vs 10646 (dis-)unification).
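
A rough sketch of the shape that could take, under loud assumptions:
no such functions exist in xcb today, struct utf8_text and its script
field are hypothetical, and the form of the script token is left open.
The body below only handles compound text that stays in its initial
state (ISO 8859-1 in GL/GR, no escape sequences) and refuses anything
else, so it illustrates the interface rather than a real ISO 2022
decoder.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "tuple" for the UTF-8 side: an octet array plus a token
 * naming the preferred script.  The token format is an assumption. */
struct utf8_text {
    unsigned char *octets;  /* malloc()ed, not NUL-terminated */
    size_t len;
    const char *script;     /* NULL when no preference is known */
};

/* Reverse direction, declared only; not implemented in this sketch. */
int xcb_utf8_to_compound_text(const struct utf8_text *in,
                              unsigned char **ct, size_t *ct_len);

/* Returns 0 on success, -1 on allocation failure or unsupported input.
 * Only the initial compound text state is handled: bytes below 0x80
 * pass through and 0xA0..0xFF are read as ISO 8859-1.  ESC (0x1B) and
 * the C1 range 0x80..0x9F are rejected, since handling them would need
 * the full designation machinery. */
int xcb_compound_text_to_utf8(const unsigned char *ct, size_t ct_len,
                              struct utf8_text *out)
{
    /* Worst case is two UTF-8 octets per input octet. */
    unsigned char *buf = malloc(ct_len * 2 + 1);
    size_t n = 0;

    if (buf == NULL)
        return -1;

    for (size_t i = 0; i < ct_len; i++) {
        unsigned char c = ct[i];

        if (c == 0x1B || (c >= 0x80 && c <= 0x9F)) {
            free(buf);
            return -1;
        }
        if (c < 0x80) {
            buf[n++] = c;
        } else {
            /* U+00A0..U+00FF encode as two octets in UTF-8. */
            buf[n++] = (unsigned char)(0xC0 | (c >> 6));
            buf[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }

    out->octets = buf;
    out->len = n;
    out->script = NULL;     /* Latin-1 input gives no script hint */
    return 0;
}
```

The script token would only start to matter once escape sequences
selecting CJK sets are decoded, which is exactly the part left out here.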

But comments on that are extremely welcome!

James Cloos <cloos at jhcloos.com>         OpenPGP: 1024D/ED7DAEA6
