[ANNOUNCE] libxkbcommon-0.4.1

Thu Mar 27 15:41:25 PDT 2014

On Thu, Mar 27, 2014 at 09:04:07PM +0100, David Herrmann wrote:
> Hi

Hi David

> On Thu, Mar 27, 2014 at 8:22 PM, Ran Benita <ran234 at gmail.com> wrote:
> > - Added two new functions, xkb_state_key_get_utf{8,32}(). They
> >   combine the operations of xkb_state_key_get_syms() and
> >   xkb_keysym_to_utf{8,32}(), and provide a nicer interface for it
> >   (espcially for multiple-keysyms-per-level).
> 
> Slightly off-topic, but looking at the code, you ignore any multi-sym
> values (nsyms != 1). Users of that API might rely on that behavior, so
> have we at some point defined what exactly multi-sym means? Because if
> we add more and more APIs and if the user-base grows, we might at some
> point be unable to make use of multi-symbol behavior. I'm still not
> sure whether nsyms > 1 just means multiple sequential keysyms, or
> whether they should be handled as _one_ atomic combined keysym?

First, the reason I added these functions is that
https://bugs.freedesktop.org/show_bug.cgi?id=75892 made it necessary, I
would otherwise like to avoid adding new API at this point. But given
the situation, I tried to make the best of it.

I only ignore multiple-keysyms for the utf32 function -- I only added
it to match the existing xkb_keysym_to_utf32() function. I don't think
it makes sense to encode multiple-keysyms into a UTF-32 string, nobody
uses it. A better name would be xkb_keysym_to_unicode_codepoint() or
whatever, but I wanted to keep the existing name.

For the utf8 function, it is handled correctly, and is actually much
easier to use for multiple-keysyms than the old function. It builds the
string internally and finally ensures it's valid UTF-8.

Another advantage of adding new functions is that existing users don't
break due to the new behavior, they must opt-in.

Regarding intended use-case for multiple-keysyms, I consider it mainly
to be for sequences with combining characters - not everything has
precomposed codepoints, so if you want one of these, you don't have a
way to do it with single-keysym, but it still conceptually fits in a
keymap. However for the original intent you have to ask Daniel.

> > - The xkb_state_key_get_utf{8,32}() functions now apply Control
> >   transformation: when the Control modifier is active, the string
> >   is converted to an appropriate control character.
> >   This matches the behavior of libX11's XLookupString(3), and
> >   required by the XKB specification:
> >   http://www.x.org/releases/current/doc/kbproto/xkbproto.html#Interpreting_the_Control_Modifier
> >
> >   https://bugs.freedesktop.org/show_bug.cgi?id=75892
> 
> So xkb_state_key_get_utf8(state, code) !=
> xkb_keysym_to_utf8(xkb_state_key_get_one_sym(state, code))? (at least
> for ctrl+<char>)

Yes.

> Could we at least add a note/warning to the documentation? I cannot
> find anything here:
> http://xkbcommon.org/doc/current/group__state.html#ga0774b424063b45c88ec0354c77f9a247
> 
> For instance, for libtsm, I accept combined modifier+keysym+ucs4string
> input from the GUI layer and transform that internally. If the upper
> layer uses xkb_state_key_get_utf32() instead of xkb_keysym_to_utf32(),
> the logic will force the control-seq even though it might be mapped
> differently by the vte-layer (for instance, there's stuff like
> shifting values by 64 if ALT is pressed, and more legacy crap we need
> to support for eternity). Sure, most libtsm code uses the keysym, but
> for glyph transformation we have to use the ucs4 value. Imagine the
> ctrl+c input was shifted by the term-layer, you still end up with the
> \003 glyph, because the ucs4 string was mapped by xkbcommon.

Sure, the old functions are still useful for getting the "raw"
translation if you want it. And you're right about the docs - I'll add
some hopefully-not-too-confusing details instead of just "prefer the new
ones". Full details are in the bugs and commit msgs but of course I
don't expect anyone to read that :)

> I'm not saying the patch is wrong, in fact, the layout-search logic is
> actually what you wrote for kmscon and I appreciate that. I'm just
> saying that it's a really subtle API difference that can introduce
> weird bugs. Lets see how it works out, but if people start using
> xkb_state_key_get_utf32(), I might send a patch adding an xkb-state
> flag that disables this transformation. Or just force users to not use
> it (which would be unfortunate).

A flag makes sense; I try to to add stuff when someone really needs it,
not preemptively. Btw, xkb_state_new() doesn't take a flags arg - argh.

But in general this stuff should be the default, we initially tried to
avoid these nonsensical US-centric behavior (let's say intentionally :),
but both the keymaps and existing users actually depend on them, like
the capitalization and Control thing (and they are also there in the
spec).

> Btw., same is true for the implicit caps-lock magic in
> xkb_state_key_get_one_sym(). I'm now quite confused whether
> xkb_state_key_get_syms() users have to do caps-lock handling
> explicitly? Or is that done by keymaps?

Currently to get the implicit capitalization with get_syms(), you have
to do this:

    int nsyms;
    const xkb_keysym_t *syms;
    xkb_keysym_t sym;

    nsyms = xkb_state_key_get_syms(..., &syms);
    if (nsyms == 1) {
        sym = xkb_state_key_get_one_sym(...);
        syms = &sym;
    }

I imagine the disgust, but given the set of constraints we had I
couldn't think of any way to make this work. New API is still possible,
but then we'd have *three* ways to get keysyms...

As I may have mentioned, I wanted to change the *keymaps* so the
difference doesn't matter here. I still plan to send some patches for
easy cases, but fixing other cases would require major xkeyboard-config
surgery. So we're stuck with it.

> Anyhow, thanks for the release!

Thanks for the feedback!
Ran

> David