Questions and thoughts about input method protocol

Yichao Yu yyc1992 at
Wed Jan 30 16:54:35 PST 2013


On Wed, Jan 30, 2013 at 4:09 PM, Pekka Vuorela <pvuorela at> wrote:
> On 29.01.2013 04:49, Yichao Yu wrote:
>> Hi,
>> With the comment on the recent patches for the input method protocol,
>> it seems that we are finally on the way to support cursor following.
>> However I still have some questions about the protocol (and its
>> limitations).
>> 1, Is there any plan to support xwayland?
>> I believe it is important to support using input methods in X clients
>> running on xwayland. IMHO, the input method can still use xim or its
>> own protocol to get key events from and send input results (as well as
>> preedit etc.) to the client running on xwayland with no problem.
>> However, I cannot see a perfect solution to make cursor following
>> work using the proposed way to locate the input overlay surface.
>> For applications using xim, maybe it is possible to let xwayland handle
>> the events and then forward them to the text-model. However, first of
>> all, this cannot work for X clients talking to the input method using a
>> private protocol, and it will be a HUUUUGE regression if we force all X
>> clients to use the broken xim (application freezes, wrong support for
>> cursor following and preedit etc.). Moreover, since xim support is so
>> different in different applications, the input method sometimes has
>> to do some hacks on xim, which is not very likely to be something we
>> want to add to xwayland.
>> (Maybe adding some interface to interact with xwayland to set cursor
>> position on certain x window surfaces?)
> X based input methods are using X windowing, so wouldn't they just continue
> working for X apps? (Provided that xwayland has the ability to put windows
> into absolute positions). Translation between wayland text <-> XIM doesn't
> seem worthwhile to me.

I see. So an input method based on X windows should still be able to
work on xwayland. It's good to know that xwayland is able to put a
window (which means an X window, right?) at an absolute position (and
will it also be able to know the size of the screens?). However, in
order to use this, I guess an input method that supports wayland
would still have to draw its UI for X clients using X instead of
native wayland, making it harder to provide a uniform look for all
clients. This applies not only to the input overlay window but also to
the on-screen keyboard, since the compositor may have no knowledge of
how to keep the virtual keyboard away from the text cursor. Moreover,
it will require a lot of additional effort if the UI is provided by a
different process or a completely different program [1][2].


>> 2, Is it possible for the input method to know anything about the client?
>> Some famous (Chinese) input methods (on Mac and Windows) support
>> so-called context awareness; in other words, the input method
>> uses some information about the client to determine the candidate
>> word list (and its order). This may not be that useful for Latin-based
>> languages (although it may also be good if you want to provide
>> spelling hints) but if you are facing a language with 3000-5000
>> frequently used characters and 10-100 times that many frequently
>> used words, depending on your context, this shouldn't be ridiculous
>> at all.
>> Currently, although I don't know of any input method on linux that supports
>> context awareness, it is possible to do this under X since all the
>> necessary information is accessible to the input method. These include
>> all general information the window system and a plugin running in the
>> client's process can know, including (most important and useful)
>> window titles (with other WM related properties like icon names,
>> application names etc.), pid's (plus host's for X) and window id's
>> etc. The pid's and window id's are also very useful for getting more
>> information from the underlying system (/proc for example) about the
>> client (e.g. command line arguments) and can also be used to provide
>> per program or per window input state for some programs (Fcitx supports
>> both.)
>> Right now I don't think there is any way to get this information from
>> the input method protocol. It will be a big regression (not as big as
>> not supporting cursor following in x clients though) if this cannot be
>> supported in wayland.
>> NOTE: The "context type" added in the recent patches may also be
>> helpful on this but they are different. It is indeed helpful for input
>> method to know the user is typing in a url/search bar instead of a
>> normal text entry but the stuff you may want to search may be very
>> different on amazon and arxiv.
> Not wanting to ridicule chinese input needs, but fetching window title,
> executable name and/or application parameters to alter behavior is a _huge_
> hack. Going deeper into knowing what url is visible in a browser is even
> worse. Does not take much effort to break that kind of functionality.

Well, technically I don't think it is any more of a hack than an
input method based on a language model that makes use of surrounding
text, previous input results, and results from the internet when
necessary. In fact, I fully agree that every good Chinese input method
is, by all means, a __HUUUGE__ hack. That is just the nature of the
language, and taking the window title into account is no more
special than any of the existing hacks. It may seem hacky if you think
of the window title as a text string far away from the text cursor.
However, the window title is supposed to be, and in most cases
(including the most important case: browsers) really is, a short and
accurate summary of what's going on inside the window (including the
text field where the input is happening). As long as the program sets
the window title to a reasonable value, this functionality will not
break.

And whether or not you think using window titles or any kind of free
text context is a hack, providing per-program input state using the pid
of the client shouldn't be considered a hack at all.
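
As a rough illustration of per-program input state keyed by pid (this
is not code from fcitx or any other input method; the class names, the
fields and the "us" default are all hypothetical), something like the
following would be enough:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InputState:
    """Hypothetical state an input method might remember for one client."""
    active: bool = False   # whether the input method is switched on
    layout: str = "us"     # assumed name of the current layout / engine


@dataclass
class PerProgramState:
    """Keeps one InputState per client pid (illustrative only)."""
    default: InputState = field(default_factory=InputState)
    by_pid: Dict[int, InputState] = field(default_factory=dict)

    def state_for(self, pid: int) -> InputState:
        # Create the state lazily, copying the defaults the first
        # time a given pid is seen.
        if pid not in self.by_pid:
            self.by_pid[pid] = InputState(self.default.active,
                                          self.default.layout)
        return self.by_pid[pid]

    def forget(self, pid: int) -> None:
        # Drop the state when the client goes away.
        self.by_pid.pop(pid, None)
```

The only thing the input method needs from the window system here is
the pid itself; everything else stays on the input method side.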

By mentioning amazon and arxiv, I didn't really mean knowing which URL
is visible (although that would also be helpful if it is
available) but rather determining the context from the title of the
window (in this case, the title of the webpage).

> I would try to find better ways to achieve targeted goals. One feature I've
> myself been thinking is virtual keyboard used with messaging app knowing
> what language, or type of language, the recipient is speaking and adjusting
> prediction/layout based on that. That could be done by having locale in text
> editor's state, but alternatively could also be editor setting a globally
> unique identifier (integer, string?) as state for text fields, after which
> text input side could learn what language got used. Similarly for chinese
> input such mechanism could be used to remember preferences. Out-of-the box
> amazon and arxiv wouldn't make difference, but after some use they would. To
> support this, the toolkits and applications would also need some
> enhancements, though.

First of all, I think there is already a good solution to your
problem. Agree or not, a keyboard layout (in this case) is just a
specialized and simplified input method. As you can see, the virtual
keyboard in wayland is implemented as an input method (in fact, is
there another way in wayland for a program to send key events to
another??), fcitx has supported keyboard layouts as input methods
since around one year ago[3], with this built in since the release
around May 2012, and ibus 1.5 also ships with a broken keyboard layout
setting functionality. Therefore, the information you want is already
known by the virtual keyboard provider (or from the current keyboard
layout setting) and there is no need to query anything from the client.


One major difference between a language code and input context is
that language codes are finite and standardized, but input context is
not. A reasonable use of the context (window title) is to relate it
to some areas (e.g. shopping and science for the example I
mentioned), and load/use/weight different dictionaries accordingly.
This is really just dirty work that needs to be updated frequently,
is impossible to standardize, and is definitely not something you
would want to put into every program/toolkit. Therefore, the client
would generally have no idea (especially a web browser, whose context
varies a lot) how to process the context (title, url etc.) other than
passing it directly to the input method.
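
To make the "relate the title to some areas and weight dictionaries"
idea concrete, here is a minimal sketch. The keyword table, the domain
names and the boost factor are made up for illustration; a real input
method would need something much richer and frequently updated.

```python
from typing import Dict, Tuple

# Hypothetical keyword table mapping a domain dictionary to title keywords.
DOMAIN_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "shopping": ("amazon", "cart", "checkout"),
    "science": ("arxiv", "abstract", "preprint"),
}


def dictionary_weights(window_title: str, boost: float = 2.0) -> Dict[str, float]:
    """Return one weight per domain dictionary.

    A dictionary gets `boost` if any of its keywords appears in the
    window title (case-insensitive substring match), otherwise 1.0.
    """
    title = window_title.lower()
    weights: Dict[str, float] = {}
    for domain, keywords in DOMAIN_KEYWORDS.items():
        matched = any(word in title for word in keywords)
        weights[domain] = boost if matched else 1.0
    return weights
```

All of this lives in the input method; the client only has to hand
over the title string unchanged.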

To support such functionality, inspired by csslayer's
suggestion[4], I think it would be helpful to add something like an
input_context::context_info(key, value) event that notifies the
input method about some information on the client (and changes to it).
Events with key="window_title", "pid", "program_name" can be
sent automatically by the compositor, so clients that don't want to
provide additional information have nothing to worry about, and those
that want to support input methods can just send additional
key-value pairs on the corresponding client-side interface. If the
compositor ever has more knowledge about the client that the
input method may be interested in, it can easily be added here too.
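
A sketch of how the input method side could consume such events is
below. This is not an existing wayland interface: the InputContext
class and its methods are hypothetical stand-ins for the proposed
context_info(key, value) event, and the "url" key is only an example
of an extra pair a client might choose to send.

```python
from typing import Callable, Dict, List

Listener = Callable[[str, str], None]


class InputContext:
    """Input-method-side view of one client (hypothetical interface)."""

    def __init__(self) -> None:
        self.info: Dict[str, str] = {}
        self._listeners: List[Listener] = []

    def on_change(self, listener: Listener) -> None:
        # Register a callback that runs whenever a value changes.
        self._listeners.append(listener)

    def context_info(self, key: str, value: str) -> None:
        """Handle one context_info(key, value) event.

        Repeated events carrying an unchanged value are ignored, so
        listeners only see real changes.
        """
        if self.info.get(key) == value:
            return
        self.info[key] = value
        for listener in self._listeners:
            listener(key, value)
```

The input method does not need to know whether a pair came from the
compositor or from the client; unknown keys are simply stored and can
be ignored by listeners that don't care about them.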


P.S. As for requiring input method support that does not work out of
the box, it is just not practical in the Linux world. Most people
outside the CJK[5] community never care about input methods and most
of them don't even know what an input method is (Oh, yes sure, you
guys are perfect exceptions). If a feature that developers don't care
about (or even worse, have no idea what it is) requires additional
effort to implement, it will never be done. You can have a look at how
XIM is broken in many applications[6], how it hasn't been improved in
almost 20 years[7], how bugs don't get fixed or even looked at by any
developer for more than 10 years (REMAIN NEW!! WITH WORKING
PATCH!!)[8]. We really hope the input method protocol in wayland can
finally give us a working input method, without having to worry about
application freezes, missing key events or other weird problems[9],
without having to maintain input method modules for 4 toolkits[10][11]
and without any regression or missing functionality.


Any ideas for my questions 3,4,5? =)

Yichao Yu

>> 6, Some random stuff of the current interface.
>> There seems to be a password context type. I think normally a password
>> field will not have input context or is that for using virtual
>> keyboard in password field?
> All text fields, including password ones, should generally be using input
> methods. Virtual keyboard is one case, but there could be also some other
> kind of composing even for hardware keyboard. Think of long press to get
> different characters etc.
>> There seems to be an empty text_model::set_preedit request for the
>> client. Shouldn't this be fully controlled by the input method?? (Plus
>> there isn't a corresponding event on the input method side.... anyway
>> it's just weird for me....)
> Was discussed last October:
> I suggested and Jan Arne agreed on removing it. Pending work, I suppose.