[PATCH libxkbcommon 1/4] compose: add xkbcommon-compose - API
David Herrmann
dh.herrmann at gmail.com
Sun Sep 14 23:41:37 PDT 2014
Hi
On Sun, Sep 14, 2014 at 11:05 PM, Ran Benita <ran234 at gmail.com> wrote:
> xkbcommon-compose is a Compose implementation for xkbcommon. It mostly
> behaves like libX11's Compose, but the support is somewhat low-level and
> is not transparent like in libX11. The user must add some supporting code
> in order to utilize it.
>
> The intended audience are users who use xkbcommon but not a full-blown
> input method. With this they can add Compose support in a straightforward
> manner, so they have a fairly complete keyboard input for Latin-like
> languages at least.
>
> See the header documentation for details.
>
> Signed-off-by: Ran Benita <ran234 at gmail.com>
> ---
> xkbcommon/xkbcommon-compose.h | 457 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 457 insertions(+)
> create mode 100644 xkbcommon/xkbcommon-compose.h
>
> diff --git a/xkbcommon/xkbcommon-compose.h b/xkbcommon/xkbcommon-compose.h
> new file mode 100644
> index 0000000..ed35250
> --- /dev/null
> +++ b/xkbcommon/xkbcommon-compose.h
> @@ -0,0 +1,457 @@
> +/*
> + * Copyright © 2013 Ran Benita
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef _XKBCOMMON_COMPOSE_H
> +#define _XKBCOMMON_COMPOSE_H
> +
> +#include <xkbcommon/xkbcommon.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file
> + * libxkbcommon Compose API - support for Compose and dead-keys.
> + */
> +
> +/**
> + * @defgroup compose Compose and dead-keys support
> + * Support for Compose and dead-keys.
> + * @since TBD
> + *
> + * @{
> + */
> +
> +/**
> + * @page compose-overview Overview
> + * @parblock
> + *
> + * Compose and dead-keys are a common feature of many keyboard input
> + * systems. They extend the range of the keysyms that can be produced
> + * directly from a keyboard by using a sequence of key strokes, instead
> + * of just one.
> + *
> + * Here are some example sequences, in the libX11 Compose file format:
> + *
> + * <dead_acute> <a> : "á" aacute # LATIN SMALL LETTER A WITH ACUTE
> + * <Multi_key> <A> <T> : "@" at # COMMERCIAL AT
> + *
> + * When the user presses a key which produces the \<dead_acute\> keysym,
> + * nothing initially happens (thus the key is dubbed a "dead-key"). But
> + * when the user enters \<a\>, "á" is "composed", in place of "a". If
> + * instead the user had entered a keysym which does not follow
> + * \<dead_acute\> in any compose sequence, the sequence is said to be
> + * "cancelled".
> + *
> + * Compose files define many such sequences. For a description of the
> + * common file format for Compose files, see the Compose(5) man page.
> + *
> + * A successfully-composed sequence has two results: a keysym and a UTF-8
> + * string. At least one of the two is defined for each sequence. If only
> + * a keysym is given, the keysym's string representation is used for the
> + * result string (using xkb_keysym_to_utf8()).
> + *
> + * This library provides low-level support for Compose file parsing and
> + * processing. Higher-level APIs (such as libX11's Xutf8LookupString(3))
> + * may be built upon it, or it can be used directly.
> + *
> + * @endparblock
> + */
> +
> +/**
> + * @page compose-conflicting Conflicting Sequences
> + * @parblock
> + *
> + * To avoid ambiguity, a sequence is not allowed to be a prefix of another.
> + * In such a case, the conflict is resolved thus:
> + *
> + * 1. A longer sequence overrides a shorter one.
> + * 2. An equal sequence overrides an existing one.
> + * 3. A shorter sequence does not override a longer one.
> + *
> + * Sequences of length 1 are allowed, although they are not common.
> + *
> + * @endparblock
> + */
> +
> +/**
> + * @page compose-cancellation Cancellation Behavior
> + * @parblock
> + *
> + * What should happen when a sequence is cancelled? For example, consider
> + * there are only the above sequences, and the input keysyms are
> + * \<dead_acute\> \<b\>. There are a few approaches:
> + *
> + * 1. Swallow the cancelling keysym; that is, no keysym is produced.
> + * This is the approach taken by libX11.
> + * 2. Let the cancelling keysym through; that is, \<b\> is produced.
> + * 3. Replay the entire sequence; that is, \<dead_acute\> \<b\> is produced.
> + * This is the approach taken by Microsoft Windows (approximately;
> + * instead of \<dead_acute\>, the underlying key is used. This is
> + * difficult to simulate with XKB keymaps).
> + *
> + * You can program whichever approach best fits users' expectations.
Hm, implementing 3) is a pain, as the caller has to track the keysyms
separately: your compose API does not provide a way to retrieve the
sequence that was being matched when it failed. But given that we have
no dead-key => normal-key conversion right now, it's probably fine. If
we want it, we can add an API for both later on (assuming a trivial
keysym conversion from dead_key => normal is possible).
I also don't understand why dead-keys are tied to keymaps. Yes, the
"nodeadkeys" variant is part of the keymap, but the keymap never
defines the behavior of dead-keys. We could simply provide a lookup
table that converts XKB_KEY_dead_xyz to XKB_KEY_xyz, right?
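Roughly what I have in mind, as a sketch only: the keysym values are copied from xkbcommon-keysyms.h and redefined under made-up DK_* names so the snippet stands alone, the table is deliberately incomplete, and dead_to_spacing() is a hypothetical helper, not an existing xkbcommon function.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the XKB_KEY_* constants (same values as in
 * xkbcommon-keysyms.h), so the sketch compiles without the library. */
#define DK_NoSymbol        0x0000
#define DK_asciicircum     0x005e
#define DK_grave           0x0060
#define DK_asciitilde      0x007e
#define DK_diaeresis       0x00a8
#define DK_acute           0x00b4
#define DK_cedilla         0x00b8
#define DK_dead_grave      0xfe50
#define DK_dead_acute      0xfe51
#define DK_dead_circumflex 0xfe52
#define DK_dead_tilde      0xfe53
#define DK_dead_diaeresis  0xfe57
#define DK_dead_cedilla    0xfe5b

struct dead_key_map {
    uint32_t dead;    /* the dead-key keysym */
    uint32_t spacing; /* its ordinary (spacing) counterpart */
};

/* Only a handful of entries, for illustration. */
static const struct dead_key_map dead_key_table[] = {
    { DK_dead_grave,      DK_grave       },
    { DK_dead_acute,      DK_acute       },
    { DK_dead_circumflex, DK_asciicircum },
    { DK_dead_tilde,      DK_asciitilde  },
    { DK_dead_diaeresis,  DK_diaeresis   },
    { DK_dead_cedilla,    DK_cedilla     },
};

/* Hypothetical helper: returns the spacing keysym for a dead key,
 * or DK_NoSymbol if the keysym is not in the table. */
static uint32_t
dead_to_spacing(uint32_t keysym)
{
    size_t n = sizeof(dead_key_table) / sizeof(dead_key_table[0]);

    for (size_t i = 0; i < n; i++)
        if (dead_key_table[i].dead == keysym)
            return dead_key_table[i].spacing;

    return DK_NoSymbol;
}
```

With something like this, a caller implementing approach 3) could replay a cancelled \<dead_acute\> as the plain acute keysym without consulting the keymap at all.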
> + *
> + * @endparblock
> + */
> +
> +/**
> + * @struct xkb_compose_table
> + * Opaque Compose table object.
> + *
> + * The compose table holds the definitions of the Compose sequences, as
> + * gathered from Compose files. It is immutable.
> + */
> +struct xkb_compose_table;
> +
> +/**
> + * @struct xkb_compose_state
> + * Opaque Compose state object.
> + *
> + * The compose state maintains state for compose sequence matching, such
> + * as which possible sequences are being matched, and the position within
> + * these sequences. It acts as a simple state machine wherein keysyms are
> + * the input, and composed keysyms and strings are the output.
> + *
> + * The compose state is usually associated with a keyboard device.
> + */
> +struct xkb_compose_state;
> +
> +/** Flags affecting Compose file compilation. */
> +enum xkb_compose_compile_flags {
> + /** Do not apply any flags. */
> + XKB_COMPOSE_COMPILE_NO_FLAGS = 0
> +};
> +
> +/** The recognized Compose file formats. */
> +enum xkb_compose_format {
> + /** The classic libX11 Compose text format, described in Compose(5). */
> + XKB_COMPOSE_FORMAT_TEXT_V1 = 1
> +};
> +
> +/**
> + * @page compose-locale Compose Locale
> + * @parblock
> + *
> + * Compose files are locale dependent:
> + * - Compose files are written for a locale, and the locale is used when
> + * searching for the appropriate file to use.
> + * - Compose files may reference the locale internally, with directives
> + * such as %L.
> + * As such, functions like xkb_compose_table_new_from_locale() require
> + * a @p locale parameter. This will usually be the current locale (see
> + * locale(7) for more details). You may also want to allow the user to
> + * explicitly configure it, so they can use the Compose file of a given
> + * locale, but not use that locale for other things.
> + *
> + * You may query the current locale as follows:
> + * @code
> + * const char *locale;
> + * locale = setlocale(LC_CTYPE, NULL);
> + * @endcode
> + *
> + * This will only give useful results if the program had previously set
> + * the current locale using setlocale(3), with LC_CTYPE or LC_ALL and a
> + * non-NULL argument.
> + *
> + * If you prefer not to use the locale system of the C runtime library,
> + * you may nevertheless obtain the user's locale directly using
> + * environment variables, as described in locale(7). For example,
> + * @code
> + * locale = getenv("LC_ALL");
> + * if (!locale)
> + * locale = getenv("LC_CTYPE");
> + * if (!locale)
> + * locale = getenv("LANG");
> + * if (!locale)
> + * locale = "C";
> + * @endcode
> + *
> + * Note that some locales supported by the C standard library may not
> + * have a Compose file assigned.
> + *
> + * @endparblock
> + */
> +
> +/**
> + * Create a compose table for a given locale.
> + *
> + * The locale is used for searching the file-system for an appropriate
> + * Compose file. The search order is described in Compose(5). It is
> + * affected by the following environment variables:
> + * XCOMPOSEFILE, HOME, XLOCALEDIR.
> + *
> + * @param context
> + * The library context in which to create the compose table.
> + * @param locale
> + * The current locale. See @ref compose-locale.
> + * @param flags
> + * Optional flags for the compose table, or 0.
> + *
> + * @returns A compose table for the given locale, or NULL if the
> + * compilation failed or a Compose file was not found.
> + *
> + * @memberof xkb_compose_table
> + */
> +struct xkb_compose_table *
> +xkb_compose_table_new_from_locale(struct xkb_context *context,
> + const char *locale,
> + enum xkb_compose_compile_flags flags);
> +
> +/**
> + * Create a new compose table from a Compose file.
> + *
> + * @param context
> + * The library context in which to create the compose table.
> + * @param file
> + * The Compose file to compile.
> + * @param locale
> + * The current locale. See @ref compose-locale.
> + * @param format
> + * The text format of the Compose file to compile.
> + * @param flags
> + * Optional flags for the compose table, or 0.
> + *
> + * @returns A compose table compiled from the given file, or NULL if
> + * the compilation failed.
> + *
> + * @memberof xkb_compose_table
> + */
> +struct xkb_compose_table *
> +xkb_compose_table_new_from_file(struct xkb_context *context,
> + FILE *file,
> + const char *locale,
> + enum xkb_compose_format format,
> + enum xkb_compose_compile_flags flags);
> +
> +/**
> + * Create a new compose table from a memory buffer.
> + *
> + * This is just like xkb_compose_table_new_from_file(), but instead of
> + * a file, gets the table as one enormous string.
> + *
> + * @see xkb_compose_table_new_from_file()
> + * @memberof xkb_compose_table
> + */
> +struct xkb_compose_table *
> +xkb_compose_table_new_from_buffer(struct xkb_context *context,
> + const char *buffer, size_t length,
> + const char *locale,
> + enum xkb_compose_format format,
> + enum xkb_compose_compile_flags flags);
_from_buffer() right from the beginning, yey!
> +
> +/**
> + * Take a new reference on a compose table.
> + *
> + * @returns The passed in object.
> + *
> + * @memberof xkb_compose_table
> + */
> +struct xkb_compose_table *
> +xkb_compose_table_ref(struct xkb_compose_table *table);
> +
> +/**
> + * Release a reference on a compose table, and possibly free it.
> + *
> + * @param table The object. If it is NULL, this function does nothing.
> + *
> + * @memberof xkb_compose_table
> + */
> +void
> +xkb_compose_table_unref(struct xkb_compose_table *table);
> +
> +/** Flags for compose state creation. */
> +enum xkb_compose_state_flags {
> + /** Do not apply any flags. */
> + XKB_COMPOSE_STATE_NO_FLAGS = 0
> +};
> +
> +/**
> + * Create a new compose state object.
> + *
> + * @param table
> + * The compose table the state will use.
> + * @param flags
> + * Optional flags for the compose state, or 0.
> + *
> + * @returns A new compose state, or NULL on failure.
> + *
> + * @memberof xkb_compose_state
> + */
> +struct xkb_compose_state *
> +xkb_compose_state_new(struct xkb_compose_table *table,
> + enum xkb_compose_state_flags flags);
> +
> +/**
> + * Take a new reference on a compose state object.
> + *
> + * @returns The passed in object.
> + *
> + * @memberof xkb_compose_state
> + */
> +struct xkb_compose_state *
> +xkb_compose_state_ref(struct xkb_compose_state *state);
> +
> +/**
> + * Release a reference on a compose state object, and possibly free it.
> + *
> + * @param state The object. If NULL, do nothing.
> + *
> + * @memberof xkb_compose_state
> + */
> +void
> +xkb_compose_state_unref(struct xkb_compose_state *state);
> +
> +/**
> + * Get the compose table which a compose state object is using.
> + *
> + * @returns The compose table which was passed to xkb_compose_state_new()
> + * when creating this state object.
> + *
> + * This function does not take a new reference on the compose table; you
> + * must explicitly reference it yourself if you plan to use it beyond the
> + * lifetime of the state.
> + *
> + * @memberof xkb_compose_state
> + */
> +struct xkb_compose_table *
> +xkb_compose_state_get_compose_table(struct xkb_compose_state *state);
> +
> +/** Status of the Compose sequence state machine. */
> +enum xkb_compose_status {
> + /** The initial state; no sequence has started yet. */
> + XKB_COMPOSE_NOTHING,
> + /** In the middle of a sequence. */
> + XKB_COMPOSE_COMPOSING,
> + /** A complete sequence has been matched. */
> + XKB_COMPOSE_COMPOSED,
> + /** The last sequence was cancelled due to an invalid keysym. */
> + XKB_COMPOSE_CANCELLED
It is unclear what happens if a keysym is fed that is _not_ part of
any compose sequence (that is, most keys): 'context' is 0, but no
matching compose node is found. I assume this yields
XKB_COMPOSE_NOTHING, but the comment here does not say so. Maybe the
_feed() or _get_status() description should mention how keysyms are
treated that are not part of any compose sequence and are fed while
no compose sequence is active. I assume we do *not* return
XKB_COMPOSE_COMPOSED in those cases?
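To spell out the behavior I am assuming, here is a toy model. It is not the patch's implementation, just a self-contained state machine with a single hard-coded sequence (\<dead_acute\> \<a\>); all toy_* and TOY_* names are invented for illustration, and modifier keysyms are not modelled.

```c
#include <stdint.h>

/* Mirrors the four values of enum xkb_compose_status in the patch. */
enum toy_status { TOY_NOTHING, TOY_COMPOSING, TOY_COMPOSED, TOY_CANCELLED };

/* Local stand-ins for the keysyms used (values as in xkbcommon-keysyms.h). */
#define TOY_a          0x0061
#define TOY_b          0x0062
#define TOY_aacute     0x00e1
#define TOY_dead_acute 0xfe51

struct toy_state {
    enum toy_status status;
    uint32_t result; /* composed keysym, or 0 if there is none */
};

/* Feed one keysym. The only known sequence is <dead_acute> <a>. */
static void
toy_feed(struct toy_state *s, uint32_t keysym)
{
    if (s->status == TOY_COMPOSING) {
        /* Mid-sequence: the keysym either completes or cancels it. */
        if (keysym == TOY_a) {
            s->status = TOY_COMPOSED;
            s->result = TOY_aacute;
        } else {
            s->status = TOY_CANCELLED;
            s->result = 0;
        }
        return;
    }

    /* No sequence is active (NOTHING, or a finished COMPOSED/CANCELLED):
     * a keysym that starts no sequence leaves us in NOTHING, never
     * in COMPOSED. */
    s->result = 0;
    s->status = (keysym == TOY_dead_acute) ? TOY_COMPOSING : TOY_NOTHING;
}
```

In this model, feeding \<b\> while idle gives TOY_NOTHING, so the caller handles the keysym as usual; only \<dead_acute\> \<b\> gives TOY_CANCELLED. If the real state machine behaves differently, the documentation should say how.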
> +};
> +
> +/** The effect of a keysym fed to xkb_compose_state_feed(). */
> +enum xkb_compose_feed_result {
> + /** The keysym had no effect. */
> + XKB_COMPOSE_FEED_IGNORED,
> + /** The keysym started, advanced or cancelled a sequence. */
> + XKB_COMPOSE_FEED_ACCEPTED
> +};
> +
> +/**
> + * Feed one keysym to the Compose sequence state machine.
> + *
> + * This function advances into a compose sequence, cancels it, or has no
> + * effect (e.g. for modifier keysyms). The resulting status may be
> + * observed with xkb_compose_state_get_status().
> + *
> + * @param state
> + * The compose state object.
> + * @param keysym
> + * A keysym, usually obtained after a key-press event, with a
> + * function such as xkb_state_key_get_one_sym().
If a keypress generates multiple keysyms, are we supposed to call this
function in a loop? Or are we supposed to not feed such data into the
compose state? I remember we had this discussion before, but I'm not
sure what we agreed on. I think the conclusion was to always treat
multiple keysyms as an array of normal syms, not as an atomic
keypress, right?
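If that is indeed the conclusion, the caller side would just be a loop. A minimal sketch, with a function pointer standing in for xkb_compose_state_feed() so that it compiles on its own; feed_fn, feed_all_syms() and demo_feed() are names I made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for xkb_compose_state_feed(): returns non-zero if the keysym
 * was accepted, zero if it was ignored. */
typedef int (*feed_fn)(void *compose_state, uint32_t keysym);

/* Feed every keysym produced by one key press, in order, as if each had
 * arrived on its own. Returns how many of them were accepted. */
static size_t
feed_all_syms(void *compose_state, feed_fn feed,
              const uint32_t *syms, size_t nsyms)
{
    size_t accepted = 0;

    for (size_t i = 0; i < nsyms; i++)
        if (feed(compose_state, syms[i]))
            accepted++;

    return accepted;
}

/* Example callback for trying the loop out: the "state" is just an int
 * counting calls, and every keysym except 0 (NoSymbol) is accepted. */
static int
demo_feed(void *compose_state, uint32_t keysym)
{
    int *calls = compose_state;

    (*calls)++;
    return keysym != 0;
}
```

In real code the keysym array would come from xkb_state_key_get_syms() and the callback would be a thin wrapper around xkb_compose_state_feed().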
> + *
> + * @returns Whether the keysym had any effect on the compose state. This
> + * is useful, for example, if you want to keep a record of the current
> + * sequence, but not for much else.
> + *
> + * @memberof xkb_compose_state
> + */
> +enum xkb_compose_feed_result
> +xkb_compose_state_feed(struct xkb_compose_state *state,
> + xkb_keysym_t keysym);
> +
> +/**
> + * Reset the Compose sequence state machine.
> + *
> + * The status is set to XKB_COMPOSE_NOTHING, and the current sequence
> + * is discarded.
> + *
> + * @memberof xkb_compose_state
> + */
> +void
> +xkb_compose_state_reset(struct xkb_compose_state *state);
> +
> +/**
> + * Get the current status of the compose state machine.
> + *
> + * @see xkb_compose_status
> + * @memberof xkb_compose_state
> + **/
> +enum xkb_compose_status
> +xkb_compose_state_get_status(struct xkb_compose_state *state);
> +
> +/**
> + * Get the result Unicode/UTF-8 string for a composed sequence.
> + *
> + * See @ref compose-overview for more details. This function is only
> + * useful when the status is XKB_COMPOSE_COMPOSED.
> + *
> + * @param[in] state
> + * The compose state.
> + * @param[out] buffer
> + * A buffer to write the string into.
> + * @param[in] size
> + * Size of the buffer.
> + *
> + * @warning If the buffer passed is too small, the string is truncated
> + * (though still NUL-terminated).
> + *
> + * @returns
> + * The number of bytes required for the string, excluding the NUL byte.
> + * If the sequence is not complete, or does not have a viable result
> + * string, returns 0, and sets @p buffer to the empty string (if
> + * possible).
> + * @returns
> + * You may check if truncation has occurred by comparing the return value
> + * with the size of @p buffer, similarly to the snprintf(3) function.
> + * You may safely pass NULL and 0 to @p buffer and @p size to find the
> + * required size (without the NUL-byte).
> + *
> + * @memberof xkb_compose_state
> + **/
> +int
> +xkb_compose_state_get_utf8(struct xkb_compose_state *state,
> + char *buffer, size_t size);
> +
> +/**
> + * Get the result keysym for a composed sequence.
> + *
> + * See @ref compose-overview for more details. This function is only
> + * useful when the status is XKB_COMPOSE_COMPOSED.
> + *
> + * @returns The result keysym. If the sequence is not complete, or does
> + * not specify a result keysym, returns XKB_KEY_NoSymbol.
> + *
> + * @memberof xkb_compose_state
> + **/
> +xkb_keysym_t
> +xkb_compose_state_get_one_sym(struct xkb_compose_state *state);
Why _one_sym() and not _get_syms()? Yeah, the current format only
allows one symbol, but I don't see why we restrict the API in such
ways. I mean, the UTF-8 fallback is kinda ugly right now and we might
be able to fix it in a V2 format if we allow multiple syms.
Thanks a lot for the work, Ran!
David
> +
> +/** @} */
> +
> +#ifdef __cplusplus
> +} /* extern "C" */
> +#endif
> +
> +#endif /* _XKBCOMMON_COMPOSE_H */
> --
> 2.1.0
>
> _______________________________________________
> wayland-devel mailing list
> wayland-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/wayland-devel