[Uim] [Docs] Terminology problems

Jeroen Ruigrok/asmodai asmodai at in-nomine.org
Mon Feb 5 16:51:03 EET 2007

-On [20070205 15:17], Martin Swift (martin at swift.is) wrote:
>1. "Conversion engine", "input module" and "input method"
>These (and possibly others as well) seem to be used interchangeably,
>though it may simply be a misunderstanding on my part. Specifically,
>I've seen different documents on uim-fep use each of these terms for
>'Anthy'. This is very confusing.
>My current best understanding from reading the available docs is that
>each of these refers to the ruleset used to do the conversion and, in
>the case of ambiguity (arising from homophones), supply candidates to
>libuim to display for the user to choose from.

An input method is typically a term used to describe the entirety of the
program(s) that do input for a specific language. So you can have a Korean
input method, a Japanese input method, et cetera.

A conversion engine is the inner part of an input method, specifically the
part that takes your typed text (ro-maji to kana to kanji + kana in the
Japanese case, ro-maji to jamo to hangul in the Korean case, ro-maji to pinyin
to hanzi in the Chinese case) and converts it into the characters that you
most likely wanted, depending on context.
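
As a rough illustration of that last step only, here is a toy sketch in
Python. The dictionary and the convert() function are made up for this
example; a real engine such as Anthy uses far larger dictionaries and looks
at context, which this does not attempt.

```python
# Toy "conversion engine": maps a kana reading to candidate spellings.
# CANDIDATES is a hypothetical, hand-made sample, not data from any real engine.
CANDIDATES = {
    "かんじ": ["漢字", "感じ", "幹事"],  # homophones, all read "kanji"
    "はし": ["橋", "箸", "端"],          # homophones, all read "hashi"
}


def convert(reading):
    """Return the candidates for a kana reading.

    The unconverted reading is always appended as a last resort, so the
    caller always has at least one candidate to show.
    """
    return CANDIDATES.get(reading, []) + [reading]


if __name__ == "__main__":
    # The front end would display this list for the user to choose from.
    print(convert("かんじ"))
```

The point is just the division of labour: the engine supplies the list of
candidates, and something else displays it and lets the user pick.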

I think an input module is a hook-in module for a specific input method.

>2. Input levels (no idea what your term for this is)
>It seems that uim has three "levels" of input.
> 1) Firstly, there is the thing mentioned above (direct/anthy/skk/...).
> 2) Secondly there is the preedit text (in the case of Anthy, this
>seems to be romaji (latin), hiragana, katakana, half-width katakana,
>half-width romaji (though I'm not quite sure what the use of this
>is), and full width romaji).

It's essentially part of the 'workflow' for any input method. You have the raw
text, say:


Every time the input matches a kana it gets transliterated/replaced, so you
will get:


Pressing space (or some other trigger key) will switch to the conversion mode.
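
To make the keystroke-by-keystroke replacement concrete, here is a minimal
Python sketch. The table and the type_keys() function are invented for this
illustration and cover only a handful of syllables; it is not how uim or Anthy
implement it, and it ignores harder cases such as "n" and doubled consonants.

```python
# Hypothetical, deliberately tiny ro-maji to hiragana table.
ROMAJI_TO_KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "ni": "に", "ho": "ほ",
}


def type_keys(keys):
    """Feed keys one at a time and return the preedit text after each key."""
    done = ""     # kana produced so far
    pending = ""  # latin letters that do not match a kana yet
    history = []
    for key in keys:
        pending += key
        if pending in ROMAJI_TO_KANA:
            # The pending letters now match a kana: replace them.
            done += ROMAJI_TO_KANA[pending]
            pending = ""
        history.append(done + pending)
    return history


if __name__ == "__main__":
    # Shows the preedit after each of the keys k, a, o.
    print(type_keys("kao"))
```

The preedit is simply whatever has been turned into kana so far plus any
letters still waiting for a match; the trigger key would then hand the kana
over to the conversion step.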


> 3) Finally, there seems to be a choice between keyboard layouts.
>There are three under Anthy: ローマ字, かな and AZIK.

Those are read as ro-maji and kana; I am not sure what AZIK is.
In general people use the ro-maji one if they have, say, a US (international)
keyboard, and the kana one if they have a keyboard with keys (sub-)marked with
hiragana symbols. AZIK I would not know.

Of course, this is all based upon my own experiences and reading into the
subject at hand.

