[Uim] [Docs] Terminology problems

Martin Swift martin at swift.is
Mon Feb 5 18:06:34 EET 2007


Hi Jeroen,

Thanks for your swift reply,

On Mon, Feb 05, 2007 at 03:51:03PM +0100, Jeroen Ruigrok/asmodai wrote:
> -On [20070205 15:17], Martin Swift (martin at swift.is) wrote:
> >1. "Conversion engine", "input module" and "input method"
> 
> An input method is typically a term used to describe the entirety of the
> program(s) that do input for a specific language. So you can have a Korean
> input method, a Japanese input method, et cetera.
> 
> A conversion engine is the inside of an input method, specifically the part
> that takes your written text, ro-maji to kana to kanji + kana in Japanese
> case, ro-maji to jamo to hangul in Korean's case, ro-maji to pinyin to hanzi
> in Chinese case, and converts it in the probable characters that you wanted,
> depending on context.
> 
> I think input module is a hook-in module for a specific input method.

From what I've read, it seems that Anthy, Canna and PRIME are all
input methods in their own right, but are used as conversion engines
or input modules by uim. Is this correct? Does uim use modified
versions of these IMs or just use their internals? Which would be the
correct term for whatever it is that uim uses?

There is also the term "dictionary" that seems to float around. Is the
dictionary the "input module"/"conversion engine" itself, just a part
of it, or does this vary? Are dictionaries standardized or does every
"input module"/"conversion engine" implement one differently?

> >2. Input levels (no idea what your term for this is)
> >
> >It seems that uim has three "levels" of input.
> > 1) Firstly, there is the thing mentioned above (direct/anthy/skk/...
> >etc.).
> > 2) Secondly there is the preedit text (in the case of Anthy, this
> >seems to be romanji (latin), hiragana, katakana, half-width katakana,
> >half-width romanji (though I'm not quite sure what the use of this
> >is), and full width romanji).
> 
> It's essentially part of the 'workflow' for any input method. You have the raw
> text, say:

I think I may have been a little unclear on this. It's not the
workflow that I'm asking about, but the different types of input.

Launch one of the toolbars and in it (at least with Anthy -- I guess I
need to install more for testing purposes) you will find three types
of options:

 1) One chooses the "input module"/"conversion engine"/"dictionary".
 2) Another chooses the characters displayed in the preedit.
 3) The third chooses the keyboard layout. ローマ字 and AZIK seem to be
whatever X sets up (I can't find out what the difference is; AZIK seems
to be a qwerty-compatible Japanese extension) while かな is the
たていすかん layout.
 
This, of course, is only taken from Anthy, so these features may not be
shared by other "input modules"/"conversion engines"/"dictionaries".
I'm interested in learning which of these are general uim features and
which depend on the ".."/".."/".." (sorry, typing this gets a little
boring).

I'm hoping to tackle the usage section of the docs soon and knowing
what to explain where would be a great asset.

Again, thanks for the input so far,
Martin

-- 
✌


