[Uim] Korean input

David Oftedal david at start.no
Wed Jul 20 18:04:52 EEST 2005


>Now, I just tried it.  It is very interesting.
>
>There is a popular Korean word processor named HWP that even dominates
>over Microsoft Word in Korea, and HWP is regarded as a de facto
>standard word processor in the sense that a governmental office only
>accepts a document in HWP format.  HWP has a built-in romanized input
>method.  Since there is not such a thing as a standardized romanized
>Korean INPUT scheme, let me mention a couple of features that I find
>are good in HWP, which I wish were included in romaja.scm.
>  
>
Thank you, that's very kind :) On Linux there are two methods that I
know of; one is my romaja.scm, and the other is a table-based input
method for SCIM. So it seems that we have the lead.

>1. SPACE does not complete a syllable, but it is just inserted.  For
>   example, "b a b SPACE m eo g j a" translates to "밥 먹자."  A
>   Korean person would find that this was more natural.  However,
>   there are cases where the end of a syllable is ambiguous, and a key
>   is needed to signal the end or beginning of a syllable.  HWP does
>   not seem to have a key for this.  In my opinion, ' may be an
>   option.  Similarly, RETURN is entered when it is pressed.  In
>   general, if a key is not for a jamo, it is committed along with the
>   syllable that was being composed.
>  
>
Hmm. I've actually been unable to reproduce these problems on my system.
Both space and enter function as commit keys as long as there's anything
at all in the preedit. These issues could arise if you call romaja.scm
from loader.scm instead of hangul.scm, since all the rules for its
behavior, including the behavior of the space and enter keys, are
defined in hangul.scm.
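
To make the behaviour requested in point 1 concrete, here is a rough
sketch in Python (not uim's Scheme, and not a description of how
hangul.scm is actually written). TOY_TABLE, JAMO_KEYS, convert and feed
are all made-up names, and the two-entry table only stands in for a
real romaja table:

```python
# Toy stand-in for a real romaja table; a real one is far larger.
TOY_TABLE = {"bab": "밥", "meogja": "먹자"}

# Assumption: every lowercase Latin letter may be part of a jamo.
JAMO_KEYS = set("abcdefghijklmnopqrstuvwxyz")


def convert(preedit):
    """Look up the preedit in the toy table; pass unknown text through."""
    return TOY_TABLE.get(preedit, preedit)


def feed(keys):
    """Commit the preedit, followed by the key itself, on any non-jamo key."""
    committed = []
    preedit = ""
    for key in keys:
        if key in JAMO_KEYS:
            preedit += key  # still composing
        else:
            committed.append(convert(preedit))  # commit what was composed
            committed.append(key)               # then insert the key as typed
            preedit = ""
    committed.append(convert(preedit))  # flush whatever is left at the end
    return "".join(committed)
```

With this toy table, feed("bab meogja") returns "밥 먹자", and a
trailing RETURN is passed through in the same way.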

>2. A few different latin representations are accepted for a single
>   jamo.  For example, "ㄹ" can be entered as "r" or "l", and "ㅙ" can
>   be entered as "wae", "uae", or "oae".  I think an input method
>   should be as generous as possible provided that it does not cause
>   further ambiguities.
>  
>
This is an interesting issue which I wish I'd given more consideration.
The old version actually contained some duplicated entries to allow the
user to enter either "r" or "l", but when I generated the new romaja.scm
table, I simply figured that "ㄹ" should become "r" before any vowel and
"l" otherwise. In turn, that means that "신라" will have to be entered
as "sinra", and can't be entered as "sinla" or "shinra" or "shinla". It
will require a substantial number of extra entries to add these variants,
but I have the tools available to generate them, so it
should be relatively simple.
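
As an illustration of how such extra entries could be generated, here
is a minimal sketch in Python. It is not the tool used to build
romaja.scm; EQUIVALENTS, variants and expand_table are hypothetical
names, and the equivalence list only covers the spellings mentioned in
this thread:

```python
from itertools import product

# Assumption: interchangeable spellings, taken from this thread only.
EQUIVALENTS = {
    "si": ["si", "shi"],
    "r": ["r", "l"],
    "l": ["l", "r"],
    "wae": ["wae", "uae", "oae"],
}


def variants(romaja):
    """Return every spelling obtained by swapping in equivalent forms."""
    keys = sorted(EQUIVALENTS, key=len, reverse=True)  # longest match first
    pieces = []
    i = 0
    while i < len(romaja):
        for key in keys:
            if romaja.startswith(key, i):
                pieces.append(EQUIVALENTS[key])
                i += len(key)
                break
        else:
            pieces.append([romaja[i]])  # no equivalent: keep the letter
            i += 1
    return {"".join(combo) for combo in product(*pieces)}


def expand_table(table):
    """Add an entry for every variant spelling of every existing key."""
    expanded = {}
    for romaja, hangul in table.items():
        for spelling in variants(romaja):
            # Keep the first mapping so a variant never overrides an entry.
            expanded.setdefault(spelling, hangul)
    return expanded
```

For example, expand_table({"sinra": "신라"}) yields four entries,
"sinra", "sinla", "shinra" and "shinla", all mapping to "신라". A real
generator would also have to reject variants that collide with the
spelling of a different syllable.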

Unfortunately, though, I hardly speak a word of Korean, so I can only
add the equivalent forms that I know of, which currently amounts to
"ua" and "oa" for "wa", "shi" for "si", and "r" for "l". I'll try to
read up on it before I make the changes, though, and I'm always open to
suggestions. Thanks for your input! :)

-David


