[Uim] Korean input

David Oftedal david at start.no
Sun Jul 24 21:49:14 EEST 2005


Park Jae-hyeon wrote:

> My notation was inconsistent. I should have said that HWP translated
>
>"b a b SPACE m eo g j a" to "밥 SPACE 먹 자".  romaja.scm translates
>"b a b SPACE m eo g j a" to "밥 먹 자".  Here, ignore all blanks
>inside double quotes.  romaja.scm regards the SPACE as the commit key,
>and it commits "밥", but it does not commit SPACE itself.  Therefore,
>using romaja.scm, one should press SPACE twice after "b a b" to insert
>a space after "밥", while one presses SPACE only once with HWP.  This
>means that SPACE, RETURN, ESC, or any key that cannot be interpreted
>as a Korean jamo, commits itself as well as the preedit string.  HWP
>works in this way because it is the usual way a Korean keyboard (such
>as 2-beolsik) works.  Although HWP's behavior appears natural to
>Koreans, romaja.scm's behavior may look more natural to a foreigner
>who, for example, is familiar with a Japanese input method.  IMHO, an
>explicit commit key for Korean input is redundant.  The reason for
>having an explicit commit key for a Japanese input method, I think, is
>that it has to perform kana-kanji conversion, which is rarely needed
>for Korean.
>  
>
However, an explicit commit key isn't redundant for this input method,
since it doesn't distinguish between initials and finals. For instance,
the sequence "lalala" can become either "라라라" or "랄알아" depending on
where the user presses the commit key. Or what if the user wants to type
"감샇압니다" instead of "감사합니다"? The commit key doesn't have to be
space, of course, but it has to be a key that the user can press fairly
often without putting undue stress on the fingers.
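
To make the ambiguity concrete, here is a minimal Python sketch. It is not
how romaja.scm is implemented; the "|" commit marker, the tiny letter
tables, and the names compose() and type_keys() are all assumptions made
for illustration. It only shows that the same letters yield different
syllables depending on where the commit key falls.

```python
import re

# Jamo in Unicode syllable-composition order.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = " ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"  # index 0 = no final

# Hypothetical, deliberately tiny romanization tables (enough for the examples).
CONSONANTS = {"g": "ㄱ", "n": "ㄴ", "d": "ㄷ", "l": "ㄹ",
              "m": "ㅁ", "b": "ㅂ", "s": "ㅅ", "h": "ㅎ"}
VOWELS = {"a": "ㅏ", "i": "ㅣ"}

# One committed chunk = optional initial + vowel + optional final.
CHUNK = re.compile(r"([gndlmbsh]?)([ai])([gndlmbsh]?)")


def compose(chunk):
    """Turn one committed chunk of Latin letters into one Hangul syllable."""
    m = CHUNK.fullmatch(chunk)
    if m is None:
        raise ValueError("not a single syllable: %r" % chunk)
    ini, med, fin = m.groups()
    # A chunk without a leading consonant gets the silent initial ㅇ.
    i = INITIALS.index(CONSONANTS[ini] if ini else "ㅇ")
    v = MEDIALS.index(VOWELS[med])
    f = FINALS.index(CONSONANTS[fin]) if fin else 0
    return chr(0xAC00 + (i * 21 + v) * 28 + f)


def type_keys(keys, commit="|"):
    """Compose each chunk between commit keys; the commit key itself emits nothing."""
    return "".join(compose(c) for c in keys.split(commit) if c)


print(type_keys("la|la|la"))           # 라라라
print(type_keys("lal|al|a"))           # 랄알아
print(type_keys("gam|sa|hab|ni|da"))   # 감사합니다
print(type_keys("gam|sah|ab|ni|da"))   # 감샇압니다
```

Without some boundary marker, "lalala" alone does not say which of the
first two results is wanted.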

> FYI, let me enumerate the latin letter assignments used in HWP.
>
>ㄱ  g
>ㄲ  gg, kk, qq, c (not cc)
>ㄴ  n
>ㄷ  d
>ㄸ  dd, tt
>ㄹ  r, l
>ㅁ  m
>ㅂ  b, v
>ㅃ  bb, pp, ff, vv
>ㅅ  s
>ㅆ  ss
>ㅇ  (optional) x
>ㅈ  j, z
>ㅉ  jj, zz
>ㅊ  ch
>ㅋ  k, q
>ㅌ  t
>ㅍ  p, f
>ㅎ  h
>
>ㅏ  a
>ㅐ  ae
>ㅑ  ya, ia
>ㅒ  yae, iae
>ㅓ  eo
>ㅔ  e
>ㅕ  yeo, ieo
>ㅖ  ye, ie
>ㅗ  o
>ㅘ  wa, ua, oa
>ㅙ  wae, uae, oae
>ㅚ  woe, uoe, oi
>ㅛ  yo, io
>ㅜ  u, w, oo
>ㅝ  wo, uo
>ㅞ  we, ue
>ㅟ  wi
>ㅠ  yu, iu
>ㅡ  eu
>ㅢ  ui, eui
>ㅣ  i, y, ee
>  
>
Wow. It might be difficult to include all of these, since each
additional one requires around 884 extra lines in the table file. I've
made a script to handle the r/l issue, though, so that will be fixed in
the next version, and I'll try to add some of the others. I'm a bit
puzzled by ㅢ and ㅟ, though: wouldn't it be more correct to map "ui" to
ㅟ and "eui" to ㅢ?
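
As a rough illustration of why each alternative spelling multiplies the
table, here is a small Python sketch of generating the alias rows instead
of writing them by hand. The table layout (tuples of key tokens mapped to
a syllable) and the name expand_aliases() are assumptions for this
example; the real table file format and the actual script are not shown
here.

```python
from itertools import product


def expand_aliases(table, aliases):
    """Return a new table with one row per alias spelling of every original row.

    table:   dict mapping a tuple of key tokens to the resulting syllable
    aliases: dict mapping a canonical token to a tuple of alternative tokens
    """
    expanded = {}
    for tokens, output in table.items():
        # For each position, the canonical token plus any aliases it has.
        choices = [(tok,) + tuple(aliases.get(tok, ())) for tok in tokens]
        for variant in product(*choices):
            expanded[variant] = output
    return expanded


# Hypothetical four-row excerpt of a table that only knows "r" for ㄹ.
base = {
    ("r", "a"): "라",
    ("r", "a", "r"): "랄",
    ("a", "r"): "알",
    ("a",): "아",
}

full = expand_aliases(base, {"r": ("l",)})
print(len(base), len(full))   # 4 9
print(full[("l", "a", "l")])  # 랄
```

Every row that contains the aliased letter is duplicated once per
position where it occurs, so even a single alias such as r/l grows the
table quickly.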

-David


