[Uim] traditional Chinese py input?

Jon Babcock jon at kanji.com
Tue Jan 13 00:29:29 EET 2004


[This msg is in UTF-8 encoded Unicode.]

yusuke at cherubim.icw.co.jp wrote:

>>Is there any documentation on the current py input method?
> 
> Not yet, you may look at /usr/share/uim/PY.scm, but there is 
> no document yet.

Thank you for answering my questions.

Please remember that I am not a programmer, and I apologize now for the 
inevitable gaffes.

I think I may be able to help in creating an input method for all 
Unicode-supported forms of Chinese characters if:

* the programming skills needed are minimal

* a compassionate programmer(s) will list the appropriate Lisp/Scheme 
procedures and/or point me to specific, applicable sources

* extensive familiarity with kanji (in traditional and simplified (both 
current Chinese and current Japanese) forms) is the main requirement

* documentation in English and/or translated from Japanese or Chinese 
into English is needed

I'd appreciate any comments or guidance regarding the four points.


I looked at PY.scm and thought, "Great! If I add the traditional forms 
of the kanji/hanzi to this list, I will have a crude input method for 
both simplified and traditional Chinese."

So, in PY.scm I tried adding just one character, the traditional form of 
kanji no kan (Chinese han4), at the beginning of line 111 like this:

  ((("h" "a" "n")) ("汉" "喊" "含" "寒" "汗" "韩" "憾" "涵" "函" "翰"
   "撼" "罕" "旱" "捍" "酣" "悍" "憨" "晗" "瀚" "鼾" "顸" "阚" "焊" "蚶"
   "焓" "颔" "菡" "撖" "邗" "邯"))

--> ((("h" "a" "n")) ("漢" "汉" "喊" "含" "寒" "汗" "韩" "憾" "涵" "函"
     "翰" "撼" "罕" "旱" "捍" "酣" "悍" "憨" "晗" "瀚" "鼾" "顸" "阚" "焊"
     "蚶" "焓" "颔" "菡" "撖" "邗" "邯"))

It didn't work. The traditional form of kan/han4 did not appear as the 
first candidate when I tried using UIM-py (zh_CN) in, for example, 
Bluefish 0.12. (I'll explain what happens in more detail later.)

I was not able to read the Chinese in PY.scm until I changed my default 
encoding to GB-2312 in my editor. The first line of PY.scm says it is in 
GB-18030, and my version of PY.el (for Emacs 21.3.1 on i386) says the 
coding is iso-2022-7bit. I tried to use iconv to convert PY.scm from 
GB-18030 to UTF-8, but GB-18030 is not supported by iconv.
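
[One possible workaround when iconv lacks GB-18030: Python ships a 
"gb18030" codec, so a few lines of Python can do the re-encoding. This 
is only a minimal sketch. The function name reencode and the output 
file name PY-utf8.scm are hypothetical, and it assumes the file really 
is valid GB-18030.]

```python
from pathlib import Path


def reencode(src, dst, src_encoding="gb18030", dst_encoding="utf-8"):
    """Decode src from src_encoding and write the same text to dst in dst_encoding.

    Works on raw bytes so that line endings are left untouched.
    Returns the number of characters converted.
    """
    raw = Path(src).read_bytes()
    text = raw.decode(src_encoding)  # raises UnicodeDecodeError if not valid GB-18030
    Path(dst).write_bytes(text.encode(dst_encoding))
    return len(text)


# Hypothetical usage (paths are assumptions; work on a copy, not the installed file):
#   reencode("PY.scm", "PY-utf8.scm")
```

[This changes only the bytes on disk. The coding declaration on the 
first line of the file would still say GB-18030 and would have to be 
edited by hand, and whether uim will accept a UTF-8 encoded PY.scm is 
still an open question.]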

I would like to encode PY.scm in UTF-8 because the goal is to have 
access to all three (not counting other variants) versions of the 
Chinese characters when I enter a pinyin spelling. E.g., if I enter 
"han" my candidates would include 漢 for traditional Chinese [This 
traditional character should have an additional horizontal stroke just 
before the "grass" classifier on top --- I don't know whether it will 
display here] and 漢, the current Japanese version, as well as the 
simplified 汉, etc.

So my first question (of many, I'm afraid) is: Can I use a UTF-8 encoded 
version of PY.scm?

And my second question is: What must be changed besides the kanji lists 
in PY.scm?

Thanks.


Jon

-- 
Jon Babcock <jonatkanji.com>



