[Uim] traditional Chinese py input?
Jon Babcock
jon at kanji.com
Tue Jan 13 00:29:29 EET 2004
[This msg is in UTF-8 encoded Unicode.]
yusuke at cherubim.icw.co.jp wrote:
>>Is there any documentation on the current py input method?
>
> Not yet, you may look at /usr/share/uim/PY.scm, but there is
> no document yet.
Thank you for answering my questions.
Please remember that I am not a programmer, and I apologize now for the
inevitable gaffes.
I think I may be able to help in creating an input method for all
Unicode-supported forms of Chinese characters if:
* the programming skills needed are minimal
* a compassionate programmer(s) will list the appropriate Lisp/Scheme
procedures and/or point me to specific, applicable sources
* extensive familiarity with kanji (in traditional and simplified (both
current Chinese and current Japanese) forms) is the main requirement
* documentation in English and/or translated from Japanese or Chinese
into English is needed
I'd appreciate any comments or guidance regarding the four points.
I looked at PY.scm and thought, "Great! If I add the traditional forms
of the kanji/hanzi to this list, I will have a crude input method for
both simplified and traditional Chinese."
So, in PY.scm I tried adding just one character, the traditional form of
kanji no kan (Chinese han4), at the beginning of line 111 like this:
((("h" "a" "n")) ("汉" "喊" "含" "寒" "汗" "韩" "憾" "涵" "函" "翰"
"撼" "罕" "旱" "捍" "酣" "悍" "憨" "晗" "瀚" "鼾" "顸" "阚" "焊" "蚶"
"焓" "颔" "菡" "撖" "邗" "邯"))
--> ((("h" "a" "n")) ("漢" "汉" "喊" "含" "寒" "汗" "韩" "憾" "涵" "函"
"翰" "撼" "罕" "旱" "捍" "酣" "悍" "憨" "晗" "瀚" "鼾" "顸" "阚" "焊"
"蚶" "焓" "颔" "菡" "撖" "邗" "邯"))
It didn't work. The traditional form of kan/han4 did not appear as the
first candidate when I tried using UIM-py (zh_CN) in, for example,
Bluefish 0.12. (I'll explain what happens in more detail later.)
I was not able to read the Chinese in PY.scm until I changed my default
encoding to GB-2312 in my editor. The first line of PY.scm says it is in
GB-18030, and my version of PY.el (for Emacs 21.3.1 on i386) says the
coding is iso-2022-7bit. I tried to use iconv to convert PY.scm from
GB-18030 to UTF-8, but GB-18030 is not supported by iconv.
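
For anyone whose iconv also lacks GB-18030, the conversion can be sketched in a few lines of Python, which ships a "gb18030" codec. This is only a minimal sketch: the function names and the output file name are hypothetical, and it does not settle whether uim will accept a UTF-8 PY.scm. If the first line of the file declares its encoding, that declaration would presumably need to be updated by hand as well.

```python
from pathlib import Path


def convert_bytes(data: bytes, src_encoding: str = "gb18030",
                  dst_encoding: str = "utf-8") -> bytes:
    """Decode bytes from src_encoding and re-encode them as dst_encoding."""
    # decode() raises UnicodeDecodeError if the input is not valid
    # src_encoding, so a wrongly guessed encoding fails loudly.
    return data.decode(src_encoding).encode(dst_encoding)


def convert_file(src: str, dst: str) -> None:
    """Write a UTF-8 copy of src to dst, leaving the original untouched."""
    Path(dst).write_bytes(convert_bytes(Path(src).read_bytes()))


# Hypothetical usage (the output name is an assumption):
# convert_file("/usr/share/uim/PY.scm", "PY-utf8.scm")
```

Writing to a separate file keeps the original PY.scm available in case the converted copy does not load.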
I would like to encode PY.scm in UTF-8 because the goal is to have
access to all three (not counting other variants) versions of the
Chinese characters when I enter a pinyin spelling. E.g., if I enter
"han" my candidates would include 漢 for traditional Chinese [This
traditional character should have an additional horizontal stroke just
before the "grass" classifier on top --- I don't know whether it will
display here] and 漢, the current Japanese version, as well as the
simplified 汉, etc.
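
One way to check whether two forms are really separate characters in Unicode, rather than the same character drawn differently by a font, is to print their code points. The following is a minimal sketch in Python; the function name is hypothetical. It is shown with traditional 漢 (U+6F22) and simplified 汉 (U+6C49). Forms that differ only by font or regional glyph may share a single code point, in which case the same value is reported for both.

```python
import unicodedata


def describe(chars: str) -> list:
    """Return one 'U+XXXX NAME' string per character in chars."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in chars
    ]


# U+6F22 is traditional 漢, U+6C49 is simplified 汉.
for line in describe("\u6f22\u6c49"):
    print(line)
```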
So my first question (of many, I'm afraid) is: Can I use a UTF-8 encoded
version of PY.scm?
My second question is: what must be changed besides the kanji lists
in PY.scm?
Thanks.
Jon
--
Jon Babcock <jonatkanji.com>