[Uim] uim-py: Adding idioms to PY.scm
Yukiko Bando
ybando at k6.dion.ne.jp
Tue Apr 6 16:40:52 EEST 2004
On Sun, 04 Apr 2004 11:07:33 -0600, Jon Babcock <jon at kanji.com> wrote:
> You may want to add tsi.src which contains about 130,000 entries. The
> first step is to convert the Zhuyin entries to Pinyin and then to
> eliminate duplicates with your CEDICT file. I plan to do this
> eventually, but would be *delighted* if I didn't have to. <g> And then
> the file must be converted to UTF-8 Unicode, I assume.
Traditional Chinese characters look more familiar to me than simplified
ones, but it seems impossible to convert Zhuyin to Pinyin properly without
a good knowledge of the language. I have only just started learning
standard Chinese and would like to keep my PY.scm pure Mandarin to avoid
confusion. ;-) I would rather improve it by rearranging the words that
appear in candidate lists and by adding missing words one by one. Sorry if
I have disappointed you... But if you send me a list of pinyin and
corresponding words in a spreadsheet, I think I can convert it to
TSI_PY.scm using my tool, though that is probably the easiest part of your
project.
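
To give a rough idea of what that last step involves, here is a minimal
sketch in Python. It is not the tool mentioned above. It assumes the
spreadsheet is exported as tab-separated text with the pinyin in the first
column and the word in the second. The output layout and the name
tsi-py-rule are hypothetical and would have to be adjusted to whatever
PY.scm actually uses.

```python
import csv
import io
from collections import OrderedDict


def group_candidates(rows):
    """Group (pinyin, word) pairs by pinyin, keeping first-seen order
    and dropping duplicate words listed under the same reading."""
    table = OrderedDict()
    for pinyin, word in rows:
        pinyin, word = pinyin.strip().lower(), word.strip()
        if not pinyin or not word:
            continue  # skip incomplete rows
        words = table.setdefault(pinyin, [])
        if word not in words:
            words.append(word)
    return table


def to_scheme(table, name="tsi-py-rule"):
    """Render the table as a quoted Scheme list.
    Assumption: this layout is illustrative, not the real PY.scm format."""
    lines = ["(define %s '(" % name]
    for pinyin, words in table.items():
        quoted = " ".join('"%s"' % w for w in words)
        lines.append('  (("%s") (%s))' % (pinyin, quoted))
    lines.append("))")
    return "\n".join(lines)


def convert(tsv_text):
    """Convert tab-separated 'pinyin<TAB>word' text to the Scheme sketch."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(r[0], r[1]) for r in reader if len(r) >= 2]
    return to_scheme(group_candidates(rows))


if __name__ == "__main__":
    sample = "ni hao\t你好\nzhong guo\t中國\nni hao\t你好\nni hao\t妳好\n"
    print(convert(sample))
```

Words keep the order in which they first appear in the spreadsheet, so the
order of the candidate list can be controlled simply by reordering rows.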
> Nevertheless, a professional C-E translator encounters many words that
> are not included in any of the above 7 or 8 major dictionaries of
> Chinese words. (Just check the daily traffic on the fanyi mailing list
> for an endless stream of examples.)
The same is true for Japanese. I often encounter unfamiliar words,
especially on the internet. For better or worse, people continue to coin
new words.
Yukiko