[SCIM] Unsupported languages on Linux and SCIM
David
david at plm11.pl
Mon Sep 6 04:01:51 PDT 2004
>I'm one of the developers of m17n-lib. At least the vi-viqr
>input method is working well with an example program "medit"
>(a simple Unicode editor included in the distribution of
>m17n-lib).
>
>
>
The problem turned out to be elsewhere :-)
>It is fairly easy to add an input method for the m17n
>database (m17n-db) which is used by scim-m17n.
>
Thank you for your explanations - they are most useful to me. I would
like to ask you one more question, which is still not quite clear to me.
It is not directly SCIM or IME related, but affects the way some IME
problems can be solved.
Is Linux software capable of directly displaying Unicode characters
outside the Basic Multilingual Plane (plane 0), i.e. in the supplementary
planes? I did a test in a console by running
$ perl -e 'print "\x{1200B}\n"'
and I got one square (suggesting one character to me). But if I redirect
the character to a file and open it with KDE editors or OpenOffice, they
display two squares. I understand that those applications did not read
the Unicode string correctly and split the code into two characters. To
me this means that those applications will not support characters encoded
outside the BMP.
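The one-square versus two-squares difference is consistent with how a
code point above U+FFFF is encoded: four bytes in UTF-8, or a surrogate
pair of two 16-bit units in UTF-16. A minimal Python sketch of that
arithmetic follows; the helper name describe is my own and not part of
any library, and it does not show what KDE or OpenOffice actually do
internally.

```python
def describe(cp: int) -> dict:
    """Report the plane and the UTF-8 / UTF-16 forms of one code point."""
    ch = chr(cp)
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    # Split the UTF-16 bytes into 16-bit code units.
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return {
        "plane": cp >> 16,  # plane 0 is the BMP, plane 1 the SMP
        "utf8_bytes": " ".join(f"{b:02x}" for b in utf8),
        "utf16_units": [f"{u:04X}" for u in units],
    }


info = describe(0x1200B)
print(info["plane"])        # 1
print(info["utf8_bytes"])   # f0 92 80 8b
print(info["utf16_units"])  # ['D808', 'DC0B']
```

An application that treats each UTF-16 code unit as a separate character
would see two units here, which is one possible explanation for the two
squares.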
Please note that I may be totally wrong, as I do not know much about
Linux yet. I am just looking for a way to answer my question and this is
the way I tried. I just did not find a webpage on the Internet explaining
these things directly.
A few words that should shed some light on what I want to achieve.
As a subject for an exercise in SCIM programming I chose an
implementation of a cuneiform input method. It is perfect because IMO it
is somewhere in the middle of the difficulty scale from the technical
point of view (I think Egyptian hieroglyphics is the most challenging
:-) ). It is also fun and may become useful for scholars.
To achieve the goal I first have to add Linux support for the script - I
need to add language code stuff etc. How do I do that?
Next I need to create a font. Unicode places cuneiform outside the BMP,
in the Supplementary Multilingual Plane (plane 1). Can displaying such
characters be achieved in Linux applications? How?
If it is not possible, I will shift the encoding to the Private Use Area,
which I guess is supported (it is on Windows :-) ).
Then I need SCIM input. The input of cuneiform will be similar to
Chinese pinyin: the user will choose the language (Akkadian, Hittite,
Sumerian) and input the pronunciation, then choose the right candidate
from the list. If there is only one candidate, it will be input to the
application directly (the way Japanese kana input works). Ideally there
should be some switches in the input method for a given language
(Akkadian, for example) to set the "time" (I lack the correct term) of
the text - in different "times" the pronunciations of cuneiform
characters changed, so setting it correctly can dramatically speed up
input (at least this is what I was told by someone for whom I developed
such a solution in MS Word macros).
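The lookup described above can be sketched roughly as follows. This is
only an illustration in Python, not SCIM code: the table contents, the
period names and the sign labels (SIGN_A etc.) are made-up placeholders,
and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (language, period, reading) -> candidate signs. All entries are placeholders.
TABLE: Dict[Tuple[str, str, str], List[str]] = {
    ("akkadian", "period-1", "an"): ["SIGN_A"],
    ("akkadian", "period-1", "du"): ["SIGN_B", "SIGN_C"],
    ("akkadian", "period-2", "du"): ["SIGN_C"],
}


def candidates(language: str, period: str, reading: str) -> List[str]:
    """Return the signs that can be read this way in the chosen language and period."""
    return TABLE.get((language, period, reading.lower()), [])


def process(language: str, period: str, reading: str) -> Tuple[Optional[str], List[str]]:
    """Commit directly when exactly one candidate exists, else return the list to choose from."""
    found = candidates(language, period, reading)
    if len(found) == 1:
        return found[0], []
    return None, found


print(process("akkadian", "period-1", "an"))  # single candidate, committed at once
print(process("akkadian", "period-1", "du"))  # two candidates, user must choose
print(process("akkadian", "period-2", "du"))  # the period switch narrows the list to one
```

Keying the table on the period is what lets the "time" switch shorten
the candidate list, so more readings are committed without a selection
step.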
In addition to cuneiform input I would like to add an input method for
the transcription of Middle East languages. It could be done by simply
remapping the keyboard, but I do not want the user to learn strange
keyboard layouts. I think the VIQR approach is ideal for this purpose.
Users will not input huge amounts of text in transcription, so the
inefficiency of the method is not a problem. This is why I want to
understand how VIQR works.
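As I understand it, the VIQR idea is that a base letter followed by an
ASCII mark is turned into one accented letter. A minimal sketch of the
same idea for transcription, in Python: the key sequences in RULES are a
hypothetical scheme invented for this example, not VIQR itself and not
an existing m17n or SCIM table.

```python
# Hypothetical ASCII sequences -> transcription letters.
RULES = {
    "s^": "\u0161",  # s with caron
    "s.": "\u1e63",  # s with dot below
    "t.": "\u1e6d",  # t with dot below
    "h(": "\u1e2b",  # h with breve below
    "a-": "\u0101",  # a with macron
}


def transliterate(text: str) -> str:
    """Replace known sequences, trying the longest match first at each position."""
    longest = max(len(key) for key in RULES)
    out = []
    i = 0
    while i < len(text):
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("s^arru"))
print(transliterate("t.a-bu"))
```

A real input method would also need an escape sequence so that a literal
"s." can still be typed; that is left out here.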
Yesterday evening I read the SCIM header files and the introduction to
the SCIM architecture provided by James in the "design.zh_CN" document. I
have a general picture of how to achieve my goals in SCIM (of course
I will surely have problems compiling and testing the code - I have not
programmed in C++ so far, and the makefile framework used in SCIM is
absolutely new to me, though it looks standard for Linux folks :-( ).
However, I have no idea yet how to correctly add a locale for the language
"cuneiform akkadian" in Linux :-), or whether I will have the right tools
to prepare the fonts etc.
Your comments and criticism of my approach would be most interesting and
helpful for me.
Best regards,
David