[SCIM] Re: From a newbie: problems in running scim

Mike FABIAN mfabian at suse.de
Sat Jan 1 12:21:35 PST 2005


"Rodolfo Medina" <romeomedina at libero.it> wrote:

> Mike Fabian wrote:
>
>>Probably you have used iso-8859-1 to encode the file names.
>>Convert all those file names to UTF-8 if you want to switch to a
>>UTF-8 locale. You can use convmv for that:
>>
>>    http://j3e.de/linux/convmv/
>
>
> Thanks, Mike:
>
> I downloaded and installed the package convmv-1.08.tar.gz,
> but don't know how to use the convmv command because I don't know the 'from'
> encoding:
> in fact, to convert the file name 'cittàdellascienza' to UTF-8 I try with:
>
> 	$ convmv -t UTF-8 cittàdellascienza
>
> , but it says: 'wrong/unknown "from" encoding!'.
> The file name isn't actually 'cittàdellascienza', because the 'à' is not
> displayed;

Your file names are most likely ISO-8859-1 encoded, because that was
traditionally the encoding used for Italian. That means

       $ convmv -f ISO-8859-1 -t UTF-8 *

should work (I used the wildcard '*' to match all files in the current
directory). By default convmv does not rename anything; it only
displays what it would do. If the result looks OK and you really
want to do the conversion, add the option "--notest".

       $ convmv -f ISO-8859-1 -t UTF-8 --notest *
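
If you are unsure which file names still need converting, a rough check
is to ask iconv whether each name is already valid UTF-8. This is only
a sketch: "is_utf8" is a helper name made up for this example, and a
name that fails the check is not necessarily ISO-8859-1, only "not
UTF-8".

```shell
# is_utf8 NAME: succeed if NAME is a valid UTF-8 byte sequence.
# iconv exits with a non-zero status when it meets invalid input.
is_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Report every name in the current directory that is not valid UTF-8.
for f in *; do
    is_utf8 "$f" || printf 'not valid UTF-8: %s\n' "$f"
done
```

Pure ASCII names pass the check as well, since ASCII is a subset of
UTF-8; such names do not need renaming.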

> and I manage to write it with copy and paste.
> Another problem is that inside files, too, the 'à', 'ì' etc. are not always
> displayed.

For converting the contents of files instead of file names you can use
"iconv" or "recode".
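
With iconv this could look roughly like the following. It is a sketch
that assumes the file really is ISO-8859-1; "latin1_to_utf8" and the
file names are made up for the example. It writes a new file instead of
overwriting the original, so you can check the result first.

```shell
# latin1_to_utf8 SRC DST: write a UTF-8 copy of an ISO-8859-1 file.
# The original is left untouched so the result can be checked first.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# Example with a throwaway file: 'citt' followed by byte 0xE0
# (which is 'à' in ISO-8859-1).
tmpdir=$(mktemp -d)
printf 'citt\340\n' > "$tmpdir/latin1.txt"
latin1_to_utf8 "$tmpdir/latin1.txt" "$tmpdir/utf8.txt"
od -An -tx1 "$tmpdir/utf8.txt"   # 'à' should now be the two bytes c3 a0
rm -r "$tmpdir"
```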

> Rodolfo:
>>> besides, some characters in man pages are messed up.
>
> Mike:
>>I don't know how that works in Mandrake. In SuSE Linux manpages work
>>fine in UTF-8, therefore I guess it will be OK in Mandrake as well.
>>Maybe you have inconsistent locale settings? I.e. maybe you have some
>>other locale variables set to non-UTF-8 locales? What is the output
>>of the command "locale"?
>
> That's it:
>
> [rodolfo at localhost rodolfo]$ locale
> LANG=POSIX
> LC_CTYPE=en_US.UTF-8
> LC_NUMERIC="POSIX"
> LC_TIME="POSIX"
> LC_COLLATE="POSIX"
> LC_MONETARY="POSIX"
> LC_MESSAGES="POSIX"
> LC_PAPER="POSIX"
> LC_NAME="POSIX"
> LC_ADDRESS="POSIX"
> LC_TELEPHONE="POSIX"
> LC_MEASUREMENT="POSIX"
> LC_IDENTIFICATION="POSIX"
> LC_ALL=
>
> , but I remember it wasn't like this at the beginning, when I installed
> Mandrake,
> it was rather with 'en_US' or maybe 'en_GB' everywhere;
> it became so after scim installation and configuration trials.
> Any other hint?

One should avoid mixing different encodings in the LC_* variables;
your output mixes POSIX (plain ASCII) with UTF-8.
Try to use something like

    export LANG=en_US.UTF-8
    export LC_PAPER=en_GB.UTF-8  # for DIN A4 paper

plus maybe

    export LC_MESSAGES=it_IT.UTF-8

if you want to see Italian messages. If you set only these variables,
"locale" should print:

    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES=it_IT.UTF-8
    LC_PAPER=en_GB.UTF-8
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=
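
To see at a glance whether the settings are mixed, you could reduce the
"locale" output to its distinct values. This is a sketch only;
"distinct_locales" is a helper name made up for this example.

```shell
# distinct_locales: read "locale" output on stdin and print each
# distinct value once (variable names, quotes and empty values removed).
distinct_locales() {
    sed -e 's/^[^=]*=//' -e 's/"//g' | grep -v '^$' | LC_ALL=C sort -u
}

locale | distinct_locales
```

For the output you posted this would print "POSIX" and "en_US.UTF-8".
With the settings suggested above, every remaining line should end in
".UTF-8".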

See also:

    http://www.suse.de/~mfabian/suse-cjk/locales.html

-- 
Mike FABIAN   <mfabian at suse.de>   http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.

