[UTF-8] Aspell and UTF-8/Unicode
Kevin Atkinson
kevin@atkinson.dhs.org
Sun, 15 Feb 2004 10:56:10 -0500 (EST)
On Sun, 15 Feb 2004, Elias Martenson wrote:
First off, Aspell can only spell check languages which have a phonetic
alphabet and whose words are visually easy to separate.
> > Can Japanese be spell checked in the traditional fashion, or at all?
>
> Yes it can. I don't speak Japanese myself. The traditional way would use
> the Hiragana and Katakana which are two phonetic alphabets. I am unsure
> about Katakana. I suppose even Katakana has many multi-letter words just
> like Chinese does.
>
> However, there are other interesting languages. As an example Ethiopian
> resides in Unicode at U+1200 to U+137F. They do not fit inside one byte.
>
> Would performance really be such a problem with full Unicode support? I
> realise some algorithms would have to be redesigned, but wouldn't it be
> worth it to enjoy the greater flexibility?
Well, if implemented carefully, probably not. But it will be far from a
trivial task. And I just don't have the time. Almost all languages that
can be spell checked fit inside an 8-bit character set.
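To make the 8-bit point concrete, here is a minimal Python sketch (not
Aspell code; the helper name fits_in_8bit and the choice of ISO-8859-1 are
assumptions made for illustration). It checks whether a word can be encoded
in a single-byte character set, and shows that characters from the
U+1200 to U+137F range mentioned above cannot:

```python
def fits_in_8bit(word, charset="iso-8859-1"):
    """Return True if every character of word can be encoded in the
    given single-byte character set (hypothetical helper)."""
    try:
        word.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# A Latin-script word with an accented letter (U+00E9).
latin_word = "caf\u00e9"

# Three code points taken from the U+1200 to U+137F block.
ethiopic_word = "\u1230\u120b\u121d"

print(fits_in_8bit(latin_word))
print(fits_in_8bit(ethiopic_word))

# In UTF-8 each code point in this range takes three bytes.
print(len(ethiopic_word.encode("utf-8")))
```

A checker built around one byte per character would need either a wider
internal character type or a language-specific 8-bit mapping to handle the
second word.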
If you care to educate me on the specifics of a language that does not fit
in an 8-bit character set and how it can be spell checked, I am all ears.
--
http://kevin.atkinson.dhs.org