[UTF-8] Aspell and UTF-8/Unicode

Kevin Atkinson kevin@atkinson.dhs.org
Sun, 15 Feb 2004 10:56:10 -0500 (EST)


On Sun, 15 Feb 2004, Elias Martenson wrote:

First off, Aspell can only spell check languages which have a phonetic
alphabet and whose words are visually easy to separate.

> > Can Japanese be spell checked in the traditional fashion, or at all?
> 
> Yes it can. I don't speak Japanese myself, but the traditional way would use
> the Hiragana and Katakana which are two phonetic alphabets. I am unsure
> about Katakana. I suppose even Katakana has many multi-letter words just
> like Chinese does.
> 
> However, there are other interesting languages. As an example, Ethiopian
> resides in Unicode at U+1200 to U+137F. Those code points do not fit
> inside one byte.
> 
> Would performance really be such a problem with full Unicode support? I
> realise some algorithms would have to be redesigned, but wouldn't it be
> worth it to enjoy the greater flexibility?

Well, if implemented carefully, probably not.  But it will be far from a
trivial task.  And I just don't have the time.  Almost all languages that
can be spell checked fit inside an 8-bit character set.
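
To make the 8-bit point concrete, here is a rough Python sketch (not Aspell
code; the helper names fits_in_8bit and utf8_width are made up for this
illustration, and ISO-8859-1 is just one example of an 8-bit character set).
It checks whether a word can be stored one byte per character, and how many
bytes a character needs in UTF-8.

```python
def fits_in_8bit(text, charset="iso-8859-1"):
    """Return True if every character of text exists in the 8-bit charset.

    Hypothetical helper: the charset is only an example; any single-byte
    codec known to Python could be passed instead.
    """
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def utf8_width(ch):
    """Return the number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# A Western European word fits in ISO-8859-1.
latin_word = "caf\u00e9"
# Three characters from the U+1200..U+137F block mentioned in the quote.
ethiopic_word = "\u1230\u120b\u121d"

latin_fits = fits_in_8bit(latin_word)
ethiopic_fits = fits_in_8bit(ethiopic_word)
ethiopic_widths = [utf8_width(ch) for ch in ethiopic_word]
```

Under these assumptions the Latin word fits in one byte per character, while
the characters from the U+1200 block do not fit in ISO-8859-1 at all and each
take several bytes in UTF-8.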

If you care to educate me on the specifics of a language that does not fit
in an 8-bit character set and how it can be spell checked, I am all ears.


-- 
http://kevin.atkinson.dhs.org