[UTF-8] Aspell and UTF-8/Unicode

Kevin Atkinson kevina@gnu.org
Tue, 17 Feb 2004 00:03:01 -0500 (EST)


All, 

The CVS version of Aspell can now check documents in UTF-8.  The encoding 
is not set based on the current locale.

I hope to have other parts of Aspell accepting UTF-8 by the time I release 
the next snapshot (i.e., within a week or so).  I will post again when this 
is done.

Unless you know how Aspell works, please don't tell me that Aspell would 
be better off if it supported UTF-8 internally.  It may very well not be: 
it may be slower, use more memory, etc.

Also, keep in mind that with Aspell 0.51, if your language has fewer than 
256 distinct characters (including upper/lower case and accents), then 
Aspell will be able to support it.
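
As a rough illustration of that limit, here is a minimal Python sketch 
that counts the distinct characters in a word list.  The function names 
are made up for this example, and this is NOT how Aspell itself decides 
what it can support; it only shows the kind of count I mean.

```python
def distinct_characters(words):
    """Return the set of distinct characters used across all words."""
    chars = set()
    for word in words:
        # Upper and lower case letters and accented letters each count
        # as their own character.
        chars.update(word)
    return chars


def fits_in_8bit(words, limit=256):
    """True if the word list uses fewer than `limit` distinct characters."""
    return len(distinct_characters(words)) < limit
```

A word list for a typical alphabetic language stays well under the 
limit; a list that draws on several hundred different ideographs does 
not.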

I am considering a "dual-script" mode where Aspell can use a separate
dictionary depending on which script it detects the current word in.  The
two dictionaries can have nothing in common, e.g., an English one and a
Russian one.  This will NOT support two languages that use the same
script, as that is a lot more complicated.  For example, if the word is
misspelled, which dictionary should it use for the suggestions?
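
To make the idea concrete, here is a minimal Python sketch of picking a 
dictionary by script.  Everything in it is an assumption made for 
illustration: the function names, the toy word sets, and the heuristic 
of reading the script from the first word of each letter's Unicode name 
(which works for Latin and Cyrillic but is not a general script 
detector).  It is not Aspell code.

```python
import unicodedata


def script_of(word):
    """Guess a word's script from the Unicode names of its letters.

    Returns e.g. "LATIN" or "CYRILLIC", or None if the word has no
    letters or mixes more than one script.
    """
    scripts = set()
    for ch in word:
        if ch.isalpha():
            # For Latin and Cyrillic letters the Unicode name starts
            # with the script, e.g. "CYRILLIC SMALL LETTER PE".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts.pop() if len(scripts) == 1 else None


def check_word(word, dictionaries):
    """Check a word against the dictionary chosen by its script.

    Returns True or False, or None when no dictionary matches.
    """
    dictionary = dictionaries.get(script_of(word))
    if dictionary is None:
        return None
    return word.lower() in dictionary


# Hypothetical toy dictionaries, one per script.
DICTIONARIES = {
    "LATIN": {"hello", "world"},
    "CYRILLIC": {"привет", "мир"},
}
```

With these toy dictionaries, check_word("Привет", DICTIONARIES) looks 
only at the Cyrillic set and check_word("helo", DICTIONARIES) only at 
the Latin one, so a misspelled word always has exactly one dictionary to 
draw suggestions from.  Two languages in the same script would both map 
to the same key, which is exactly the case this mode does not handle.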

--- 
http://kevin.atkinson.dhs.org