[UTF-8] Aspell and UTF-8/Unicode

Elias Martenson elias-m@algonet.se
Tue, 17 Feb 2004 14:47:13 +0100


It's great to hear that the Unicode support in aspell has been improved.

I have a suggestion for you (and I would like the others on this mailing
list to comment as well). Shouldn't aspell decompose its input according
to Unicode Normalization Form D (NFD)? Otherwise aspell will fail to
spell check certain words. Consider, for example, the Swedish word "mål".
The "å" can be represented by either:

    U+00E5 LATIN SMALL LETTER A WITH RING ABOVE

or

    U+0061 LATIN SMALL LETTER A + U+030A COMBINING RING ABOVE

By performing decomposition first, aspell will be guaranteed to always
receive the second form, and there won't be any problem.
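
To illustrate the idea, here is a minimal sketch in Python (not aspell's
own code) using the standard unicodedata module. The helper name to_nfd
is hypothetical and only there for the example:

```python
import unicodedata


def to_nfd(text: str) -> str:
    """Return text in Unicode Normalization Form D (canonical decomposition)."""
    return unicodedata.normalize("NFD", text)


# Two encodings of the same Swedish word.
precomposed = "m\u00e5l"   # U+00E5 LATIN SMALL LETTER A WITH RING ABOVE
decomposed = "ma\u030al"   # U+0061 followed by U+030A COMBINING RING ABOVE

# The raw strings differ code point by code point ...
assert precomposed != decomposed

# ... but they are identical once both have been normalized to NFD.
assert to_nfd(precomposed) == to_nfd(decomposed)
```

For this to help, the word lists would have to be stored in the same
normalized form as the text being checked.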

Regards

Elias Mårtenson