[UTF-8] Aspell and UTF-8/Unicode

Noah Levitt nlevitt@columbia.edu
Tue, 17 Feb 2004 14:07:56 -0500


On Tue, Feb 17, 2004 at 14:47:13 +0100, Elias Martenson wrote:

> I have a suggestion for you, (and I need the others on this mailing list
> to comment as well). Shouldn't aspell perform decomposition according to
> decomposition form D? 

Yes, you are right: to be technically correct, aspell should
perform normalization [1]. Either NFC (fully composed) or NFD
(fully decomposed) would work, but NFD is the better choice
because it would almost certainly be more efficient: the
first step in a (non-optimized) algorithm for NFC is to
perform NFD on the input.
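
To make the point concrete, here is a rough sketch. It is in
Python, using the standard unicodedata module, and has nothing
to do with aspell's actual code; normalize_word is just a
hypothetical helper name. It shows two spellings of the same
word that differ byte-for-byte but compare equal once both are
put into the same normalization form.

```python
import unicodedata


def normalize_word(word: str, form: str = "NFD") -> str:
    """Return word in the given Unicode normalization form.

    Hypothetical helper for illustration only, not aspell's API.
    """
    return unicodedata.normalize(form, word)


# The same word, encoded two different ways.
composed = "caf\u00e9"     # precomposed U+00E9 (e with acute)
decomposed = "cafe\u0301"  # plain 'e' followed by combining acute U+0301

# Without normalization, a naive comparison treats them as different words.
raw_equal = composed == decomposed

# After normalizing both to NFD, they compare equal.
nfd_equal = normalize_word(composed) == normalize_word(decomposed)

# NFC is defined as canonical decomposition followed by canonical
# composition, so applying NFD first does not change the NFC result.
nfc_direct = normalize_word(composed, "NFC")
nfc_via_nfd = normalize_word(normalize_word(composed, "NFD"), "NFC")

print(raw_equal, nfd_equal, nfc_direct == nfc_via_nfd)
```

A spell checker would presumably have to apply the same form
to both the word list and the text being checked for the
lookup to match.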

Noah

[1] http://www.unicode.org/reports/tr15/