[UTF-8] Aspell and UTF-8/Unicode

Noah Levitt nlevitt@columbia.edu
Mon, 16 Feb 2004 12:58:31 -0500


On Sun, Feb 15, 2004 at 21:42:43 -0500, Kevin Atkinson wrote:
> 
> Do you know of any code samples for efficient UTF-8 manipulation?

There are many useful functions in glib.
http://developer.gnome.org/doc/API/2.0/glib/glib-Unicode-Manipulation.html

Presumably you don’t want to depend on glib; you can copy
and paste the source of the functions you need (and rename
them), since glib is LGPL.

>   - length of utf-8 strings

g_utf8_strlen() returns the number of characters in the
string (pass -1 as the maximum length if the string is
NUL-terminated). Use plain old strlen() for the number of
bytes.
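If you would rather not copy the glib source, a character
count is small enough to write by hand. Here is a rough
sketch, not glib's code: utf8_char_count() is a name I made
up, and it assumes the input is valid, NUL-terminated UTF-8.
It counts every byte that is not a continuation byte
(10xxxxxx).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of glib or Aspell.
 * Returns the number of characters in a NUL-terminated string,
 * assuming the string is valid UTF-8. Each character has exactly
 * one byte that is not a continuation byte (10xxxxxx), so counting
 * those bytes counts the characters. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

On invalid input this still terminates, but the count is
not meaningful, so validate first if the data is untrusted.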

>   - length of the current utf-8 character

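g_utf8_find_next_char() returns a pointer to the start of
the next character, so the byte length of the current
character is the difference between the two pointers.
g_utf8_next_char() is also useful: it does the same job
faster, but only for strings known to be valid UTF-8.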
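Without glib, the byte length of a character can be read
off its first byte. This is only a sketch under my own
assumptions: utf8_char_len() is a made-up name, it looks at
the lead byte alone, and it does not check that the
continuation bytes that should follow are actually there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of glib or Aspell.
 * Returns the byte length (1 to 4) of the UTF-8 character that
 * starts at p, judged from the lead byte only. Returns 0 if p
 * points at a continuation byte or at a byte that cannot start
 * a character. A NUL byte counts as a 1-byte character. */
size_t utf8_char_len(const char *p)
{
    unsigned char lead;

    assert(p != NULL);
    lead = (unsigned char)*p;

    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 0;                   /* continuation or invalid lead byte */
}
```

Stepping through a string is then p += utf8_char_len(p),
with a check for a return value of 0 so that bad input
cannot stall the loop.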

Noah