[UTF-8] Aspell and UTF-8/Unicode
Kevin Atkinson
kevina@gnu.org
Sun, 15 Feb 2004 21:42:43 -0500 (EST)
On Mon, 16 Feb 2004, Elias Martenson wrote:
> Mon 2004-02-16 at 02.20, Kevin Atkinson wrote:
>
> > So does the curses library use LC_CTYPE to determine what encoding the
> > incoming string is?
>
> Yes, that's what the empty string part of setlocale(LC_ALL,"") means:
> "read the locale settings from the environment variables".
OK, thanks. I knew what setlocale() does; I just wanted to make sure that
curses was using it.
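For reference, a minimal sketch of how a program might check whether the
environment's locale uses UTF-8. It assumes a POSIX system that provides
nl_langinfo(CODESET); the helper names codeset_is_utf8() and
locale_is_utf8() are made up for illustration and are not part of Aspell
or curses. Codeset spellings vary between systems, so the comparison
below is only a guess at the common ones.

```c
#include <langinfo.h>
#include <locale.h>
#include <strings.h>

/* Hypothetical helper: does this codeset name look like UTF-8?
 * Accepts the spellings "UTF-8" and "UTF8" in any letter case. */
int codeset_is_utf8(const char *name)
{
    return strcasecmp(name, "UTF-8") == 0 || strcasecmp(name, "UTF8") == 0;
}

/* Hypothetical helper: read the locale settings from the environment
 * (the "" argument) and report whether the character encoding is UTF-8. */
int locale_is_utf8(void)
{
    setlocale(LC_ALL, "");
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

A program could call locale_is_utf8() once at startup and pick either the
8-bit or the UTF-8 code path based on the result.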
> The UTF-8 method is more standard because you can take your code, make
> sure you have the setlocale() call, make sure you do all the magic
> needed (like using wcslen() instead of strlen() and making sure you
> never grab individual char's from the strings)
Do you know of any code samples for efficient UTF-8 manipulation? I
figure if I support 8-bit character sets and UTF-8, that will be enough.
This means I can detect when UTF-8 is being used and just handle the UTF-8
strings more carefully, which is more efficient than converting to wchar_t
just to get the length. What I really need are things like:
- length of utf-8 strings
- length of the current utf-8 character
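A rough sketch of what those two operations could look like when working
directly on the bytes. The function names utf8_char_len() and
utf8_strlen() are hypothetical, and the code assumes the input is already
well-formed UTF-8: it looks only at the lead byte and does not reject
overlong or truncated sequences.

```c
#include <stddef.h>

/* Length in bytes of the UTF-8 character starting at s, judged from
 * the lead byte alone. Returns 0 if s points at a continuation byte
 * or at a byte that cannot start a sequence. A NUL byte counts as a
 * one-byte character. */
size_t utf8_char_len(const char *s)
{
    unsigned char c = (unsigned char)*s;
    if (c < 0x80)           return 1;   /* 0xxxxxxx: ASCII */
    if ((c & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((c & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((c & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;                           /* continuation or invalid */
}

/* Number of characters (code points) in a NUL-terminated UTF-8
 * string: every byte that is not a continuation byte (10xxxxxx)
 * starts a new character. */
size_t utf8_strlen(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}
```

For example, the string "m\xC3\xA5n" ("mån") is four bytes long, but
utf8_strlen() counts three characters, and utf8_char_len() on its second
byte gives 2. No conversion to wchar_t is needed for either call.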
-- 
http://kevin.atkinson.dhs.org