[PATCH 01/10] terminal: UTF-8 support

Bill Spitzak spitzak at gmail.com
Sun Jan 9 21:14:23 PST 2011


Please make sure that an error results in only the first byte of the 
bad sequence being replaced, with UTF-8 parsing continuing at the 
second byte.

For instance, 0xE0 followed by 0x20 should produce an error indicator 
followed by a space. It appears this code will produce only an error 
indicator.
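
To make the requested behaviour concrete, here is a minimal sketch, not 
the patch's code. The names decode_one, decode_all and REPLACEMENT are 
made up for illustration, and it works on a whole buffer rather than on 
a byte stream the way a terminal would. It deliberately leaves out the 
overlong and surrogate checks so the resynchronisation stays visible.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu  /* hypothetical error indicator */

/* Decode one code point from s[0..n), n >= 1.
 * Returns the number of bytes consumed. On any error exactly ONE byte
 * is consumed, so the caller re-parses from the second byte. */
static size_t decode_one(const unsigned char *s, size_t n, uint32_t *out)
{
    unsigned char b = s[0];
    size_t need;   /* number of continuation bytes expected */
    uint32_t cp;

    if (b < 0x80) {
        *out = b;
        return 1;
    } else if ((b & 0xE0) == 0xC0) {
        need = 1; cp = b & 0x1F;
    } else if ((b & 0xF0) == 0xE0) {
        need = 2; cp = b & 0x0F;
    } else if ((b & 0xF8) == 0xF0) {
        need = 3; cp = b & 0x07;
    } else {
        /* stray continuation byte or invalid lead byte */
        *out = REPLACEMENT;
        return 1;
    }

    for (size_t i = 1; i <= need; i++) {
        /* missing or non-continuation byte: replace only the lead byte */
        if (i >= n || (s[i] & 0xC0) != 0x80) {
            *out = REPLACEMENT;
            return 1;
        }
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    *out = cp;
    return need + 1;
}

/* Decode a whole buffer into out[0..cap); returns code points written. */
static size_t decode_all(const unsigned char *s, size_t n,
                         uint32_t *out, size_t cap)
{
    size_t count = 0, i = 0;

    while (i < n && count < cap) {
        i += decode_one(s + i, n - i, &out[count]);
        count++;
    }
    return count;
}
```

With this, the bytes 0xE0 0x20 decode to two code points, the error 
indicator and then U+0020, because the 0x20 is parsed again as the 
start of a new sequence instead of being swallowed.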

In addition, this does not appear to detect and reject overlong 
forms.
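
An overlong form is a sequence that uses more bytes than the code point 
needs, such as 0xC0 0x80 for U+0000. One way to catch these, sketched 
here with a made-up helper name (is_overlong) and not taken from the 
patch, is to compare the decoded value against the smallest code point 
that legitimately needs that many bytes:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if code point cp, decoded from a sequence of len bytes,
 * could have been encoded in fewer bytes (an overlong form).
 * len is the total sequence length including the lead byte. */
static bool is_overlong(uint32_t cp, size_t len)
{
    /* smallest code point that requires a 2-, 3- or 4-byte sequence */
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };

    if (len < 2 || len > 4)
        return false;   /* 1-byte sequences cannot be overlong */
    return cp < min_cp[len];
}
```

A decoder would call this after collecting the continuation bytes and 
treat a true result as an error in the lead byte, the same as any other 
malformed sequence.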

For terminal display I have found it very useful to display the error 
bytes as though they are ISO8859-1 or CP1252 bytes. This makes the 
result readable if ISO8859-1 is accidentally output to the terminal.
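
A rough sketch of that fallback, assuming the decoder hands each error 
byte to a helper (the name error_byte_to_codepoint is invented here): 
bytes 0x80 to 0x9F go through the CP1252 table, and everything else 
maps to the code point with the same value, which is the ISO8859-1 
interpretation. The five bytes CP1252 leaves undefined are passed 
through unchanged in this sketch; that is one possible choice, not the 
only one.

```c
#include <stdint.h>

/* Map a byte that failed UTF-8 decoding to a displayable code point,
 * treating it as CP1252 (which matches ISO8859-1 for 0xA0..0xFF). */
static uint32_t error_byte_to_codepoint(unsigned char b)
{
    /* CP1252 assignments for 0x80..0x9F; the undefined slots
     * (0x81, 0x8D, 0x8F, 0x90, 0x9D) are left as the byte value. */
    static const uint16_t cp1252_high[32] = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };

    if (b >= 0x80 && b <= 0x9F)
        return cp1252_high[b - 0x80];
    return b;   /* ISO8859-1: code point equals byte value */
}
```

So a lone 0xE9 written by an ISO8859-1 program shows up as U+00E9 
rather than as a generic error indicator.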

Sorry to be a pain about this but bad UTF-8 handling is one of the 
things that really annoys me.


More information about the wayland-devel mailing list