[jon at joncruz.org: Re: [Inkscape-devel] [jonadab at bright.net: Re: [Clipart] character coding]]
bryce at bryceharrington.com
Sun Feb 6 22:41:33 PST 2005
----- Forwarded message from "Jon A. Cruz" <jon at joncruz.org> -----
Date: Sun, 06 Feb 2005 22:20:39 -0800
From: "Jon A. Cruz" <jon at joncruz.org>
To: Bryce Harrington <bryce at bryceharrington.com>
Subject: Re: [Inkscape-devel] [jonadab at bright.net: Re: [Clipart] character
Bryce Harrington wrote:
>Jon, can you give some advice on this one?
Well... ISO-8859-15 is the "Latin 9" that's becoming more common,
especially in Europe.
Now, given the nature of UTF-8, it's fairly easy to detect.
That is, with the lead and trail byte combos, you can walk a file and
determine if it's UTF-8, as a non-UTF-8 file of any significant size
probably won't give false positives.
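
As a rough illustration of that lead/trail walk, here is a minimal Python
sketch. The function name looks_like_utf8 is made up for this example, and
the check is simplified: it only verifies that each lead byte is followed by
the right number of trail bytes, and does not reject every overlong or
surrogate form the way a strict decoder would.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Walk the bytes and check lead/trail byte structure (simplified)."""
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            trail = 0              # plain ASCII, no trail bytes
        elif 0xC2 <= lead <= 0xDF:
            trail = 1              # lead byte of a 2-byte sequence
        elif 0xE0 <= lead <= 0xEF:
            trail = 2              # lead byte of a 3-byte sequence
        elif 0xF0 <= lead <= 0xF4:
            trail = 3              # lead byte of a 4-byte sequence
        else:
            return False           # stray trail byte or invalid lead byte
        if i + trail >= n and trail > 0:
            return False           # sequence is cut off at end of data
        for j in range(i + 1, i + 1 + trail):
            if not 0x80 <= data[j] <= 0xBF:
                return False       # expected a trail byte here
        i += trail + 1
    return True
```

In practice, calling data.decode("utf-8") and catching UnicodeDecodeError
does the same job with the stricter rules applied. Note that a pure ASCII
file passes either check, since ASCII is a subset of UTF-8.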
Of course, it's easy for content to lie about its encoding. Even though
HTML or XML has the string "UTF-8" in it, anything could have changed it
or edited it in the wrong encoding.
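
One way to catch that kind of mismatch is to compare what the file claims
against what its bytes actually decode as. The following Python sketch does
this for an XML declaration only. The helper names (declared_encoding,
declaration_is_plausible) and the 200-byte search window are assumptions
made for this example, not anything a particular tool does.

```python
import re

# Matches the encoding attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="UTF-8"?>
DECL_RE = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def declared_encoding(data: bytes):
    """Return the lowercased encoding named in the XML declaration, or None."""
    match = DECL_RE.search(data[:200])  # declaration sits at the very start
    return match.group(1).decode("ascii").lower() if match else None


def declaration_is_plausible(data: bytes) -> bool:
    """Return False if the bytes cannot be decoded as the declared encoding."""
    enc = declared_encoding(data)
    if enc is None:
        return True                     # nothing declared, nothing to contradict
    try:
        data.decode(enc)                # strict decode raises on invalid bytes
    except (UnicodeDecodeError, LookupError):
        return False                    # bad bytes, or unknown encoding name
    return True
```

This only catches a false "UTF-8" claim. A single-byte encoding such as
ISO-8859-1 accepts any byte sequence, so a wrong declaration of that kind
decodes without error and would need some other heuristic.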
Further questions would be what script, what tool, what servers and what
protocols are involved.
----- End forwarded message -----