[Clipart] character coding
ocalocal at btinternet.com
Wed Feb 9 05:15:38 PST 2005
> > Well, there appear to be two problems. Form-entered metadata does
> > need to be converted to UTF-8, but that won't fix the problem of
> > metadata in the uploaded file somehow being converted to Latin-1.
> Do we know that the latter is happening?
Hmm... Apparently it isn't happening any more (but I'm sure it was when
I tried it before, about a month ago).
I uploaded two test files today, and the non-ASCII characters (whether
encoded as UTF-8 or as numeric character references) were converted to
character entity references, except for a couple that were converted to
numeric character references. I think that this is wrong too: the SVG DTD
doesn't declare the character entity references so they can't be used.
Inkscape just strips them out.
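One possible workaround, sketched here in Python purely as an illustration
(the function name named_to_numeric is hypothetical and is not part of the
upload script): rewrite the named entity references as numeric character
references, which any XML parser accepts without a DTD declaration. The
five entities predefined by XML itself are left alone, and the HTML 4
entity table from the standard library is assumed to be a good enough list
of names.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself; these are always legal in SVG.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named entity reference such as &eacute;
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace HTML-style named entities with numeric character references.

    Hypothetical helper: unknown names and the XML-predefined entities
    are returned unchanged.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY_RE.sub(replace, text)
```

Running the metadata through something like this before writing the SVG
file should leave output that Inkscape can read without dropping
characters, though I have not tried it against the real upload code.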
> > I think you can probably cheat with the form-entered metadata by using
> > accept-charset="UTF-8 US-ASCII"
> > in the <form> tag, then you should only receive UTF-8. But very old
> > browsers may not know about accept-charset, and might send the data
> > in some other encoding.
> I will try this and see if the problem goes away. How "very old" does
> a browser have to be to ignore this? Are we talking the Netscape 3
> kind of very old, or are we talking IE5?
Well, accept-charset was defined in HTML 4.0 (December 1997, revised April
1998), but apart from that I have no idea. If you find that this doesn't
work in many browsers
then you may have to look at the Content-Type field in the HTTP header to
see what encoding the data is in, and then convert it.
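A rough sketch of that fallback, again in Python only for illustration:
read the charset parameter from the Content-Type value and transcode the
raw form bytes to UTF-8. The function names and the Latin-1 default are my
assumptions, not anything the site currently does, and this only helps
when the browser actually sends a charset parameter, which many do not.

```python
from email.message import Message


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value.

    The Latin-1 default is an assumption for the case where the
    browser sends no charset parameter at all.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default


def to_utf8(raw_bytes, content_type):
    """Decode raw_bytes using the declared charset and re-encode as UTF-8."""
    charset = charset_from_content_type(content_type)
    try:
        text = raw_bytes.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or undecodable bytes: fall back to Latin-1,
        # which can decode any byte sequence (possibly to the wrong letters).
        text = raw_bytes.decode("iso-8859-1")
    return text.encode("utf-8")
```

For example, to_utf8(b"caf\xe9", "application/x-www-form-urlencoded;
charset=ISO-8859-1") gives the UTF-8 bytes for "café". URL-decoding of the
form body would still have to happen before this step.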