[Clipart] character coding

Jonadab the Unsightly One jonadab at bright.net
Tue Feb 8 05:33:19 PST 2005


"Stephen Silver" <ocalocal at btinternet.com> writes:

> Where is the upload script receiving the RDF data from?  My
> impression of the way things are supposed to work was that the
> script first writes the file (unchanged) to disk,

Yes (except that, for now, it's still using CGI::Lite to do this step.
I intend to change that eventually, as it has proved to be a problem,
but it's what we have for now).

> then attempts to read it using SVG::Metadata, which in turn uses
> XML::Twig, 

Yes, it does that.

> which ought to detect the character encoding (probably using the
> procedure outlined in Appendix F of the XML spec) and return
> everything in UTF-8.  So your script shouldn't need to worry about
> the encoding, as it should only ever see UTF-8, and everything
> should work.

However, the script also receives some metadata from the web form,
optionally, if the user chooses to specify it that way.  (This is
necessary, because not all SVG authoring tools fully support all the
metadata.)

If the script took an extra step to convert the form-entered metadata
into UTF-8, would that solve the problem?

I bet there's a module on the CPAN for this...
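
One candidate is the Encode module, which has shipped with Perl since
5.8.  A minimal sketch of such a conversion step, assuming the form
data arrives as ISO-8859-1 bytes (the real charset depends on how the
browser submitted the form, which has not been established here);
form_value_to_utf8 is a hypothetical helper name, not part of the
upload script:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: the name and the default charset are assumptions.
sub form_value_to_utf8 {
    my ($raw_bytes, $charset) = @_;
    $charset = 'iso-8859-1' unless defined $charset;  # assumed form charset
    my $chars = decode($charset, $raw_bytes);  # bytes -> Perl characters
    return encode('UTF-8', $chars);            # characters -> UTF-8 bytes
}

my $utf8 = form_value_to_utf8("caf\xE9");      # "cafe" with e-acute, Latin-1
# Show the resulting bytes in hex: 63 61 66 C3 A9
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
```

This only gives the right result if the charset passed in matches what
the browser actually sent; decoding data that is already UTF-8 as
ISO-8859-1 would double-encode it.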

>> Alternatively:  does RDF allow for non-ASCII characters in the
>> metadata to be encoded as entities?  Could we just use something along
>> the lines of HTML::Entities to encode it (so that e.g. the problematic
>> character in the file in question would become &eacute; or somesuch)?
>
> I'm not sure that you can use &eacute; in XML, but you can use a numeric
> character reference (in this case &#233; or &#xE9;).

I will test that.
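
A rough sketch of what that test could look like, using only core
Perl.  It assumes the value has already been decoded into Perl
characters (ISO-8859-1 input is assumed here purely for the example),
and to_numeric_refs is a hypothetical helper name:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper; expects a string of Perl characters, not raw bytes.
sub to_numeric_refs {
    my ($chars) = @_;
    # Replace every character outside the ASCII range with &#xHEX;
    $chars =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/ge;
    return $chars;
}

my $title = decode('iso-8859-1', "caf\xE9");   # assumed Latin-1 input
print to_numeric_refs($title), "\n";           # prints caf&#xE9;
```

This leaves ASCII untouched, so markup characters such as & and <
would still need to be escaped separately before the value goes into
the RDF.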

-- 
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,"ten.thgirb\@badanoj$/ --";$\=$ ;-> ();print$/



