[Clipart] bug in upload_svg.cgi or in Inkscape?

Jonadab the Unsightly One jonadab at bright.net
Wed Jul 21 06:56:17 PDT 2004


Nicu Buculei <nicu at apsro.com> writes:

> when i want to open some files from the 'Incoming' folder in Inkscape
> i got the error 'Failed to load the requested file'.
> two examples are:
> http://www.openclipart.org/incoming//open_clipart_library_logo_suggestion_01.svg

Yes, I get the same error with this one.  I'll treat it as a test case
and try to determine if there's a specific element or attribute
causing the problem.

> http://www.openclipart.org/incoming//flourish_three_lower_right_corner_01.svg

This one opens fine for me in Inkscape 0.38.  Also, when I look at it
in a text editor, I don't see the garbage line you mention.

> - Eye of Gnome displays them OK

That may or may not mean anything, since some apps are just
permissive, so I went looking for, and found, an SVG validator:
http://jiggles.w3.org/svgvalidator/ValidatorURI.html

When I ran the second one (the one that opens in Inkscape just fine)
through it, it reported a bunch of attribute namespace URI errors
("the foo element does not allow any attribute whose namespace URI is
blah", where the URIs all point at either the Sodipodi or the Inkscape
website).

However, when I ran the first one (the one that gives the error in
Inkscape) through the validator, I got this:

Character conversion error: "Malformed UTF-8 char -- is an XML
encoding declaration missing?" (line number may be too low).

Gah, character encoding errors are a pain.

I suspect this error means that there's a non-ASCII character in the
RDF that isn't allowed in the character set that the SVG file was
using, or something along those lines.  Probably the Á.  I tested this
theory by making a copy of the file and changing the three occurrences
of that character to A, and Inkscape was then able to open the
resulting file without any problem.
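
For anyone who wants to check a file without going through the
validator, here is a rough sketch in Python (not the language of the
upload script, and the function name and sample bytes are made up for
illustration).  It assumes the file is supposed to be UTF-8, which is
what an XML parser assumes when there is no encoding declaration, and
it reports every byte offset where strict UTF-8 decoding fails:

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte value) pairs where strict UTF-8 decoding fails."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; success means nothing else is wrong.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            bad = pos + err.start          # absolute offset of the bad byte
            problems.append((bad, data[bad]))
            pos = bad + 1                  # skip it and keep scanning
    return problems


# Hypothetical sample: 0xC1 is how Latin-1 stores the A-acute, and that
# byte can never appear in well-formed UTF-8.
sample = b"<title>\xc1 test</title>"
for offset, value in find_invalid_utf8(sample):
    print(f"invalid UTF-8 byte 0x{value:02X} at offset {offset}")
```

This only shows where the bad bytes are; it cannot tell which encoding
they were meant to be in.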

So now the question becomes:  what should the upload script do when
the metadata it's adding don't adhere to the character encoding that
the original SVG image is using?  There are several choices...

 1. Ignore it and hope for the best.  This is the current naive behavior.
 2. Detect it and return an error message to the user.
 3. Detect it and remove or alter the offending characters.
    (Is it possible to re-encode them?  How?)
 4. Detect it and attempt to change the encoding of the parent document.
 5. Detect it, but only change the encoding of the metadata element,
    if this is possible.  (I'm not at all sure XML provides for this
    option; I haven't done a lot of work with character encodings.)
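
On the "is it possible to re-encode them?" part of option 3: XML
numeric character references (for example &#193; for the A-acute) are
plain ASCII and mean the same thing whatever encoding the document
declares, as long as they appear in text content or attribute values.
A minimal sketch of that idea, in Python rather than the language of
the upload script, with a made-up function name and sample string:

```python
def ascii_safe(metadata: str) -> str:
    """Replace every non-ASCII character with an XML numeric character
    reference, leaving ASCII characters (including markup) untouched."""
    return metadata.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical metadata fragment containing U+00C1 (A-acute).
print(ascii_safe("<title>\u00c1 test</title>"))
# prints: <title>&#193; test</title>
```

This assumes the metadata is already available as decoded text.  It
would not help with non-ASCII characters in element or attribute
names, where character references are not allowed.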

I don't like any of these options.  The XML spec should have specified
a default encoding that doesn't create these problems.  That's not
something we can easily change, though, so we're going to have to work
with the XML spec we've got.  I'm very open to suggestions about which
of the above less-than-altogether-ideal things we should do, or
whether there's a sensible sixth option.

I suppose the approaches could be combined, e.g. by asking the user
what to do.

This is also complicated by the fact that I personally know next to
nothing about Unicode.  I'm accustomed to dealing with all-ASCII data
for the most part.  I don't really know how to go about detecting an
encoding mismatch.  Is there someone with better Unicode-fu on the list?

-- 
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,"ten.thgirb\@badanoj$/ --";$\=$ ;-> ();print$/



