[Clipart] [Bug 4743] Many SVGs are invalid XML, all should be validated on submission

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Oct 11 10:09:09 PDT 2005

Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

------- Additional Comments From sas00003 at btinternet.com  2005-10-11 10:09 -------
Eric Seidel writes:

> I have discovered while writing Safari+SVG, that many of
> the SVG files on OpenClipart.org are in fact invalid xml.

If you mean valid in the sense of the XML spec, then probably all of them are
invalid. XML validity isn't really a useful concept for SVG with embedded RDF.

The SVG files ought to be conforming SVG, however, which in particular means
that they ought to be well-formed XML and ought to conform to the Namespaces in
XML spec. There are currently hundreds that are not conforming SVG.
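
As an illustration only, the two requirements above (well-formed XML and
conformance to Namespaces in XML) can be approximated with a namespace-aware
parser. This is a minimal sketch in Python, not a conformance checker for SVG;
the name check_svg is hypothetical, and it assumes the standard library's
expat-based parser, which reports an undeclared prefix as a parse error.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def check_svg(text):
    """Return a list of problems; an empty list means these basic checks passed."""
    try:
        # The parser is namespace-aware, so an undeclared prefix such as
        # "rdf:" is reported here together with ordinary syntax errors.
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return ["not well-formed or not namespace-conformant: %s" % err]

    problems = []
    # A conforming stand-alone SVG file has an 'svg' root element
    # in the SVG namespace.
    if root.tag != "{%s}svg" % SVG_NS:
        problems.append("root element is not 'svg' in the SVG namespace: %s" % root.tag)
    return problems
```

Passing these checks is necessary but far from sufficient for conforming SVG;
it only catches the kinds of problems discussed in this comment.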

> <?xml version="1" standalone="no"?> is invalid, only 1.0 and 1.1 are allowed.

Yes, version="1" is wrong. This is a matter of well-formedness. In fact, I think
only XML version 1.0 should be allowed for SVG, as this is what the SVG 1.0 and
SVG 1.1 specs refer to.

All the SVG files in release 0.17 that have an XML declaration specify XML
version 1.0, so this problem does not appear to be widespread. (I haven't
downloaded release 0.18 yet.)
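
A check for the declared XML version could look roughly like the following.
This is a hedged sketch, not the test used in SVGscan; the function names are
hypothetical, and it treats a missing XML declaration as acceptable, since the
declaration is optional in XML 1.0.

```python
import re

# Matches an XML declaration at the very start of the file and captures
# the quoted version value.
XML_DECL = re.compile(r'^<\?xml\s+version\s*=\s*(["\'])(.*?)\1')

def declared_xml_version(text):
    """Return the declared version string, or None if there is no declaration."""
    match = XML_DECL.match(text.lstrip("\ufeff"))  # ignore a leading BOM
    return match.group(2) if match else None

def version_ok(text):
    """True if the file has no XML declaration or declares version 1.0."""
    version = declared_xml_version(text)
    return version is None or version == "1.0"
```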

> http://www.openclipart.org/logo/openclipartlibrary-logo-only-5colors.svg
> the prefix "rdf" is never defined yet it's used.

Yes, that's wrong too, because it doesn't conform to the Namespaces in XML spec.

None of the SVG files in release 0.17 use undefined prefixes. There were some in
a previous release, but they were fixed.

> My suggestion would be that the submission script should use
> either libTidy or validator.w3c.org to check every svg before
> allowing an author to submit.

I'm not familiar with libTidy, but validator.w3c.org checks for valid XML and so
would reject everything.

I don't know of any tool that checks for conforming SVG. My SVGscan script
checks for various problems, but it wouldn't have spotted that version="1"
(though I've added a test for that now).

Andrew Archibald has suggested validating incoming files against a RELAX NG
schema, but nobody has proposed a suitable schema yet.

Rather than just rejecting bad files, it would be better for the incoming script to
fix them whenever possible. For example, it could set the XML version to 1.0,
add xmlns="http://www.w3.org/2000/svg" to the root element if needed, change
'textpath' elements (an Inkscape 0.42 bug) to 'textPath', etc.
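
The three fixes mentioned above could be sketched as follows. This is only an
illustration under simplifying assumptions: the name fix_svg is hypothetical,
the root element is assumed to be an unprefixed 'svg' start tag with no '>'
inside its attribute values, and the text-level regular expressions do not try
to skip comments or CDATA sections. A real incoming script would want a proper
parser.

```python
import re

SVG_NS = "http://www.w3.org/2000/svg"

def fix_svg(text):
    """Apply three simple text-level repairs and return the fixed document."""
    # 1. Force the XML declaration, if present, to say version 1.0.
    text = re.sub(
        r'^(<\?xml\s+version\s*=\s*)(["\'])[^"\']*\2',
        r'\g<1>\g<2>1.0\g<2>',
        text,
        count=1,
    )

    # 2. Rename 'textpath' start and end tags to 'textPath'.
    text = re.sub(r'(</?)textpath(?=[\s/>])', r'\1textPath', text)

    # 3. Add the default SVG namespace to the first unprefixed <svg> tag
    #    if that tag has no xmlns attribute of its own.
    root = re.search(r'<svg(?=[\s/>])[^>]*>', text)
    if root and not re.search(r'\sxmlns\s*=', root.group(0)):
        insert_at = root.start() + len('<svg')
        text = text[:insert_at] + ' xmlns="%s"' % SVG_NS + text[insert_at:]
    return text
```

Running the function on its own output leaves the text unchanged, so it is
safe to apply to files that are already correct.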

Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
