[Clipart] SVG sanitization

Andrew Archibald andrew.archibald at sympatico.ca
Tue Mar 22 13:07:53 PST 2005


Hi,

I now have a script that sanitizes SVG, removing script tags and attributes and 
reporting their presence.

It has a few limitations:
* Some broken SVG files cause it to barf; there are 32 such broken files in the 
0.11 release.
* It can't intelligently sanitize files containing Adobe- or Microsoft-specific 
SVG, of which there are 46 in the 0.11 release.
* Its interface is a little awkward (input on stdin, output on stdout, error on 
stderr, return value signals OK/not OK, one file at a time, no command-line 
options).
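
For anyone who wants a feel for the approach, here is a minimal sketch of such a filter. It is not the script itself: it assumes Python and the standard-library ElementTree parser, and the names `sanitize` and `main`, the rule that any attribute starting with "on" is an event handler, and the exit codes 0/1/2 are all choices made for this illustration.

```python
import sys
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def local_name(name):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return name.rsplit("}", 1)[-1]


def sanitize(svg_text):
    """Return (clean_svg_text, findings). Raises ET.ParseError on broken XML."""
    # Keep SVG as the default namespace so the output has no ns0: prefixes.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    findings = []

    # Pass 1: drop every <script> element, whatever its namespace.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag).lower() == "script":
                parent.remove(child)
                findings.append("removed <script> element")

    # Pass 2: drop event-handler attributes (onload, onclick, ...) and
    # attributes whose value is a javascript: URL (assumed rules).
    for elem in root.iter():
        for name in list(elem.attrib):
            value = elem.attrib[name].strip().lower()
            if local_name(name).lower().startswith("on") or value.startswith("javascript:"):
                del elem.attrib[name]
                findings.append("removed attribute %s" % local_name(name))

    return ET.tostring(root, encoding="unicode"), findings


def main(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """One file in on stdin, clean SVG on stdout, report on stderr."""
    try:
        clean, findings = sanitize(stdin.read())
    except ET.ParseError as exc:
        print("broken SVG: %s" % exc, file=stderr)
        return 2  # assumed code for "could not parse"
    stdout.write(clean)
    for finding in findings:
        print(finding, file=stderr)
    return 1 if findings else 0  # assumed: nonzero means scripts were found
```

Ending the file with `sys.exit(main())` would turn this into a stdin-to-stdout filter with the interface described above. Note that ElementTree discards comments and processing instructions when parsing, so a real tool would need to decide whether that is acceptable.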

With sufficient programming effort it could be made to do more: removing or 
flagging proprietary extensions, verifying that the license claims to be PD, 
checking for and rearranging metadata, or cleaning the kitchen sink. (The first 
two would not require much work.)
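
As a rough idea of what flagging proprietary extensions might look like, the sketch below lists vendor namespaces used in a file. It is only an illustration under assumptions: it is written in Python with the standard-library ElementTree parser, the function name `vendor_namespaces` is hypothetical, and treating any namespace URI containing "ns.adobe.com" or "schemas.microsoft.com" as vendor-specific is a guess at a workable rule, not an authoritative list.

```python
import xml.etree.ElementTree as ET

# Assumed markers: substrings of namespace URIs taken to indicate
# vendor-specific extensions. A real list would need to be checked.
VENDOR_MARKERS = {
    "ns.adobe.com": "Adobe",
    "schemas.microsoft.com": "Microsoft",
}


def vendor_namespaces(svg_text):
    """Return sorted (vendor, namespace URI) pairs used by elements or attributes."""
    found = set()
    for elem in ET.fromstring(svg_text).iter():
        # ElementTree spells qualified names as '{uri}local'.
        for name in [elem.tag, *elem.attrib]:
            if not name.startswith("{"):
                continue
            uri = name[1:].split("}", 1)[0]
            for marker, vendor in VENDOR_MARKERS.items():
                if marker in uri:
                    found.add((vendor, uri))
    return sorted(found)
```

This only sees namespaces that are actually used on an element or attribute; a namespace that is declared but never used goes unreported.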

I think it is important to incorporate the script into the upload process. As 
part of that process, I think the server should also render thumbnails using 
Inkscape and extract and display the metadata, so that users can verify that 
their files look the way they're supposed to and are tagged appropriately.

I will also be trying to get another version of the script adopted by 
Wikipedia so that they can serve SVG files.

Andrew


