[Clipart] Malware in clipart
andrew.archibald at sympatico.ca
Tue Mar 15 22:04:44 PST 2005
Jon Phillips wrote:
>>>It seems to me that we will not have the resources to hand-examine
>>>every submission to ensure it is innocuous, so (barring an
>>>earthshattering breakthrough in AI research) if we take any
>>>precautions at all it will have to be stripping out all scripts of any
>>>kind, malware or not. (Which, on the whole, doesn't sound like a
>>>terribly bad idea to me... feel free to jump in and explain why we
>>>shouldn't do that, if you can think of any solid reasons.)
>>Another approach could be to simply slap a keyword on it, e.g. 'script'
>>or 'executable', and exclude all such images from the releases. Then,
>>if people are so motivated, they can individually review/approve them.
>>There's four reasons for this suggestion:
>>First, presumably if an SVG includes a non-malware script, it's probably
>>there for a reason, such as for animation. In this case, removing the
>>script may invalidate the image, in which case a "stripped" version
>>could be worthless anyway.
Yes and no. Currently, barely anything supports scripts, so this is highly
speculative. But if it's like the Web, there are enough non-script and
script-disabled browsers that a reasonable fallback will often be provided.
>>Second, if someone goes to the trouble of writing a script AND
>>submitting it to OCAL, and then the script gets stripped out, my guess
>>is that they're going to come and complain. Having a procedure that
>>allows you to review/approve individual images on a case-by-case basis
>>will enable the project to handle these situations professionally, and
>>not as "special exceptions".
If we make a "validation" step part of the process (upload it, look at a
rendered PNG, look at the final version, approve) then it will catch both
people whose scripts are mangled by this process and people submitting
non-inkscape scripts that look wrong when rendered by our de-facto standard renderer.
People who intentionally add scripts expect them to stay. However, programs
may start adding "helpful" script elements which don't do anything useful but
trigger our detection; in this case automatic removal is a good idea.
>>Third, this approach remains consistent with the process for handling
>>other types of abusive images, so hopefully would reduce the variety of
>>scripts needed to be written/maintained.
A "validate and normalize" script seems like a good idea, enforcing the
presence of metadata (and maybe a formalized PD declaration) and various
security invariants; we could also give people a chance to view their image
rendered under inkscape (and possibly some other renderers) to catch partially
>>Finally, if in fact a given SVG image also has a piece of malware
>>scripted into it, how likely are we going to want to keep the image
>>portion? Stripping the viral part of an email virus wouldn't turn it
>>into a useful email. ;-) I wouldn't think stripping malware from SVG
>>would result in a worthwhile SVG image, either.
If it actually has malware, probably not - although if that malware is an
actual virus, it may have been added after the fact to a perfectly normal SVG file.
>>Again, I think instead of stripping, just flag the images and filter
>>them out for the releases. This should be a simpler thing to do, and
>>requires only small tweaks to the existing tools, plus a simple script
>>to scan the SVG for keywords ("<SCRIPT...", etc.) and if it comes up
>>positive, move those files aside, and/or add a keyword to them. Then,
>>in the release scripts, add another filter like the one for the flags,
>>to exclude images with scripts in them.
> Yes, I agree with this approach. I'm adding to the roadmap. Does anyone
> want to conquer this task? Andrew, would you like to conquer this one?
> We should figure out how to add this to validation of SVG files once
> input into the site and then again when doing a release. We should
> discuss this further? Any specific suggestions from anyone on
> Please check roadmap: http://www.openclipart.org/cgi-bin/wiki.pl?Roadmap
> We are in need of some good soldiers to help with these tasks.
I don't know about "conquer", but I did improve my script; I attached it to
some other list email, but see also
http://en.wikipedia.org/wiki/User:Aarchiba/SVG_sanitizer (where I will update
it with various versions).
It takes an SVG file on standard input; it sends a sanitized version to
standard output, and returns a nonzero exit status if anything dubious was found.
It DOESN'T do anything sensible if the document is not pure SVG; if you've used
namespaces or included some other kind of XML, all bets are off (see the
source). So it's not ready for prime time. I need to read more about XML
before I can do that.
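
For anyone who wants a feel for the stdin/stdout/exit-status behaviour
described above, here is a minimal sketch. It is NOT the script from the wiki
page; it is an illustration that assumes Python's standard
xml.etree.ElementTree, and the names sanitize(), local_name() and main() are
made up for this example. It treats <script> elements, on* event attributes
and javascript: hrefs as "dubious", which is an assumption about what should
count, not a complete list.

```python
import sys
import xml.etree.ElementTree as ET

# Keep the usual prefixes on output instead of ElementTree's ns0:, ns1:, ...
ET.register_namespace("", "http://www.w3.org/2000/svg")
ET.register_namespace("xlink", "http://www.w3.org/1999/xlink")


def local_name(name):
    """Return a tag or attribute name without its '{namespace}' prefix."""
    return name.rsplit("}", 1)[-1]


def sanitize(svg_text):
    """Return (clean_svg_text, dubious) for an SVG document given as a string.

    Assumption: "dubious" means a <script> element, an on* event attribute,
    or an href starting with "javascript:". Comments and processing
    instructions are dropped by the default parser.
    """
    root = ET.fromstring(svg_text)
    dubious = False

    # Remove <script> elements in any namespace, wherever they are nested.
    for parent in list(root.iter()):
        for child in list(parent):
            if isinstance(child.tag, str) and local_name(child.tag).lower() == "script":
                parent.remove(child)
                dubious = True

    # Remove event-handler attributes and javascript: links.
    for elem in root.iter():
        for attr in list(elem.attrib):
            name = local_name(attr).lower()
            value = elem.attrib[attr].strip().lower()
            if name.startswith("on") or (name == "href" and value.startswith("javascript:")):
                del elem.attrib[attr]
                dubious = True

    return ET.tostring(root, encoding="unicode"), dubious


def main():
    """Filter stdin to stdout; return 1 if anything dubious was removed."""
    clean, dubious = sanitize(sys.stdin.read())
    sys.stdout.write(clean)
    return 1 if dubious else 0
```

To use it as a filter, end the file with sys.exit(main()). A release script
could then set a file aside whenever the exit status is nonzero, rather than
trusting the stripped output. Like the real script, this sketch makes no
attempt to handle documents that mix in other kinds of XML, and it is not
hardened against deliberately malicious XML.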