(?) currently run it in an untrusted environment even if the file is loaded from
a local filesystem, but we can expect this to change as useful scripts appear in
SVG files. In any case there are security holes in the script-executing
viewers. Thus it is possible for SVG to contain malware.
Currently, OCAL happily stores and redistributes SVG without any kind of
verification method, manual or automatic, to check for malware in SVG.
It is possible to write a script which simply rejects any script-containing
image; attached is a script which does so, although it is limited by the
presence of non-SVG XML in SVG files (such as Inkscape-specific XML, metadata,
and Illustrator-specific XML, none of which can be reliably sanitized).
Created attachment 2726
This Python program takes an SVG file on stdin and reports success if it is
free of script elements and failure if it has some; it emits an SVG file with
scripts removed on stdout.
It cannot deal reliably with non-SVG XML; XML from totally unknown namespaces
is rejected, but there may be scripting elements in the Illustrator namespaces
(for example) that I don't know about.
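For anyone reading without the attachment, here is a rough sketch of the approach described above. It is not the attached program. The namespace allow-list, the function names, and the exit codes are my assumptions, and treating every attribute whose name starts with "on" as an event handler is a heuristic. It does not look at javascript: URLs.

```python
import sys
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
SCRIPT_TAG = "{%s}script" % SVG_NS
# Assumed allow-list; the attachment may accept more namespaces than this.
KNOWN_NAMESPACES = {SVG_NS, XLINK_NS}

# Keep the usual prefixes when the tree is written back out.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag, or '' if it has none."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def strip_scripts(svg_text):
    """Return (clean, sanitized_text); raise ValueError on unknown namespaces."""
    root = ET.fromstring(svg_text)
    clean = True
    for elem in list(root.iter()):
        ns = namespace_of(elem.tag)
        if ns not in KNOWN_NAMESPACES:
            raise ValueError("unknown namespace: %r" % ns)
        # Heuristic: onload, onclick, etc. are treated as script carriers.
        for name in list(elem.attrib):
            if name.lower().startswith("on"):
                del elem.attrib[name]
                clean = False
        # Drop <script> children of this element.
        for child in list(elem):
            if child.tag == SCRIPT_TAG:
                elem.remove(child)
                clean = False
    return clean, ET.tostring(root, encoding="unicode")


def main(stdin, stdout):
    """0 = no scripts found, 1 = scripts removed, 2 = file rejected."""
    try:
        clean, sanitized = strip_scripts(stdin.read())
    except (ValueError, ET.ParseError) as err:
        print("rejected: %s" % err, file=sys.stderr)
        return 2
    stdout.write(sanitized)
    return 0 if clean else 1
```

To get the stdin/stdout behaviour, call sys.exit(main(sys.stdin, sys.stdout)) at the bottom of the file.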
Here's an idea how this could be used in conjunction with dms:
  For each document ID that has not been tagged "malware-free" in dms:
    Retrieve the $file.svg of document ID from dms
    Copy $file.svg to $file.svg.orig
    Insert ($file.svg, $file.svg.orig) into dms as new revision of document ID
Note that as of dms 0.12, the function to retrieve a given doc ID is not
implemented; however, the insert function exists. See submit_clipart script
useless. I think that such images should probably simply be removed from the
DMS (perhaps until they can be examined by hand).
It would also be good if this screening, as well as a screening for portability,
happened on initial upload.
We wouldn't want to automatically remove them from dms. Instead the script
should set their state to 'PROBLEM' with a comment attached that they need
time they can download and check the file, and if it looks ok, they can
For the script itself, it would not take long to create a more secure version
based on a RELAX NG specification of SVG (there are several quasi-free ones out
there) which would validate the SVG against a schema which did not permit
scripting. At the same time it would check for bad SVG (things like rectangles
without coordinates and so on), use of other namespaces (Inkscape, Illustrator,
and so on), and use of external resources (this could be a security issue also).
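To make the list of checks concrete, here is a small sketch that does some of them with only the Python standard library. It is a hand-written audit, not RELAX NG validation, so it is much weaker than a schema would be. The function name audit_svg and the message strings are made up for illustration, and it assumes that any href not starting with "#" counts as an external resource.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def audit_svg(svg_text):
    """Return a list of human-readable problems found in the SVG text."""
    problems = []
    for elem in ET.fromstring(svg_text).iter():
        tag = elem.tag
        ns = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
        local = tag.rsplit("}", 1)[-1]
        if ns != SVG_NS:
            # Inkscape, Illustrator, metadata vocabularies, and so on.
            problems.append("foreign element: %s" % tag)
        elif local == "script":
            problems.append("script element")
        # Assumption: only same-document fragment references are safe.
        for attr in (XLINK_HREF, "href"):
            target = elem.get(attr)
            if target is not None and not target.startswith("#"):
                problems.append("external reference: %s" % target)
    return problems
```

An empty list means none of these particular checks fired; it says nothing about whether the file is valid SVG.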
However, I think it's better for the moment to work on getting some kind of
validation as part of the library. Upon reflection, I like the idea of having a
"malware-free" flag, as this allows easy scanning of old images, manual sorting
out of problem images, and so on. I still think a step in the upload where the
uploader is asked "does this image look right?" is a good idea, and the
malware-scanning could happen at this time so that they know right away if
there's some problem with their image.
So, I think this bug is waiting on the DMS: once the DMS is written, a quick
SOAP hack ought to do the job; integrating the malware scanner into the upload
script can happen any time after that.
detect-script.py is now in CVS in the tools directory; it searches the files on
the command line for scripts and prints out a list of the files that contain
scripts. It exits with a non-zero status if it finds any scripts.
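For reference, the behaviour described above can be sketched roughly like this. This is not the code in CVS. The function names are mine, it only looks for SVG-namespace script elements, and treating unparseable files as suspicious is my assumption.

```python
import xml.etree.ElementTree as ET

SCRIPT_TAG = "{http://www.w3.org/2000/svg}script"


def has_script(path):
    """True if the file contains an SVG <script> element or cannot be parsed."""
    try:
        tree = ET.parse(path)
    except ET.ParseError:
        # Assumption: a file we cannot parse is flagged rather than passed.
        return True
    return any(elem.tag == SCRIPT_TAG for elem in tree.iter())


def main(paths):
    """Print each flagged file name; return 1 if any were flagged, else 0."""
    flagged = [path for path in paths if has_script(path)]
    for path in flagged:
        print(path)
    return 1 if flagged else 0
```

Run as a command it would end with sys.exit(main(sys.argv[1:])), so a shell loop or a packaging script can test the exit status.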
I like Bryce's thinking in Comment 4. Rather than removing anything from DMS,
just flag it so it's not included in packages (and perhaps not presented in the
CGI browser interface unless the user specifically asks for such images to be
included). Then if people have time to manually check them they can do so, and
if they don't, they can sit in the DMS and wait until someone _does_ have time.
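A tiny sketch of what "flag, don't remove" could look like on the packaging side. None of this is the real DMS data model: the Document class, the "OK" default state, and both function names are hypothetical. Only the 'PROBLEM' state comes from this thread.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: int
    state: str = "OK"   # hypothetical default; 'PROBLEM' is the state proposed here
    comment: str = ""


def flag_problem(doc, reason):
    """Mark a document for manual review instead of deleting it."""
    doc.state = "PROBLEM"
    doc.comment = reason


def packageable(docs, include_problem=False):
    """Documents to ship or display; flagged ones only on explicit request."""
    return [d for d in docs if include_problem or d.state != "PROBLEM"]
```

Flagged documents stay in the store, so a reviewer can clear them later by setting the state back.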
Closing all openclipart bugs as openclipart is now on Launchpad, as per request from Jon Philips.