SVG can contain JavaScript. Few viewers currently execute it, and all (?) of those run it in an untrusted environment even when the file is loaded from a local filesystem, but we can expect this to change as useful scripts appear in SVG files. In any case, there are security holes in the script-executing viewers, so it is possible for SVG to contain malware. Currently, OCAL stores and redistributes SVG without any verification, manual or automatic, to check for malware. It is possible to write a script that simply rejects any script-containing image; attached is a script which does so, although it is limited by the presence of non-SVG XML in SVG files (such as Inkscape-specific XML, metadata, and Illustrator-specific XML, none of which can be reliably sanitized).
Created attachment 2726: Python script to remove JavaScript from SVG files

This Python program takes an SVG file on stdin and reports success if it is free of script elements and failure if it has some; it emits an SVG file with scripts removed on stdout. It cannot deal reliably with non-SVG XML: XML from totally unknown namespaces is rejected, but there may be scripting elements in the Illustrator namespaces (for example) that I don't know about.
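The attachment itself is not reproduced in this report. As a rough sketch of the approach described above (not the attached script), the following strips SVG script elements with the standard-library ElementTree parser and rejects elements from namespaces outside an allow-list. The allow-list contents and the function names `namespace_of` and `strip_scripts` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Illustrative allow-list (an assumption, not the attachment's actual list).
KNOWN_NAMESPACES = {
    SVG_NS,
    "http://www.w3.org/1999/xlink",
    "http://www.inkscape.org/namespaces/inkscape",
    "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://purl.org/dc/elements/1.1/",
}

# Serialize SVG elements without a generated "ns0:" prefix.
ET.register_namespace("", SVG_NS)


def namespace_of(tag):
    """Return the namespace URI from an ElementTree tag like '{uri}name'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def strip_scripts(svg_text):
    """Return (cleaned SVG text, number of <script> elements removed).

    Raises ValueError if any element is in a namespace outside the allow-list.
    """
    root = ET.fromstring(svg_text)
    for element in root.iter():
        ns = namespace_of(element.tag)
        if ns not in KNOWN_NAMESPACES:
            raise ValueError("unknown namespace: %r" % ns)

    removed = 0
    script_tag = "{%s}script" % SVG_NS
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == script_tag:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

A command-line wrapper could read stdin, write the cleaned text to stdout, and exit non-zero when `removed` is greater than zero or a `ValueError` is raised. This sketch only looks at script elements; it does not examine event-handler attributes or vendor-specific namespaces.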
Here's an idea how this could be used in conjunction with dms:

For each document ID that has not been tagged "malware-free" in dms:
    Retrieve the $file.svg of document ID from dms
    Copy $file.svg to $file.svg.orig
    Run script to remove javascript from $file.svg
    Insert ($file.svg, $file.svg.orig) into dms as new revision of document ID

Note that as of dms 0.12, the function to retrieve a given doc ID is not implemented, however the insert function exists. See submit_clipart script for syntax.
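The loop above might look roughly like this in Python. Since the dms retrieve call does not exist yet and its real API is not shown here, `InMemoryStore` and all of its method names are hypothetical stand-ins, and `sanitize` is whatever script-removing function is chosen. Tagging the document "malware-free" after the new revision is inserted is also an assumption; the steps above do not say when the tag is applied.

```python
class InMemoryStore:
    """Hypothetical stand-in for dms; these method names are NOT the dms API."""

    def __init__(self, documents):
        # documents maps doc_id -> (filename, svg_text)
        self.revisions = {
            doc_id: [{name: text}] for doc_id, (name, text) in documents.items()
        }
        self.tags = {doc_id: set() for doc_id in documents}

    def untagged(self, tag):
        """Return the IDs of documents that do not carry the given tag."""
        return sorted(d for d, tags in self.tags.items() if tag not in tags)

    def retrieve(self, doc_id):
        """Return (filename, text) of the .svg file in the latest revision."""
        latest = self.revisions[doc_id][-1]
        name = next(n for n in latest if n.endswith(".svg"))
        return name, latest[name]

    def insert_revision(self, doc_id, files):
        self.revisions[doc_id].append(dict(files))

    def tag(self, doc_id, tag):
        self.tags[doc_id].add(tag)


def sanitize_untagged(store, sanitize, tag="malware-free"):
    """Sanitize every untagged document, keeping the original as *.orig."""
    processed = []
    for doc_id in store.untagged(tag):
        name, original = store.retrieve(doc_id)
        store.insert_revision(
            doc_id, {name: sanitize(original), name + ".orig": original}
        )
        store.tag(doc_id, tag)  # assumption: tag once the clean revision exists
        processed.append(doc_id)
    return processed
```

Keeping the untouched file next to the sanitized one in the same revision means a reviewer can later compare the two without digging through history.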
Removing Javascript from images that contain it is likely to render them useless. I think that such images should probably simply be removed from the DMS (perhaps until they can be examined by hand). It would also be good if this screening, as well as a screening for portability, happened on initial upload.
We wouldn't want to automatically remove them from dms. Instead, the script should set their state to 'PROBLEM', attach a comment saying that they need to be reviewed manually, and add a keyword 'javascript'. Then, as reviewers have time, they can download and check the file, and if it looks ok, they can manually put it into the 'ACCEPTED' state with a note about what the javascript does. This way, we can ensure that only files with safe javascript are present in our packages, and via the javascript keyword we will have the ability to produce packages with all javascript-bearing SVGs excluded if we wish.
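A minimal sketch of that state handling follows. The `Record` class, its field names, and the initial 'NEW' state are hypothetical; only the 'PROBLEM' and 'ACCEPTED' states and the 'javascript' keyword come from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical dms record; the field names are illustrative only."""
    doc_id: int
    state: str = "NEW"  # assumed initial state
    keywords: set = field(default_factory=set)
    comments: list = field(default_factory=list)


def flag_for_review(record):
    """Mark a script-bearing file so that a human looks at it later."""
    record.state = "PROBLEM"
    record.keywords.add("javascript")
    record.comments.append("Contains JavaScript; needs manual review.")


def accept_after_review(record, note):
    """A reviewer accepts the file and notes what the script does."""
    record.state = "ACCEPTED"
    record.comments.append(note)


def packageable(records, allow_javascript=True):
    """Select accepted records, optionally excluding script-bearing ones."""
    return [
        r for r in records
        if r.state == "ACCEPTED"
        and (allow_javascript or "javascript" not in r.keywords)
    ]
```

Because the keyword survives acceptance, a script-free package can still be built later by calling `packageable(records, allow_javascript=False)`.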
For the script itself, it would not take long to create a more secure version based on a RELAX NG specification of SVG (there are several quasi-free ones out there), which would validate the SVG against a schema that does not permit scripting. At the same time it would check for bad SVG (things like rectangles without coordinates), use of other namespaces (Inkscape, Illustrator, and so on), and use of external resources (this could also be a security issue). However, I think it's better for the moment to work on getting some kind of validation as part of the library.

Upon reflection, I like the idea of having a "malware-free" flag, as this allows easy scanning of old images, manual sorting out of problem images, and so on. I still think a step in the upload where the uploader is asked "does this image look right?" is a good idea, and the malware scanning could happen at this time so that they know right away if there's some problem with their image.

So, I think this bug is waiting on the DMS: once the DMS is written, a quick SOAP hack ought to do the job; integrating the malware scanner into the upload script can happen any time after that.
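Full RELAX NG validation needs a schema and a validating library, neither of which is shown in this report. As a much smaller illustration of just one of the checks mentioned, use of external resources, here is a sketch that lists link targets pointing outside the file. The function name `external_references` is hypothetical, and treating `data:` URIs as internal is an assumption.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def external_references(svg_text):
    """Return href targets that point outside the document itself."""
    root = ET.fromstring(svg_text)
    found = []
    for element in root.iter():
        for attribute in (XLINK_HREF, "href"):
            target = element.get(attribute)
            # "#id" is a same-document reference; "data:" is embedded content.
            if target and not target.startswith(("#", "data:")):
                found.append(target)
    return found
```

This only inspects href attributes. References written as `url(...)` inside style attributes or stylesheets are not covered, which is one reason a schema-based validator would be the more thorough route.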
detect-script.py is now in CVS in the tools directory; it searches the files named on the command line for scripts and prints out a list of the files that contain scripts. It exits with non-zero status if it finds any scripts.
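The source of detect-script.py is not included in this report. A sketch of a scanner with the same described behaviour (print the offending files, return a non-zero status if there are any) could look like this; the function names are assumptions, and only SVG-namespace script elements are detected.

```python
import xml.etree.ElementTree as ET

SCRIPT_TAG = "{http://www.w3.org/2000/svg}script"


def files_with_scripts(paths):
    """Return the subset of paths whose SVG contains a <script> element."""
    flagged = []
    for path in paths:
        root = ET.parse(path).getroot()
        if root.find(".//" + SCRIPT_TAG) is not None:
            flagged.append(path)
    return flagged


def main(argv):
    """Print each script-bearing file; return 1 if any were found, else 0."""
    flagged = files_with_scripts(argv)
    for path in flagged:
        print(path)
    return 1 if flagged else 0
```

To use it as a command-line tool, call `sys.exit(main(sys.argv[1:]))` at the bottom of the file. The non-zero status makes it easy to hook into an upload or packaging script.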
I like Bryce's thinking in Comment 4. Rather than removing anything from DMS, just flag it so it's not included in packages (and perhaps not presented in the CGI browser interface unless the user specifically asks for such images to be included). Then if people have time to manually check them they can do so, and if they don't, they can sit in the DMS and wait until someone _does_ have time.
Closing all openclipart bugs as openclipart is now on launchpad, as per request from Jon Philips.