Bug 3354 - FEATURE REQUEST: no scanning for malware
Status: RESOLVED NOTOURBUG
Product: openclipart.org
Classification: Unclassified
Component: tools
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assigned To: default user for a product
Reported: 2005-05-20 11:10 UTC by Andrew Archibald
Modified: 2010-08-18 03:24 UTC (History)

Attachments
python script to remove javascript from SVG files (4.76 KB, text/plain)
2005-05-20 11:12 UTC, Andrew Archibald

Description Andrew Archibald 2005-05-20 11:10:12 UTC
SVG can contain javascript; few viewers currently execute this script, and all
(?) currently run it in an untrusted environment even if the file is loaded from
a local filesystem, but we can expect this to change as useful scripts appear in
SVG files.  In any case there are security holes in the script-executing
viewers. Thus it is possible for SVG to contain malware.

Currently, OCAL happily stores and redistributes SVG without any kind of
verification method, manual or automatic, to check for malware in SVG.

It is possible to write a script which simply rejects any script-containing
image; attached is a script which does so, although it is limited by the
presence of non-SVG XML in SVG files (such as inkscape-specific XML, metadata,
and Illustrator-specific XML, none of which can be reliably sanitized).
Comment 1 Andrew Archibald 2005-05-20 11:12:45 UTC
Created attachment 2726
python script to remove javascript from SVG files

This python program takes an SVG file on stdin and reports success if it is
free of script elements and failure if it has some; it emits an SVG file with
scripts removed on stdout.

It cannot deal reliably with non-SVG XML; XML from totally unknown namespaces
is rejected, but there may be scripting elements in the Illustrator namespaces
(for example) that I don't know about.
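For readers without the attachment, here is a minimal sketch of the approach described above, not the attached script itself. It removes SVG script elements and rejects elements from namespaces it does not recognise. The names strip_scripts and KNOWN_NAMESPACES and the namespace list are assumptions made for illustration, and event-handler attributes such as onload are not handled here.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Namespaces this sketch accepts. The list is an assumption for
# illustration; it is not taken from the attached script.
KNOWN_NAMESPACES = {
    SVG_NS,
    "http://www.inkscape.org/namespaces/inkscape",
    "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://purl.org/dc/elements/1.1/",
}


def namespace_of(tag):
    """Return the namespace URI of an ElementTree '{uri}local' tag."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def strip_scripts(svg_text):
    """Return (cleaned_svg, number_of_scripts_removed).

    Raises ValueError if any element is in an unrecognised namespace.
    """
    root = ET.fromstring(svg_text)
    for elem in root.iter():
        ns = namespace_of(elem.tag)
        if ns not in KNOWN_NAMESPACES:
            raise ValueError(f"unknown namespace: {ns!r}")

    removed = 0
    script_tag = f"{{{SVG_NS}}}script"
    # Snapshot the element list first, since the tree is modified below.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == script_tag:
                parent.remove(child)
                removed += 1

    # Serialise SVG elements without an 'ns0:' prefix.
    ET.register_namespace("", SVG_NS)
    return ET.tostring(root, encoding="unicode"), removed


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<script>alert(1)</script><rect width="1" height="1"/></svg>'
)
cleaned, removed = strip_scripts(sample)
print(removed)
print(cleaned)
```

A caller could map removed > 0 to a failure exit status, which matches the success/failure reporting described above.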
Comment 2 Bryce Harrington 2005-05-20 14:00:49 UTC
Here's an idea how this could be used in conjunction with dms: 
 
For each document ID that has not been tagged "malware-free" in dms:
    Retrieve $file.svg for the document ID from dms
    Copy $file.svg to $file.svg.orig
    Run the script to remove javascript from $file.svg
    Insert ($file.svg, $file.svg.orig) into dms as a new revision of the document ID
 
Note that as of dms 0.12, the function to retrieve a given doc ID is not 
implemented, however the insert function exists.  See submit_clipart script 
for syntax. 
Comment 3 Andrew Archibald 2005-05-20 14:37:43 UTC
Removing Javascript from images that contain it is likely to render them
useless.  I think that such images should probably simply be removed from the
DMS (perhaps until they can be examined by hand).  

It would also be good if this screening, as well as a screening for portability,
happened on initial upload. 
Comment 4 Bryce Harrington 2005-05-20 14:47:59 UTC
We wouldn't want to automatically remove them from dms.  Instead the script
should set their state to 'PROBLEM' with a comment attached that they need
to be reviewed manually, and add a keyword 'javascript'.  Then, as reviewers
have time they can download and check the file, and if it looks ok, they can
manually put it into 'ACCEPTED' state with a note about what the javascript
does.
 
This way, we can ensure only files with safe javascript are present in our
packages, and via the javascript keyword we will have the ability to produce
packages with all javascript-bearing SVGs excluded if we wish.
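The flagging workflow above could look roughly like the sketch below. It does not use the real dms interface: the Record class, its fields, the 'NEW' initial state, and all function names are hypothetical stand-ins. Only the 'PROBLEM' and 'ACCEPTED' states and the 'javascript' keyword come from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a dms document record."""
    doc_id: int
    filename: str
    state: str = "NEW"  # assumed initial state, not from dms
    keywords: set = field(default_factory=set)
    comments: list = field(default_factory=list)


def flag_for_review(record):
    """Mark a script-bearing file as needing manual review."""
    record.state = "PROBLEM"
    record.keywords.add("javascript")
    record.comments.append("Contains javascript; needs manual review.")


def accept_after_review(record, note):
    """A reviewer accepts the file and notes what the javascript does."""
    record.state = "ACCEPTED"
    record.comments.append(note)


def package_contents(records, include_javascript=True):
    """List accepted files, optionally leaving out javascript-bearing ones."""
    return [
        r.filename
        for r in records
        if r.state == "ACCEPTED"
        and (include_javascript or "javascript" not in r.keywords)
    ]


plain = Record(1, "tree.svg", state="ACCEPTED")
scripted = Record(2, "clock.svg")
flag_for_review(scripted)
before_review = package_contents([plain, scripted])

accept_after_review(scripted, "Script only animates the clock hands.")
after_review = package_contents([plain, scripted])
without_js = package_contents([plain, scripted], include_javascript=False)
print(before_review, after_review, without_js)
```

Keeping the 'javascript' keyword after acceptance is what makes the script-free package possible: the state records the review outcome, the keyword records what the file contains.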
Comment 5 Andrew Archibald 2005-05-20 20:22:24 UTC
For the script itself, it would not take long to create a more secure version
based on a RELAX NG specification of SVG (there are several quasi-free ones out
there), validating the SVG against a schema that does not permit scripting.
At the same time it would check for bad SVG (things like rectangles without
coordinates), use of other namespaces (Inkscape, Illustrator, and so on), and
use of external resources (this could also be a security issue).
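As a rough illustration of two of those checks, the sketch below scans for references to external resources and for event-handler attributes. This is not RELAX NG validation, just a simpler stand-in using the standard library. The function name find_problems is an assumption, and treating every href that is not a same-document fragment as external (including data: URIs) is a policy choice made only for this sketch. URLs inside CSS are not examined.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def find_problems(svg_text):
    """Return a list of (kind, detail) tuples for suspicious attributes."""
    problems = []
    for elem in ET.fromstring(svg_text).iter():
        for name, value in elem.attrib.items():
            if name in (XLINK_HREF, "href"):
                # Anything other than '#fragment' points outside the document.
                if not value.startswith("#"):
                    problems.append(("external-reference", value))
            elif "}" not in name and name.startswith("on"):
                # Un-namespaced on* attributes (onload, onclick, ...) carry script.
                problems.append(("event-handler", name))
    return problems


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" onload="init()">'
    '<image xlink:href="http://example.com/x.png"/>'
    '<use xlink:href="#shape"/></svg>'
)
problems = find_problems(sample)
for kind, detail in problems:
    print(kind, detail)
```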

However, I think it's better for the moment to work on getting some kind of
validation as part of the library. Upon reflection, I like the idea of having a
"malware-free" flag, as this allows easy scanning of old images, manual sorting
out of problem images, and so on.  I still think a step in the upload where the
uploader is asked "does this image look right?" is a good idea, and the
malware-scanning could happen at this time so that they know right away if
there's some problem with their image. 

So, I think this bug is waiting on the DMS: once the DMS is written, a quick
SOAP hack ought to do the job; integrating the malware scanner into the upload
script can happen any time after that.
Comment 6 Andrew Archibald 2005-05-20 23:01:44 UTC
detect-script.py is now in CVS in the tools directory; it searches the files
named on the command line for scripts and prints a list of the files that
contain them.  It exits with non-zero status if it finds any scripts.
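The behaviour described above can be sketched as follows. This is not the detect-script.py in CVS, only an illustration of the same interface. The function names are assumptions, as is the choice to report unparseable files alongside script-bearing ones.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

SCRIPT_TAG = "{http://www.w3.org/2000/svg}script"


def contains_script(path):
    """True if the SVG file at path has at least one script element."""
    try:
        tree = ET.parse(path)
    except ET.ParseError:
        # Assumption: a file that cannot be parsed is reported, not skipped.
        return True
    return any(elem.tag == SCRIPT_TAG for elem in tree.iter())


def main(paths):
    """Print each file that contains scripts; return the exit status."""
    flagged = [p for p in paths if contains_script(p)]
    for p in flagged:
        print(p)
    return 1 if flagged else 0


# Demonstration on two throwaway files. A command-line script would
# instead end with:  import sys; sys.exit(main(sys.argv[1:]))
with tempfile.TemporaryDirectory() as tmp:
    clean = os.path.join(tmp, "clean.svg")
    dirty = os.path.join(tmp, "dirty.svg")
    with open(clean, "w") as f:
        f.write('<svg xmlns="http://www.w3.org/2000/svg"><rect/></svg>')
    with open(dirty, "w") as f:
        f.write('<svg xmlns="http://www.w3.org/2000/svg">'
                '<script>alert(1)</script></svg>')
    status_clean = main([clean])
    status_mixed = main([clean, dirty])
```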
Comment 7 Nathan Eady 2005-05-23 11:28:59 UTC
I like Bryce's thinking in Comment 4.  Rather than removing anything from DMS,
just flag it so it's not included in packages (and perhaps not presented in the
CGI browser interface unless the user specifically asks for such images to be
included).  Then if people have time to manually check them they can do so, and
if they don't, they can sit in the DMS and wait until someone _does_ have time.
Comment 8 Tollef Fog Heen 2010-08-18 03:24:02 UTC
Closing all openclipart bugs as openclipart is now on launchpad, as per request from Jon Philips.