Many SVGs are invalid XML, all should be validated on submission

I have discovered while writing Safari+SVG, that many of the SVG files on OpenClipart.org are in fact invalid xml. Two recent examples come to mind:

http://www.openclipart.org/incoming/jerusalem_cross_with_cr_01.svg

<?xml version="1" standalone="no"?> is invalid, only 1.0 and 1.1 are allowed.

As well as openclipart's own logo:

http://www.openclipart.org/logo/openclipartlibrary-logo-only-5colors.svg

the prefix "rdf" is never defined yet it's used.

My suggestion would be that the submission script should use either libTidy or validator.w3c.org to check every svg before allowing an author to submit.
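As a rough illustration of the kind of check a submission script could run for the two problems reported above, here is a minimal Python sketch. The function name check_svg and the regular expression are hypothetical and are not taken from any existing OpenClipart script. It assumes the file has already been decoded to text, checks the version in the XML declaration by hand, and then relies on the namespace-aware parser in Python's standard library to reject undeclared prefixes such as "rdf".

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML declaration at the very start of the text and captures
# the version string. Hypothetical helper pattern for this sketch only.
XML_DECL = re.compile(r'^<\?xml\s+version\s*=\s*(["\'])(.*?)\1[^?]*\?>')


def check_svg(text):
    """Return a list of human-readable problems found in SVG source text."""
    problems = []
    body = text

    m = XML_DECL.match(text)
    if m:
        if m.group(2) not in ("1.0", "1.1"):
            problems.append('bad XML version "%s"' % m.group(2))
        # Parse without the declaration so the version check above stays
        # independent of how the underlying parser treats it.
        body = text[m.end():]

    try:
        # ElementTree parses with namespaces enabled, so an undeclared
        # prefix makes it raise ParseError.
        ET.fromstring(body)
    except ET.ParseError as err:
        problems.append("not well-formed: %s" % err)

    return problems


if __name__ == "__main__":
    sample = ('<?xml version="1" standalone="no"?>\n'
              '<svg xmlns="http://www.w3.org/2000/svg">'
              '<metadata><rdf:RDF/></metadata></svg>')
    for problem in check_svg(sample):
        print(problem)
```

This only checks well-formedness and namespace declarations. It does not check that the file is conforming SVG.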
Please have a look at this wiki page to read more about the usage of the "rdf" prefix: http://openclipart.org/cgi-bin/wiki.pl?MetadataDiscussion
Eric Seidel writes:

> I have discovered while writing Safari+SVG, that many of
> the SVG files on OpenClipart.org are in fact invalid xml.

If you mean valid in the sense of the XML spec, then probably all of them are invalid. XML validity isn't really a useful concept for SVG with embedded RDF. The SVG files ought to be conforming SVG, however, which in particular means that they ought to be well-formed XML and ought to conform to the Namespaces in XML spec. There are currently hundreds that are not conforming SVG.

> <?xml version="1" standalone="no"?> is invalid, only 1.0 and 1.1 are allowed.

Yes, version="1" is wrong. This is a matter of well-formedness. In fact, I think only XML version 1.0 should be allowed for SVG, as this is what the SVG 1.0 and SVG 1.1 specs refer to. All the SVG files in release 0.17 that have an XML declaration specify XML version 1.0, so this problem does not appear to be widespread. (I haven't downloaded release 0.18 yet.)

> http://www.openclipart.org/logo/openclipartlibrary-logo-only-5colors.svg
>
> the prefix "rdf" is never defined yet it's used.

Yes, that's wrong too, because it doesn't conform to the Namespaces in XML spec. None of the SVG files in release 0.17 use undefined prefixes. There were some in a previous release, but they were fixed.

> My suggestion would be that the submission script should use
> either libTidy or validator.w3c.org to check every svg before
> allowing an author to submit.

I'm not familiar with libTidy, but validator.w3c.org checks for valid XML and so would reject everything. I don't know of any tool that checks for conforming SVG. My SVGscan script checks for various problems, but it wouldn't have spotted that version="1" (though I've added a test for that now). Andrew Archibald has suggested validating incoming files against a RELAX NG schema, but nobody has proposed a suitable schema yet.

Rather than just rejecting bad files, it would be better for the incoming script to fix them whenever possible. For example, it could set the XML version to 1.0, add xmlns="http://www.w3.org/2000/svg" to the root element if needed, change 'textpath' elements (an Inkscape 0.42 bug) to 'textPath', etc.
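The three fix-ups listed above could be sketched roughly as follows in Python. This is only an illustration, not the actual incoming script or SVGscan: the function name fix_svg is hypothetical, and the sketch assumes the file is already decoded to text, that the root element is written as an unprefixed <svg ...> tag, and that no attribute value on that tag contains a ">" character.

```python
import re

SVG_NS = "http://www.w3.org/2000/svg"


def fix_svg(text):
    """Apply a few mechanical repairs to SVG source text (sketch only)."""
    # 1. Force the version in the XML declaration, if present, to 1.0.
    text = re.sub(r'^(<\?xml\s+version\s*=\s*)(["\']).*?\2',
                  r'\g<1>"1.0"', text, count=1)

    # 2. Add the default SVG namespace to the first <svg> start tag
    #    when it carries no plain xmlns attribute.
    m = re.search(r'<svg\b[^>]*>', text)
    if m and not re.search(r'\sxmlns\s*=', m.group(0)):
        insert_at = m.start() + len('<svg')
        text = (text[:insert_at] + ' xmlns="%s"' % SVG_NS
                + text[insert_at:])

    # 3. Rename lower-case 'textpath' start and end tags to 'textPath'.
    text = re.sub(r'(</?)textpath\b', r'\1textPath', text)

    return text


if __name__ == "__main__":
    broken = ('<?xml version="1" standalone="no"?>\n'
              '<svg width="10"><text><textpath>hi</textpath></text></svg>')
    print(fix_svg(broken))
```

Working on the raw text rather than a parsed tree is a deliberate choice here, since a file with a bad XML declaration or a missing namespace may not parse at all. A real script would need more careful handling than these regular expressions provide.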
Good to know. The tidy that I was referring to is HTML Tidy, which is available in various incarnations, some of which respect XML, though probably none of them do so well: http://tidy.sourceforge.net/

It's good to hear that you have a special script for this. I guess the best thing to do with this bug is simply to fix the two SVGs I mentioned and close it. I'll check out svg_validate: http://search.cpan.org/~bryce/SVG-Metadata-0.20/scripts/svg_validate and post any further problems which I feel should be covered by that script as part of other bugs.

Thanks!
Actually, I tried using http://validator.w3.org/ with a couple of SVGs and had surprisingly good results. It seems to have trouble with xmlns: definitions, but is still worth at least looking at.
Encompassed by the new feature request: https://bugs.freedesktop.org/show_bug.cgi?id=8627
Mass reopen. The "LATER" resolution is lame, I'm deleting it. Consider LATER to have arrived.
Closing all openclipart bugs as openclipart is now on launchpad, as per request from Jon Philips.