Bug 19449

Summary: Ridiculous number of duplicates
Product: openclipart.org
Reporter: Jeff D. Hanson <jhansonxi>
Component: clipart
Assignee: default user for a product <clipart>
Status: RESOLVED NOTOURBUG
QA Contact:
Severity: normal
Priority: medium
CC: esigra
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
Attachments: Listing of duplicate files

Description Jeff D. Hanson 2009-01-07 13:43:22 UTC
Created attachment 21767
Listing of duplicate files

I installed the Windows version of the library and noticed a lot of duplicates when adding them to the OpenOffice.org Gallery.  I then downloaded openclipart-0.18-full.tar.bz2 and did a duplicate check (on Ubuntu) and found thousands.  The commands I used are below and were run from the clipart directory:

find . -type f -exec md5sum '{}' \; >md5_listing.txt
sort md5_listing.txt | uniq -d -w32 | cut -c 1-32 >md5_duplicates.txt
grep -f md5_duplicates.txt md5_listing.txt | sort >duplicates_listing.txt

My report is attached.
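
For anyone reproducing the check, the three commands above could be condensed into a single pipeline. This is only a sketch, not the procedure used for the attached report: the function name list_duplicates is hypothetical, and it assumes GNU coreutils (md5sum, plus the -w and --all-repeated options of uniq). It prints every file whose MD5 checksum occurs more than once, with a blank line between groups.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): list files under
# the directory "$1" that share an MD5 checksum with at least one other file.
# Assumes GNU coreutils: md5sum, and uniq with -w / --all-repeated.
list_duplicates() {
    # Checksum every regular file; "+" batches many files per md5sum call.
    find "$1" -type f -exec md5sum '{}' + |
        # Sort so that identical checksums end up on adjacent lines.
        sort |
        # Compare only the first 32 characters (the MD5 hex digest) and
        # keep all lines of each repeated group, separated by a blank line.
        uniq -w32 --all-repeated=separate
}
```

A call such as `list_duplicates . >/tmp/duplicates_listing.txt` writes the result outside the scanned tree, so the listing file itself does not end up in the checksum run.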
Comment 1 Tollef Fog Heen 2010-08-18 03:24:14 UTC
Closing all openclipart bugs, as openclipart is now on Launchpad, per request from Jon Philips.
