Bug 19449 - Ridiculous number of duplicates
Status: RESOLVED NOTOURBUG
Product: openclipart.org
Classification: Unclassified
Component: clipart
Version: unspecified
Hardware: Other All
Importance: medium normal
Assigned To: default user for a product
Depends on:
Blocks:
Reported: 2009-01-07 13:43 UTC by Jeff D. Hanson
Modified: 2010-08-18 03:24 UTC

See Also:


Attachments
Listing of duplicate files (669.95 KB, text/plain)
2009-01-07 13:43 UTC, Jeff D. Hanson

Description Jeff D. Hanson 2009-01-07 13:43:22 UTC
Created attachment 21767: Listing of duplicate files

I installed the Windows version of the library and noticed many duplicate files while adding the clipart to the OpenOffice.org Gallery. I then downloaded openclipart-0.18-full.tar.bz2, ran a duplicate check on Ubuntu, and found thousands of duplicates. The commands I used, run from the clipart directory, are below:

find . -type f -exec md5sum '{}' \; >md5_listing.txt
sort md5_listing.txt | uniq -d -w32 | cut -c 1-32 >md5_duplicates.txt
grep -f md5_duplicates.txt md5_listing.txt | sort >duplicates_listing.txt

My report is attached.
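
For anyone repeating the check, the three steps above can probably be collapsed into a single pipeline. This is only a rough sketch, assuming GNU coreutils (md5sum, and a uniq that supports --all-repeated); the function name list_duplicates is made up for illustration and is not part of the library or of the report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original report): print groups of
# byte-identical files under a directory, one group per MD5 checksum,
# with a blank line between groups. Assumes GNU find, md5sum and uniq.
list_duplicates() {
    local dir="${1:-.}"
    # Hash every regular file; '+' passes many files to each md5sum call.
    find "$dir" -type f -exec md5sum '{}' + |
        # Sorting puts lines with the same checksum next to each other.
        sort |
        # Compare only the 32-character checksum and keep every line whose
        # checksum occurs more than once, separating groups by a blank line.
        uniq -w32 --all-repeated=separate
}

# Example (output path is an assumption):
#   list_duplicates . > /tmp/duplicates_listing.txt
```

Writing the listing to a path outside the scanned directory keeps the listing file itself from being hashed along with the clipart.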
Comment 1 Tollef Fog Heen 2010-08-18 03:24:14 UTC
Closing all openclipart bugs, as openclipart is now on Launchpad, per request from Jon Philips.