[Clipart] clipart to cchost

momo momo at lumenstudio.net
Thu Jan 4 05:23:53 PST 2007

> momo wrote:
>> I'm not quite sure that this is the simplest way. In CCHOST there are 
>> still no thumbnails and no library browser (lots of thumbnails on one 
>> page), so chasing duplicates will be a very long and difficult task: 
>> it will require opening every single SVG file and remembering whether 
>> it resembles something already listed in the library.
>> I have already been in a situation like this in the everyday cleaning 
>> process I'm involved in, when an uploader (Machovka) reuploaded several 
>> duplicates of cliparts he uploaded a week before. I was lucky to remember 
>> that I already saw these cliparts before, and managed to find and delete 
>> the duplicates.
>> Unfortunately CCHOST doesn't have tools to easily locate duplicates 
>> (no thumbnails and no thumbnail browser).
>> This is actually one of the reasons why I wanted to clean the 0.18 
>> collection (and entries that came after this release) locally (on my 
>> computer) before importing it to CCHOST. Other reasons are:
>> - no need to reupload cleaned/improved files (uploading files to cchost 
>> is painfully slow and sometimes uploads even fail...)
>> - faster cleaning (no need to browse the uploaded 0.18 collection online, 
>> just browse it locally)
>> - less traffic on the server
> I think it is a killer to try to visually identify and clean duplicates 
> for thousands of images; it should be done in a scripted way: compare the 
> file sizes and maybe names (names are not reliable) and do a visual 
> comparison only for the files with matching file sizes and/or names.

Yes, you're right, but this is actually a task that can be done locally with 
the help of a duplicate finder (I use a freeware program called DupKiller 
for these tasks).
In the future, it would be great to implement in CCHOST a script that 
compares newly uploaded files to existing ones and, when it finds a possible 
duplicate (same filename and/or size), alerts the admins to check these 
files (new one vs. old).
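
To illustrate the kind of upload check I mean, here is a rough sketch in 
Python. It is not CCHOST code and does not use any CCHOST API; the function 
names build_index and possible_duplicates are made up for the example, and 
it assumes the existing library is simply a directory of files on disk.

```python
import os


def build_index(library_dir):
    """Map each existing file's path to a (lowercased name, size) pair."""
    index = {}
    for dirpath, _dirs, files in os.walk(library_dir):
        for name in files:
            path = os.path.join(dirpath, name)
            index[path] = (name.lower(), os.path.getsize(path))
    return index


def possible_duplicates(upload_path, index):
    """Return existing paths that share the upload's filename and/or size."""
    name = os.path.basename(upload_path).lower()
    size = os.path.getsize(upload_path)
    return sorted(
        path
        for path, (old_name, old_size) in index.items()
        if old_name == name or old_size == size
    )
```

A non-empty result only means "an admin should look at these"; a matching 
name or size is a hint, not proof that the files are the same.
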
Still, for the task of importing the 0.18 collection, I think it is much 
simpler to have the cleaning/enhancing job done locally by several people. 
After the duplicate search is done, each volunteer takes a part of the 
clipart to clean. Once every part is cleaned, it is all uploaded to CCHOST 
for people to use and enjoy.
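
For anyone without a duplicate finder at hand, the local duplicate search 
could be sketched roughly as below. This is only an illustration in Python, 
not how DupKiller works: it groups files by size first and then confirms 
candidates with a SHA-256 hash, so it only finds byte-identical files. The 
names file_digest and find_duplicates are made up for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, extension=".svg"):
    """Group byte-identical files under root; returns sorted lists of paths."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extension):
                path = os.path.join(dirpath, name)
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a byte-identical twin
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Re-saved or slightly edited copies of the same drawing will not be caught 
this way, so a visual check of the remaining files is still needed.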

