[Clipart] bulk upload

Roan Horning roan at horning.us
Fri Nov 23 18:19:17 PST 2007


Hi John,

I just downloaded the "Aaron Johnson.zip" file from your .Mac page. It 
looks like the svg files have embedded in them the information that 
is in the <image name>.txt files. Did you notice this in your clean up, 
or was this part of your clean up process? If the information is already 
in the files it would make it slightly easier to automate the process.
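
As a rough sketch of how a script might read that embedded information: 
the snippet below assumes the metadata is stored the way Inkscape usually 
writes it, as Dublin Core fields inside an RDF block. That layout, the 
namespace URIs, and the helper name read_svg_metadata are assumptions for 
illustration; the archive files would need to be checked first.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: Inkscape-style SVGs usually carry Dublin Core fields
# inside an RDF block. The files in the old archive may be laid out differently.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def read_svg_metadata(svg_text):
    """Return title, author and tags found in an SVG string (hypothetical helper)."""
    root = ET.fromstring(svg_text)
    # "{*}Work" matches a Work element in any namespace (needs Python 3.8+),
    # because the Creative Commons namespace URI has changed over the years.
    title = root.findtext(".//{*}Work/dc:title", default="", namespaces=NS)
    author = root.findtext(
        ".//{*}Work/dc:creator//dc:title", default="", namespaces=NS
    )
    tags = [
        li.text.strip()
        for li in root.findall(".//{*}Work/dc:subject//rdf:li", NS)
        if li.text
    ]
    return {"title": title.strip(), "author": author.strip(), "tags": tags}
```

If a file has no such block, the function returns empty strings and an 
empty tag list, so the script could fall back to the .txt file.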

We could minimize the pain by splitting the uploads into groups. For 
active artists with only a few images in the old archive, we could 
encourage them to do any touch-up work needed and re-upload the files 
themselves.

For active artists with bunches of work to upload, we batch upload the 
files and have them added to their accounts.

Inactive artists, we could create accounts for each of them with some 
sort of designator in the account name that they are unclaimed/inactive, 
and a notice in their profiles. The site librarians could have access to 
these accounts, so they can clean up the files as needed. If someone 
comes along to claim the work, we can either give them control of the 
account, or have them create a new account and upload the files to it.
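
The three groups above could be expressed as a small decision function. 
This is only a sketch: the function name, the group labels, and the 
cutoff of 5 files for "only a few images" are placeholders I made up, 
and the set of active artists would have to come from somewhere like the 
current member list.

```python
def plan_upload(author, file_count, active_authors, few_files=5):
    """Pick an upload route for one author in the old archive.

    few_files is an arbitrary placeholder threshold, not an agreed number.
    """
    if author not in active_authors:
        # Inactive: create a marked, unclaimed account the librarians can use.
        return "create-unclaimed-account"
    if file_count <= few_files:
        # Active with little work: ask them to touch up and upload themselves.
        return "self-upload"
    # Active with lots of work: batch upload into their existing account.
    return "batch-upload"


active = {"nicu", "Gerald_G"}
for name, count in [("nicu", 300), ("Gerald_G", 2), ("Aaron Johnson", 1)]:
    print(name, "->", plan_upload(name, count, active))
```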

You've done a great job of splitting the files up by author. We need a 
script that takes the list of inactive artists and loops through it, 
calling the code that creates a new account. Once we have the new 
accounts created and their associated directories, we can unzip the 
archives into the proper accounts. Then we need another script that can 
read the file information and put it into the right tables in the 
database. I'm happy to work on this. It would be great to have all the 
older work available to the new site, and a shame to have your efforts 
languish in limbo.
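
A minimal sketch of the first script, assuming one "<author>.zip" per 
artist as in John's archive. The create_account function is a stand-in 
for whatever the site's real account-creation code turns out to be, and 
the directory layout and the skip list (for people like Gerald who have 
already re-uploaded) are assumptions as well. The database step is left 
out because it depends on the table layout.

```python
import zipfile
from pathlib import Path


def create_account(name):
    """Stand-in for the site's real account-creation code (hypothetical)."""
    raise NotImplementedError("hook this up to the site's account code")


def import_inactive_artists(artists, archive_dir, accounts_dir,
                            make_account=create_account, skip=()):
    """Create an account per artist and unzip "<artist>.zip" into its directory.

    make_account must return the new account name; skip lists artists who
    have already re-uploaded their own work. Returns the accounts created.
    """
    created = []
    for artist in artists:
        if artist in skip:
            continue
        archive = Path(archive_dir) / f"{artist}.zip"
        if not archive.is_file():
            continue  # no archive found; leave this artist for manual review
        account = make_account(artist)
        target = Path(accounts_dir) / account
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        created.append(account)
    return created
```

For a dry run on a test setup, make_account can be any function that 
returns a name, for example one that adds an "unclaimed_" prefix.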

I think I have all the access that is needed, and can test everything on 
my own cchost setup, and the test server, before doing anything on the 
production site. We need a consensus on the best way to reflect that an 
artist's account has been automatically added to the site. Does anyone 
see any problems with this plan?

--Roan


John Olsen wrote:
>> ------------------------------
>>
>> Message: 7
>> Date: Fri, 23 Nov 2007 09:44:48 -0700
>> From: "Gerald Ganson" <Gerald.Ganson at rdc.ab.ca>
>> Subject: Re: [Clipart] clipart files
>> To: <clipart at lists.freedesktop.org>
>> Message-ID:
>> 	<5864D53DE837C64BBDB65FDBF3E7DE250460D1 at oceanus.RDCSRVCS.ADS>
>> Content-Type: text/plain;	charset="us-ascii"
>>
>> -----Original Message-----
>>
>> If we manage to put together all the files from both the old and the
>> new site, we may get to a very large size, so we may reconsider what a
>> release is and provide different packages. But before this, we really
>> must put all SVGs in one single place.
>> A feature which could help *a lot* would be batch import (personally I
>> am reluctant in re-uploading some hundreds of files manually, one by
>> one).
>>
>>
>> nicu :
>>
>> ---------
>>
>>
>> FYI - any mass import will now have to be capable of filtering out
>> specific submitters as well, since I've taken this opportunity to
>> clean, arrange, and fix some things in some of my older submissions,
>> now that I am getting more skills in SVG.
>>
>> Mine are almost completely re-uploaded to the new site, so it will be
>> best to filter mine out in a batch upload.
>>
>> Gerald_G
>>
>>     
>
>
> If someone has an idea how to bulk upload the old archive and retain  
> all the existing tags and authorship then I haven't seen it offered  
> up here.   The archive has sat there a long time without anything  
> being done to move it to the new site.  It doesn't seem trivial when  
> one thinks about it.  As we have the current site set up the author  
> would need to either be linked to an existing "person" on the current  
> site or have one generated based on the name in the author field (but  
> under a bulk upload it would be an unverified account).  This seems  
> like the kind of fuzzy logic not best done by an automated process.
>
> Given the situation I started doing some manual cleanup of the  
> archive.   I'm not a coder, just a design guy willing to put in some  
> volunteer time. Eventually I separated these into author files which  
> I have posted ( http://tinyurl.com/2bu2ue ).  I was encouraged to  
> spearhead the movement of the files to the new site by Jon.  Much  
> talk has gone on about this but no bulk solution offered.  Even if  
> one was offered that solves a lot of the issues with a bulk upload,  
> these files still need to be reviewed and cleaned up by some  
> librarian here.  As Gerald points out, a lot of the old files could  
> do with some cleanup, need better tags or document boundaries.  So a  
> lot of manual labor is going to be needed.  We librarians can upload  
> our own stuff and at the same time remove its unchecked status.   
> Anyone with admin privileges can help the work load by uploading  
> their old files and checking them in themselves.
>
> I understand that a few people like Nicu have a significant amount of  
> work to upload.  So did Gerald.  But most of the 400 or so different  
> authors there have very reasonable amounts of work.  The majority  
> being just 1-2 files. http://openclipart.org/wiki/Authorlist I  
> imagine many of these people are long gone and not on this list.  I  
> don't know what the plan is for those people, but at some point maybe  
> we will have a bulk upload option and just place anything that hasn't  
> been uploaded by current users.
>
> If we use the Roadmap and track who has uploaded their folder full of  
> files then those could easily be pulled out prior to any bulk  
> upload.  So I think that would help with Gerald's concern.
>
> Related to this, Jon can we set up an admin account that is "Public  
> Domain" or "Anonymous" so we can upload the files that have no clear  
> author?
>
> John Olsen
>
>
> _______________________________________________
> clipart mailing list
> clipart at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/clipart
>   



