[Clipart] cchost and svg status update

Bryce Harrington bryce at bryceharrington.org
Thu Aug 17 13:21:08 PDT 2006

On Thu, Aug 17, 2006 at 07:04:08AM -0400, Roan Horning wrote:

Very nice work with the SVG support in ccHost!  It's great to see this
in play.  :-)

> I figured out how to work with cvs.freedesktop.org, but I've only worked
> with the experimental module to update getid3.  Are you using
> cvs.clipart.org to keep track of changes to the main site? I browsed
> through the cvs clipart structure using the web interface, but wasn't
> sure exactly what was what. I'm happy to update the cchost stuff on
> openclipart.org just not sure how yet.

The CVS clipart module is old and obsolete IIRC.  We'd used it early on
in the project but found it too cumbersome once we got more than a few
images.  (Maybe someone started using it again recently, but I don't
think so...)

> > I think the next most important step is to get our current collection
> > into ccHost. This would be best brought up on the cchost list (and cc'd
> > on this list). There are several ways to deal with this, including
> > developing basic import of content from an RSS or ATOM feed (since we
> > already have basic dumping capability, Victor and I discussed adding
> > import of RSS or ATOM for ccHost).
> >
> I've been thinking about how to get the current collection into cchost.

Will you be starting from the last official tarball?  If so, and if
ccHost can provide a changelog or diff against it, that would address
the request from some distros and users to be able to get an update
instead of having to re-download the whole tarball.

Also, if you do start from the last official tarball, it might be wise
to just go ahead and switch the site over to using ccHost at that point,
because then others can work on getting the current incoming queue
reviewed and imported as time permits.  This'd get OCAL up and running
on ccHost more quickly.  (Processing the incoming queue is going to be a
good bit of work, and someone may wish to review it before uploading it
into ccHost, since it may contain a lot of invalid files that could
break ccHost.)

> I suggest we use cchost's normal way of putting things in the people's
> folder. So we create default usernames from an artist's full name:
> Jack Smith -> Jack.Smith or Smith-Jack or jsmith etc. and enter whatever
> other contact information we have for them--particularly e-mail
> addresses, and generate a default password. Once we are ready we can
> send an individual welcome to the updated ocal email out to all our
> previous contributors, with their new username and password.  We'll have
> to figure out what the policy is for accounts with no email addresses.

It might be best to set aside contributions that don't have an email
address associated with them for now.  I'm not sure if the public domain
statement for those works is going to be legally certain enough if
there's no email address.

Also, it would be worthwhile when you send out those emails to plan on
tracking bounces; if the email isn't valid, then the same question
arises, although to a lesser degree.

Then later on when someone has a chance, they can go through and see
what can be salvaged (e.g., maybe the email was listed elsewhere, or can
be figured out from the author's name, or something).

If these questionable images *are* put into ccHost, make sure to at
least tag them, so they can be filtered out if someone wishes to only
look at 100% unquestionably valid clipart.
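
As a rough sketch of that triage (not ccHost code): the record layout, the
"needs_review" tag name and the function names below are all assumptions
made up for illustration.  It builds a default username in the Jack.Smith
style, avoids collisions, and tags rather than drops anything without an
email address.

```python
import re

# Hypothetical tag name for items whose provenance needs checking.
QUESTIONABLE_TAG = "needs_review"


def default_username(full_name):
    """'Jack Smith' -> 'Jack.Smith'; falls back to 'anonymous' if empty."""
    parts = re.findall(r"[^\W_]+", full_name)  # runs of letters/digits
    return ".".join(parts) or "anonymous"


def unique_username(full_name, taken):
    """Append 2, 3, ... until the name is not already in the `taken` set."""
    base = default_username(full_name)
    candidate, n = base, 2
    while candidate in taken:
        candidate = f"{base}{n}"
        n += 1
    taken.add(candidate)
    return candidate


def triage(contributions):
    """Split records into (importable, set_aside) by presence of an email."""
    importable, set_aside = [], []
    for record in contributions:
        email = (record.get("email") or "").strip()
        if email:
            importable.append(record)
        else:
            # Tag instead of dropping, so these can be filtered or salvaged.
            tags = sorted(set(record.get("tags", [])) | {QUESTIONABLE_TAG})
            set_aside.append(dict(record, tags=tags))
    return importable, set_aside
```

Addresses that bounce after the welcome mail goes out could be fed back
through the same tagging step by blanking the email field on those records.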

> For each piece of artwork, we assign tags based on its current keywords,
> plus keywords based on the path it currently resides.  For example,
> artwork in the  home / food / meats_and_eggs section
> (http://www.openclipart.org/cgi-bin/navigate/food/meats_and_eggs) would
> get the tags 'food' and 'meats_and_eggs' along with the other tags
> derived from the artwork's current keywords. The advantage to these
> "path" tags is we can use them to recreate the current directory
> structure when generating future openclipart releases (i.e. create
> directory structure /food/meats_and_eggs, search for tags
> food+meats_and_eggs, copy search results to directory), plus I think it
> is important to have a way to categorize and browse the collection from
> within an hierarchical structure--it complements searching.

Yup, that's definitely the right approach.
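
For what it's worth, the path-tag scheme could be sketched roughly like
this.  The dict fields ("file", "keywords", "path", "tags") and the
function names are invented for the example and say nothing about how
ccHost actually stores tags.

```python
def path_tags(category_path):
    """'food/meats_and_eggs' -> ['food', 'meats_and_eggs']."""
    return [part.strip() for part in category_path.split("/") if part.strip()]


def all_tags(artwork):
    """Existing keywords plus path tags, de-duplicated, order preserved."""
    combined = list(artwork.get("keywords", [])) + path_tags(artwork["path"])
    return list(dict.fromkeys(combined))


def release_layout(artworks, category_paths):
    """For each directory, list the files whose tags include every path tag."""
    layout = {}
    for category in category_paths:
        wanted = set(path_tags(category))
        layout[category] = [
            art["file"] for art in artworks if wanted <= set(art["tags"])
        ]
    return layout
```

One thing to watch: an ordinary keyword that happens to match a directory
name (say, 'food' on an image filed elsewhere) would also match here, so
the path tags may need a prefix or some other marker to keep the release
layout exact.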

> During the transition, we should make sure the Creative Commons rdf is
> updated for each artwork, in particular the Public Domain Dedication.
> I'm assuming that while not all of the artwork on the site currently
> says it is in the Public Domain when viewing a web page, that all are.

Hmm, if there are any that have questionable public domain dedications,
these may also be worth setting aside for more careful review later on.

Basically, I think the project will benefit most if we start the
transition from a collection that is as close to 100% unquestionably
valid as possible.  It's going to be challenging enough to manage just
the valid stuff, so being able to set aside questionable stuff will save
time.  As well, I suspect ccHost's admin tools may not be sufficiently up
to snuff to deal with questionable images, and it'd be better to keep
them aside either until those tools are proven reliable or until someone
has a chance to manually review/fix the stuff.  It'd suck to have a bunch
of broken images get into the pool and out to users; there is more than
enough clipart already, so better to have a smaller collection of 100%
good stuff than a larger collection tainted by questionable items.

