[Clipart] Comma-separated keywords in metadata
Bryce Harrington
bryce at bryceharrington.com
Mon Jun 28 10:48:34 PDT 2004
On Mon, 28 Jun 2004, Jonadab the Unsightly One wrote:
> > The proper way to achieve this in RDF is with a 'rdf:Bag':
> >
> > <dc:subject>
> > <rdf:Bag>
> > <rdf:li>yield</rdf:li>
> > <rdf:li>sign</rdf:li>
> > <rdf:li>yield sign</rdf:li>
> > <rdf:li>street sign</rdf:li>
> > <rdf:li>traffic sign</rdf:li>
> > </rdf:Bag>
> > </dc:subject>
>
> It is possible to automate this conversion, assuming we don't have to
> manually check for commas being used in other ways in the subject.
With Perl, all such things are possible. ;-)
> (Even then, a command-line tool could prompt for each file:
>
> Image: yieldsign.svg
> Subject: yield, sign, yield sign, street sign, traffic sign
> Convert to rdf:Bag? (Y/N):
>
> Are there a lot of images with this issue, or just a couple?
By and large, *most* people did not include metadata, so there cannot
be a lot of images with this issue. Of those that did include metadata,
most used either the tool on the CC site (which doesn't have a field
for subject) or my svg_annotate (which also lacks the subject
capability). I don't know how many pieces of art included the subject,
but I did see that there are some. I think this is good, as it gives us
a starting point for the categorization work in this next release.
If we can get svg_annotate, the upload tool and Inkscape to support this
Bag-oriented approach, I expect we'll be able to take care of the
keywords for 99% of the contributions. For that last 1%, a script may
be worthwhile, but we can probably leave it at a lower priority. If we
start getting lots of clipart with keywords in the subject, we can look
into it then; it's likely that the submissions would use irregular
formats (commas, space-delimited, semicolons...) so they would need
special parser attention anyway.
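
For illustration only, here is a rough sketch (in Python rather than
Perl) of what such a script might do for the comma- and
semicolon-delimited cases: split the subject string and emit the
rdf:Bag form quoted above. The function names split_keywords and
subject_to_bag are made up for this example and are not part of
svg_annotate or any existing tool. Space-delimited subjects are
deliberately not handled, since a multi-word keyword like "yield sign"
can't be told apart from two separate keywords without a person
looking at it.

```python
import re
from xml.sax.saxutils import escape


def split_keywords(subject):
    """Split a free-form subject string on commas or semicolons.

    Hypothetical helper: trims whitespace, drops empty entries and
    exact duplicates, and keeps the original order.
    """
    seen = set()
    keywords = []
    for part in re.split(r"[,;]", subject):
        word = " ".join(part.split())  # trim and collapse inner whitespace
        if word and word not in seen:
            seen.add(word)
            keywords.append(word)
    return keywords


def subject_to_bag(subject):
    """Render the keywords as a dc:subject element holding an rdf:Bag."""
    lines = ["<dc:subject>", "  <rdf:Bag>"]
    for word in split_keywords(subject):
        # escape() protects &, < and > inside keyword text
        lines.append("    <rdf:li>%s</rdf:li>" % escape(word))
    lines += ["  </rdf:Bag>", "</dc:subject>"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(subject_to_bag("yield, sign, yield sign, street sign, traffic sign"))
```

Run against the yieldsign.svg subject from the example prompt, this
should give back the five rdf:li entries shown in the quoted Bag. A
real tool would still have to read and rewrite the metadata block in
the SVG itself, which this sketch leaves out.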
Bryce