[Clipart] SVG Metadata

Christopher Schmidt crschmidt at crschmidt.net
Fri Apr 8 04:55:42 PDT 2005


On Thu, Apr 07, 2005 at 10:15:56PM -0400, Jonadab the Unsightly One wrote:
> crschmidt at crschmidt.net (Christopher Schmidt) writes:
> 
> > Currently, the SVG metadata in SVG files on openclipart.org is
> > broken RDF, as well as being slightly broken in a number of ways:
> >
> > First, there is a snippet of RDF like the following. This is invalid:
> > <license rdf:resource="Public Domain">
> >   <dc:date>28</dc:date>
> > </license>
> >
> > This should be 
> > <license rdf:resource="urloflicense" />
> > <dc:date>date of copyright</dc:date>
> > Which would cause the RDF to be valid. In fact, there is already an
> > empty <dc:date> in the SVG file I'm currently looking at
> 
> I will look into that with my other changes.  I'll try to get parse to
> read it either way, and to_rdf to write it the correct way.  Actually,
> I think the dc:date field maybe was empty because of an unimplemented
> thing that I have already fixed (in my local copy) when I was working
> on some of the TODO comments.  For now I'll set it so that _date and
> _license_date default to one another, the way creator and owner
> already do.

What it comes down to is that there aren't two dates: there's just the
date the work was created, which is the same as the date of the license.
Once you change the above (broken) format to the one I listed, the
second dc:date adds nothing, because there's already one within the
<Work> tag.

> > Note that the URL for the PublicDomain license is:
> > http://web.resource.org/cc/PublicDomain , as per
> 
> The license-related code in to_rdf recognizes that URI as being the
> same as 'Public Domain', but I don't think it currently changes one to
> the other.  Should it?  Hmmm...  Seems like at least the rdf:about
> attribute of the License element should be the URI, as about
> attributes elsewhere in the RDF are used for URIs.

Yep, rdf:about or rdf:resource should always be relative or absolute
URIs. If you've got a literal, you need to switch to the URI.
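
As a rough illustration of that switch (in Python rather than the
project's own code), here is a minimal sketch that maps a license literal
to its URI before writing the element. The names LICENSE_URIS,
license_uri and license_element are hypothetical, and the only mapping
taken from this thread is the Public Domain one:

```python
from xml.sax.saxutils import quoteattr

# Hypothetical lookup table. Only the Public Domain URI comes from this thread.
LICENSE_URIS = {
    "public domain": "http://web.resource.org/cc/PublicDomain",
}


def license_uri(value):
    """Return a URI for a license value, or None if it cannot be mapped."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return value  # already a URI, keep it as submitted
    return LICENSE_URIS.get(value.lower())


def license_element(value):
    """Emit an empty <license> element, or nothing if no URI is known."""
    uri = license_uri(value)
    if uri is None:
        return ""  # never write a literal into rdf:resource
    return "<license rdf:resource=%s />" % quoteattr(uri)


print(license_element("Public Domain"))
```

Unknown literals are dropped here rather than written out, which is one
possible policy; keeping them in a separate plain-text field would be
another.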

> > Personally, I'd probably not put the subjects in a bag: the current
> > movement in RDF is to move away from explicit containers in favor of
> > multiple predicates with the same subject, but I'm not versed in SVG
> > metadata, so that might not be the common way to do it in the SVG
> > world.
> 
> Without a concrete reason, I don't know that it's necessary to change
> that.  Multiple subjects would work, but the bag works too, and I
> think we are not the only project using the metadata in this format.
> (In particular, I think Inkscape does also.)  If other projects were
> using it the other way, of course, we would consider changing to unify
> and standardize on one way of doing it, but otherwise, my feeling is,
> if the way we're doing it is not invalid, let's not change it.

Yeah, the primary problem is with querying against it, but that's mostly
a problem with query languages, not with your specific usage of it, so I
understand wanting to leave it how it is.
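
To make the difference concrete, here is a small Python sketch that
reads keywords from both shapes. It works at the XML level only (it is
not an RDF parser), and the two fragments and their keywords are made up
for illustration. The Bag form needs one extra hop through rdf:Bag and
rdf:li, which is the part that makes queries more awkward:

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative fragments, not copied from a real openclipart.org file.
BAG_FORM = """
<Work xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject>
    <rdf:Bag>
      <rdf:li>tree</rdf:li>
      <rdf:li>plant</rdf:li>
    </rdf:Bag>
  </dc:subject>
</Work>
"""

FLAT_FORM = """
<Work xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject>tree</dc:subject>
  <dc:subject>plant</dc:subject>
</Work>
"""


def subjects(xml_text):
    """Collect subject keywords from either the Bag form or the flat form."""
    root = ET.fromstring(xml_text)
    found = []
    for subj in root.findall(".//dc:subject", NS):
        items = subj.findall("rdf:Bag/rdf:li", NS)
        if items:
            # Bag form: the values sit one level down, inside the container.
            found.extend(li.text for li in items)
        elif subj.text and subj.text.strip():
            # Flat form: each dc:subject carries one value directly.
            found.append(subj.text.strip())
    return found


print(subjects(BAG_FORM))
print(subjects(FLAT_FORM))
```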

> > Third, the creator and rights holder have an rdf:about="" attribute
> > which is blank.
> 
> Those right now are always blank due to two of the TODO items, which
> I'm fixing up (so that they will not have to be blank now), but...
> 
> > This should either point to a URL of the creator and rights holder,
> > or the entire rdf:about="" should be removed. 
> 
> I could do that, too:
>    ($about_url) ? " rdf:about=\"$about_url\"" : ''
> Then if these data are not supplied, they'll be omitted entirely.

Yep. Otherwise, it's equivalent to saying "The creator of this document
is this document", since an empty rdf:about is a relative URI that
resolves to the document itself.
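
A quick Python sketch of both points, assuming a made-up base URI for
the SVG file: resolving an empty reference against the base gives back
the document's own URI, and a small helper (about_attr, a hypothetical
name mirroring the conditional quoted above) leaves the attribute out
when no URL was supplied:

```python
from urllib.parse import urljoin
from xml.sax.saxutils import quoteattr

# Made-up base URI, standing in for wherever the SVG file lives.
BASE = "http://example.org/clipart/tree.svg"


def resolve_about(base_uri, about):
    """Resolve an rdf:about value against the document's base URI."""
    return urljoin(base_uri, about)


def about_attr(about_url):
    """Return an rdf:about attribute, or nothing when no URL is known."""
    if not about_url:
        return ""
    return " rdf:about=%s" % quoteattr(about_url)


# An empty reference resolves to the document itself.
print(resolve_about(BASE, ""))

# With no URL the attribute is omitted entirely.
print("<Agent%s>" % about_attr(""))
print("<Agent%s>" % about_attr("http://example.org/people/jane"))
```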

> > If the rights are assigned to openclipart.org, this should be used
> > instead. 
> 
> Currently we are defaulting _publisher and _publisher_url to
> 'Open Clip Art Library' and 'http://www.openclipart.org/'
> respectively, but we are defaulting the owner to the creator.
> Although, since the items are placed into the public domain,
> ownership seems largely moot.  Still, our intention is to track
> that information as it was submitted to us, as much as possible.

Despite it being placed into the public domain, it still must have a
copyright holder: someone who is able to assign that copyright. So I
agree, it's important information to keep.

> > The dc:title of the <dc:rights> agent should be the person who
> > created the work, rather than "Public Domain" (I think, although I
> > may be acting as a lawyer there unintentionally.)
> 
> The dc:title subelement of the Agent subelement of the dc:rights
> element is set to _owner.  It is possible some images were submitted
> with "Public Domain" as the owner, but the way I read to_rdf, it is
> putting the owner there, if an owner is known.  (If no owner is
> specified, it defaults to the creator, as noted above.)

Okay, must have been a goofy submitter then :)

-- 
Christopher Schmidt