[Clipart] SVG::Metadata Release 0.08

Mon Jun 28 11:13:45 PDT 2004

On Mon, 28 Jun 2004, Jonadab the Unsightly One wrote:
> I need to clarify something: assuming the following workflow:
> 
> 1.  Create a new SVG::Metadata object
> 2.  Parse an SVG image with it, creating a metadata object

Parsing just fills in the object you created in #1; it does not create a
new object.  I thought about doing the parsing as part of the new()
operation, but splitting it out allows SVG::Metadata to be used where
you are not parsing any documents (such as with svg_annotate).

> 3.  Remove existing RDF from the image
> 4.  Call to_rdf() on the metadata object
> 5.  Add the resulting RDF back into the SVG image

*Nod*

> Is this an essentially lossless operation, assuming we don't care
> about meta-metadata things like whitespace in the XML and where the
> metadata infos are stored within the overall XML document structure,
> but only the actual metadata themselves?

No, it can be lossy, because it deletes the existing RDF and replaces
it.  By and large *most* of the common RDF syntax is implemented, so it
will preserve just about everything OCAL cares about, but there are
still other tags that might be in use.  I have been adding them to the
RDF as I notice people using them.
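
To illustrate why delete-and-replace can lose data, here is a minimal
sketch in Python.  It is not SVG::Metadata's code: the KNOWN_FIELDS
list, the function names and the sample RDF are all made up for the
example, and the RDF layout is simplified.  Anything the parser does
not recognise (dc:coverage here) is missing from the regenerated RDF.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical list of the only fields this toy parser understands.
KNOWN_FIELDS = ("title", "description")


def parse_known(rdf_xml):
    """Copy the known fields into a dict; everything else is ignored."""
    root = ET.fromstring(rdf_xml)
    meta = {}
    for name in KNOWN_FIELDS:
        node = root.find(".//{%s}%s" % (DC, name))
        if node is not None:
            meta[name] = node.text
    return meta


def to_rdf(meta):
    """Regenerate RDF from the dict alone, replacing whatever was there."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element("{%s}RDF" % RDF)
    desc = ET.SubElement(root, "{%s}Description" % RDF)
    for name, value in meta.items():
        ET.SubElement(desc, "{%s}%s" % (DC, name)).text = value
    return ET.tostring(root, encoding="unicode")


# Sample input with one tag (dc:coverage) outside KNOWN_FIELDS.
ORIGINAL = (
    '<rdf:RDF xmlns:rdf="%s" xmlns:dc="%s">'
    "<rdf:Description>"
    "<dc:title>Sun</dc:title>"
    "<dc:description>A yellow sun</dc:description>"
    "<dc:coverage>Worldwide</dc:coverage>"
    "</rdf:Description></rdf:RDF>"
) % (RDF, DC)

meta = parse_known(ORIGINAL)
regenerated = to_rdf(meta)
```

The title and description survive the round trip; the coverage tag does
not, because it never made it into the object in the first place.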

In future releases, as I have time, I want to rework the to_rdf()
routine so it emits RDF only for sections that have defined values, so
we don't bloat SVG files with a lot of rarely used tags.  Longer term, I
also want to experiment with storing the loaded RDF as an XML::Twig
tree and using that module to write it out.  In that case, it should be
much closer to being a lossless operation.
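
The "only emit what is defined" idea can be sketched roughly like this,
again in Python and again with made-up names (emit_defined_only and the
sample field values are assumptions for illustration, not the planned
to_rdf() implementation):

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def emit_defined_only(meta):
    """Build RDF containing an element only for fields that have a value."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element("{%s}RDF" % RDF)
    desc = ET.SubElement(root, "{%s}Description" % RDF)
    for name, value in meta.items():
        if value:  # skip None and empty strings
            ET.SubElement(desc, "{%s}%s" % (DC, name)).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical metadata where two fields were never filled in.
fields = {"title": "Sun", "publisher": None, "rights": "", "language": "en"}
rdf = emit_defined_only(fields)
```

Here the output carries dc:title and dc:language but no empty
dc:publisher or dc:rights elements.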

> My upload tool assumes that this is a lossless operation.  (It makes
> adjustments to the metadata between steps 2 and 3, but the adjustments
> are deliberate.)  Is that okay?
>
> If this is a problem, then I need to rework the upload script.

Yes, that's fine.  Currently, since few submitters include the metadata,
this will be a significant improvement, even if it is not perfectly
lossless yet.  When I get the features mentioned above implemented, the
issue will diminish further, probably without any change needed in your
script.

> If it's not, I'm thinking of creating a web tool for adding keywords
> to already-uploaded images (querying them by keyword, with "unsorted"
> being the obvious place to start, and then presenting the user with a
> form for adding more specific keywords to some or all of the results).
> Put that together with a Wiki page documenting our set of keywords and
> what they mean, and we'll have the beginnings of a categorization
> mechanism.  We'll have to add authority control later, but that can be
> retrofitted.

Agreed, this is a good step to take.  I've got some routines in
SVG::Metadata for keyword handling, but they're not hooked up in parse()
and to_rdf() yet.  Once they are, you could use those for doing this
work. 

Bryce