[Clipart] XML hierarchy definitions

Bryce Harrington bryce at bryceharrington.com
Wed Sep 29 23:18:41 PDT 2004


On Mon, 27 Sep 2004, Jonadab the Unsightly One wrote:
> Bryce Harrington <bryce at bryceharrington.com> writes:
> 
> > "Unwarnock"?
> 
> Warnocking is the opposite of bikeshedding.  Bikeshedding is when
> something is simple enough and unimportant enough that as soon as
> someone mentions it everyone and his brother trots out an opinion and
> you get a long thread:  "what color should we paint the bike shed".
> Warnocking is when something is important, but someone brings it up
> and everyone just waits for someone else to say something, and so it
> promptly falls off the radar.
> 
> Unwarnocking is when you bring a warnocked topic back up and talk
> about it.

Huh.  What's the etymology?  Neither Google nor Wikipedia recognizes the
word.

> > What we're going to need is a routine that takes this xml hierarchy
> > and pulls the corresponding items from the dms into a matching
> > directory structure.  
> 
> Directory structure?  I thought we wanted the various tools (browse
> and so on) to read from the DMS directly?

Correct.  For the monthly release creation tool, we need a routine that
reads from the DMS directly and uses the xml hierarchy to produce a
directory structure that can be tarballed.

> Or maybe you meant just a data structure that's like a directory tree,
> in that it's hierarchical?

No, I meant literally a hierarchical directory structure.

> > This should also include options for specifying what to pull - for
> > instance, whether to pull just the svg's, or to pull thumbnails too,
> > and if so, which size to grab.
> 
> Right.  So as arguments you're going to feed it an XML hierarchy (as a
> string most likely) and some options (in a hash probably).  And it's
> going to return...  a nested data structure whose leaf nodes are
> handles or magic tokens that can be used to retrieve files from the
> DMS, is that about right?

That's a step in the right direction, but we need to flatten to
directories and files.

> So the calling code would do something like this...
> 
> my $stuff = whateverwecalltheroutine( $XML_heirarchy_en,
>    { fetch_SVG => 1,
>      fetch_PNG => 0,
>      include_thumbnails => [ 64, 128 ],
>      include_metadata   => [ 'title' ] });

Yes

> And then $stuff might look something like this...
> 
> [
>   { categoryname  => 'Animals',
>     subcategories => [
>                       { categoryname => 'Mammals',
>                         images       => [
>                                           { title => 'Gorilla',
>                                             token => 'id12984',
>                                             thumbnails => [ 64 => 'id12986',
>                                                             128 => 'id12987' ] },
>                                           # ... other mammals go here
>                                         ],
>                         # If Mammals had any subcategories they'd go here.
>                       },
>                       # Other Animals subcategories go here.
>                      ],
>     images        => [ # Images loose in the Animals category go here.
>                      ],
>   }
> ]

Not sure about this structure...  We may need something a bit closer to
a straight hierarchy.  
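
As a rough sketch of what flattening could look like, assuming the
nested structure quoted above: the routine name flatten_hierarchy, the
thumbnails/<size>/ layout, and the use of titles as file names are all
made up for illustration, not anything the dms provides.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the nested category list and return a flat
# hash mapping relative file paths to DMS tokens.
sub flatten_hierarchy {
    my ($categories, $prefix) = @_;
    $prefix = '' unless defined $prefix;
    my %files;
    for my $cat (@{ $categories || [] }) {
        my $dir = $prefix . $cat->{categoryname};
        for my $img (@{ $cat->{images} || [] }) {
            # Assumed naming: the image title doubles as the file name.
            $files{"$dir/$img->{title}.svg"} = $img->{token};
            # Thumbnails arrive as a flat list of size => token pairs.
            my %thumbs = @{ $img->{thumbnails} || [] };
            for my $size (keys %thumbs) {
                $files{"$dir/thumbnails/$size/$img->{title}.png"} = $thumbs{$size};
            }
        }
        # Recurse into subcategories, extending the directory prefix.
        my $sub = flatten_hierarchy($cat->{subcategories}, "$dir/");
        @files{ keys %$sub } = values %$sub;
    }
    return \%files;
}

# Sample input, trimmed from the structure quoted above.
my $stuff = [
    { categoryname  => 'Animals',
      subcategories => [
          { categoryname => 'Mammals',
            images       => [
                { title => 'Gorilla', token => 'id12984',
                  thumbnails => [ 64 => 'id12986', 128 => 'id12987' ] },
            ],
          },
      ],
      images => [],
    },
];

my $files = flatten_hierarchy($stuff);
print "$_ => $files->{$_}\n" for sort keys %$files;
```

The result is a plain path => token map, so the release tool would only
need to create each directory, fetch each token from the dms, and write
the file before tarballing.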

> > Possibly a perl module already exists that does some of this...
> > Anyway, maybe it's too advanced for us right now, but if you think
> > you could hack something into existence, I could help out with the
> > bits to pull items from the dms.
> 
> I haven't really looked at the DMS in any depth yet.  Back when I
> wrote up that sample XML, I was still thinking about a browse script
> looking through all the SVG images stored in a single directory to see
> which ones have certain keywords, and presenting them in the
> categories that have those keywords in the XML.  That obviously won't
> scale well, though.  The DMS is the better way to go.  It seems like
> your work on that is turning out to be the real heavy lifting of the
> project.  Some more of us should probably look at your DMS code and
> see if we can help out with that too, because long-term I suspect the
> DMS may be the bulk of the programming work.  The web interface (which
> is most of what I've been working on, and a couple of others too) is
> the easy part, ultimately.

Yep, one of the reasons I've focused so exclusively on the dms is
because it seems to be the keystone for making everything else work
smoothly.  

The core idea here is to establish a modular distinction between
"interface logic" and "business logic".  Ideally, if your business logic
is encapsulated behind a good API, then you can easily create PHP, Perl,
cmdline, GUI, email, etc. interfaces to it with very little overlap or
duplication of effort.  I'd like it to be possible for people to
interact with the Open Clip Art Library using a rich set of commandline
tools, while at the same time making it convenient for someone to create
a spiffy KDE or GNOME GUI program to navigate/add clipart.

If done right, the dms's code should be fairly simple and
straightforward; it should simply provide the base atomic operations for
interacting with the documents.  The programming work is not so much
achieving a large number of lines of code, but rather making sure the
limited lines of code in it are as close to perfect as possible.
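
To illustrate the split, here is a minimal sketch; it is not the actual
dms API.  The package name Sketch::DocStore and its three operations
are hypothetical, and the store lives in memory rather than behind a
daemon.  The point is only that the two "interfaces" do nothing but
format output on top of the same small set of operations.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the "business logic": a few atomic
# operations on documents, held in memory for illustration only.
package Sketch::DocStore;

sub new {
    my ($class) = @_;
    return bless { docs => {}, next_id => 1 }, $class;
}

sub add_document {
    my ($self, $name, $content) = @_;
    my $id = 'id' . $self->{next_id}++;
    $self->{docs}{$id} = { name => $name, content => $content };
    return $id;
}

sub get_document {
    my ($self, $id) = @_;
    return $self->{docs}{$id};
}

sub list_documents {
    my ($self) = @_;
    return sort keys %{ $self->{docs} };
}

package main;

# Two thin "interfaces" that only format output and share the store.
sub cmdline_listing {
    my ($store) = @_;
    return join "\n",
        map { "$_\t" . $store->get_document($_)->{name} } $store->list_documents;
}

sub html_listing {
    my ($store) = @_;
    my @items = map { "<li>" . $store->get_document($_)->{name} . "</li>" }
        $store->list_documents;
    return "<ul>" . join('', @items) . "</ul>";
}

my $store = Sketch::DocStore->new;
$store->add_document('gorilla.svg', '<svg/>');
print cmdline_listing($store), "\n";
print html_listing($store), "\n";
```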

And yes, the dms may end up totally gutting the web interfaces, and
requiring some rewrites, but in a good way! - the interfaces can focus
on interfacy type things, and cease worrying about interoperating with
code in interfaces developed by colleagues.

As to status of the dms, I've been investigating how to set up the
daemon to get invoked as an init.d script (with pid files and such).  I
got it solved for Gentoo and am 80% through getting it to work on
RedHat-style systems.  I started investigating how to do authentication
(so we can track which user submits stuff), but all of the examples have
shown how to do it if you're running the server as a CGI; since I'm
implementing this as a daemon, none of that applies so I'm still seeking
solutions (I'd like to avoid implementing my own authentication system).

Bryce


