[Xesam] Metadata Storage Daemon

Sun Jan 13 02:12:34 PST 2008

On 13/01/2008, Evgeny Egorochkin <phreedom.stdin at gmail.com> wrote:
> On Saturday 12 January 2008 23:02:18, Mikkel Kamstrup Erlandsen wrote:
> > On 12/01/2008, Evgeny Egorochkin <phreedom.stdin at gmail.com> wrote:
> > > On Saturday 12 January 2008 01:05:38, Mikkel Kamstrup Erlandsen wrote:
> > > > On 11/01/2008, Sebastian Trüg <strueg at mandriva.com> wrote:
> > > > > Just my 2 cents:
> > > > > Soprano has an IMHO very good DBus API [1] for RDF storage which
> > > > > fulfills all 3 of your requirements below. We already use it for
> > > > > Nepomuk and it works great. And since Xesam is already using URIs to
> > > > > identify stuff why not go the extra mile to RDF storage altogether?
> > > >
> > > > I thought Soprano depended on Qt?
> > >
> > > This is not a dependency that you can't easily get rid of.
> > >
> > > > Anyways, I don't think the RDF quadruples is a good thing to expose
> > > > directly to the programmers who just want a quick and dirty metadata
> > > > storage. It is simply just too technical. That does not mean that we
> > > > cannot use that stuff under the hood though.
> > >
> > > Which part of (URI, property name, property value, timestamp)
> > > can't programmers understand, and why should it be hidden?
> >
> > Exactly my point :-) (URI, property name, property value, timestamp)
> > is fine, but exposing the general Named Graph terminology (and
> > features)
>
> Actually, there's nothing more to named graphs than another element added to
> the triple. So you can differentiate named graphs with namespacing, like an
> mtime:/ URI. Using named graphs only for mtime might backfire in the sense
> that named graphs could be used in other ways, like storing provenance
> info (where a specific triple came from).

That is exactly one problem I have with named graphs. It seems kind of
arbitrary to allow exactly one "name" per graph. It practically begs you
to stick an XML blob in it holding the mtime, the provenance, and your
shoe size in centimeters.

> > in the API is too generic for my taste. If we say that the
> > triple name is always a timestamp I am OK with it.
>
> Actually, a generic API is the only one that's really needed, because it is
> the most powerful. This doesn't exclude having a set of convenience functions
> to do typical queries, or even completely hiding the RDFish and SPARQLish
> nature of the matter from certain users of the technology.

The most powerful and generic API is not always the right one to
expose. You have to design the API so that the consuming programs also
get a lot of expressive power and clarity. That is rarely a quality of
totally generic interfaces.

Consider that the following two lines of code could do exactly the same thing:

double val = item.getValue();
double size = shoe.getShoeSize();

If you are writing stuff that should really be generic (i.e. a generic
RDF backend) then the first line is fine. If you are writing an
application to manage a shoe store, the second would likely make things
a lot clearer. Of course this example is exaggerated, but I think the
idea is clear.
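
To make the contrast concrete, here is a minimal sketch in Java. The
names Item, Shoe, and the "shoe:sizeInCentimeters" key are hypothetical,
made up purely for illustration; they are not part of Xesam, Soprano, or
any real API. The typed wrapper stores its data in the generic item, so
nothing is lost underneath, but callers get a name and a type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical generic item: any property name, any value.
class Item {
    private final Map<String, Object> properties = new HashMap<>();

    void setValue(String property, Object value) {
        properties.put(property, value);
    }

    Object getValue(String property) {
        return properties.get(property);
    }
}

// Hypothetical domain wrapper: same storage underneath, but the method
// names and types say what the data means.
class Shoe {
    private static final String SIZE = "shoe:sizeInCentimeters";
    private final Item item = new Item();

    void setShoeSize(double centimeters) {
        item.setValue(SIZE, centimeters);
    }

    double getShoeSize() {
        // The cast lives here, once, instead of in every caller.
        return (Double) item.getValue(SIZE);
    }
}

class ShoeStoreDemo {
    public static void main(String[] args) {
        Shoe shoe = new Shoe();
        shoe.setShoeSize(27.5);
        System.out.println(shoe.getShoeSize());
    }
}
```

The generic layer is still there for code that needs it; the wrapper
only decides what the typical consumer gets to see.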

Our target is "Metadata Storage for the Desktop". Not "Generic Named
Graph Storage" and our API should reflect this.
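
As a rough sketch of what such an API could look like, assuming the
(URI, property name, property value, timestamp) tuple discussed earlier
in the thread: the class and method names below (Statement,
MetadataStore, setProperty, getProperty) are invented for this example
and do not describe any existing or proposed Xesam interface. The in-memory
list merely stands in for whatever backend sits under the hood.

```java
import java.util.ArrayList;
import java.util.List;

// One stored statement: (URI, property name, property value, timestamp).
final class Statement {
    final String uri;
    final String property;
    final String value;
    final long timestamp;

    Statement(String uri, String property, String value, long timestamp) {
        this.uri = uri;
        this.property = property;
        this.value = value;
        this.timestamp = timestamp;
    }
}

// Hypothetical desktop-facing facade. The fourth element is always a
// timestamp here, never a general-purpose graph name.
class MetadataStore {
    private final List<Statement> statements = new ArrayList<>();

    void setProperty(String uri, String property, String value, long timestamp) {
        statements.add(new Statement(uri, property, value, timestamp));
    }

    // Returns the value with the newest timestamp, or null if none is stored.
    String getProperty(String uri, String property) {
        Statement newest = null;
        for (Statement s : statements) {
            boolean matches = s.uri.equals(uri) && s.property.equals(property);
            if (matches && (newest == null || s.timestamp >= newest.timestamp)) {
                newest = s;
            }
        }
        return newest == null ? null : newest.value;
    }
}

class MetadataStoreDemo {
    public static void main(String[] args) {
        MetadataStore store = new MetadataStore();
        store.setProperty("file:///home/me/notes.txt", "title", "Draft", 1L);
        store.setProperty("file:///home/me/notes.txt", "title", "Final", 2L);
        System.out.println(store.getProperty("file:///home/me/notes.txt", "title"));
    }
}
```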

If absolute genericity were the best thing in the world, all APIs would
consist of varargs functions:

Object get (Object obj, ...)
Object set (Object obj, ...)
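
Spelled out in Java, the two signatures above might look like the
sketch below. VarargsStore and its (property, value) convention are
assumptions made up for the illustration. The point is that the
compiler accepts any argument list, so mistakes only show up at run
time, and every caller has to cast the result.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "maximally generic" store: everything is an Object.
class VarargsStore {
    private final Map<String, Object> data = new HashMap<>();

    // Convention (not enforced by the compiler): args = (property, value).
    Object set(Object obj, Object... args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("expected (property, value)");
        }
        return data.put(obj + "#" + args[0], args[1]);
    }

    // Convention (not enforced by the compiler): args = (property).
    Object get(Object obj, Object... args) {
        if (args.length != 1) {
            throw new IllegalArgumentException("expected (property)");
        }
        return data.get(obj + "#" + args[0]);
    }
}

class VarargsDemo {
    public static void main(String[] args) {
        VarargsStore store = new VarargsStore();
        store.set("file:///tmp/a.txt", "mtime", 1000L);

        // The caller must know the value is a Long and cast it.
        long mtime = (Long) store.get("file:///tmp/a.txt", "mtime");
        System.out.println(mtime);

        try {
            store.get("file:///tmp/a.txt"); // compiles fine, fails at run time
        } catch (IllegalArgumentException e) {
            System.out.println("only caught at run time: " + e.getMessage());
        }
    }
}
```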

Cheers,
Mikkel