[Xesam] Metadata Storage Daemon

Thu Jan 17 01:07:37 PST 2008

On Wednesday 16 January 2008 15:19:52 Jamie McCracken wrote:
> On Wed, 2008-01-16 at 14:02 +0100, Sebastian Trüg wrote:
> > On Wednesday 16 January 2008 10:21:18 Kevin Kubasik wrote:
> > > OK, well the obvious agreement is a need for time/change tracking, I
> > > added a dbus signal called on inserts and a method to get all new
> > > triples since a specified timestamp. As for file monitoring, while a
> > > Gnome-wide service would be nice, I think that it is outside the scope
> > > of a metadata daemon (personally, open to more discussion on this).
> > >
> > > I think that a rudimentary triple store (roughly like what I have
> > > produced here) is a great _base_ for what we are all more or less
> > > talking about. I think that the pushes for more searching/indexing
> > > capabilities of the data here are missing the point, this is more a
> > > simple storage engine. Powerful desktop search engines like Beagle and
> > > Tracker can now both index the same stored metadata.
> >
> > IMHO the indexing should be part of the store. And the search engines
> > should then use the store to query the data. Thus, we would have these
> > components:
> >
> > * Indexer (or better: analyser)
> >   analyses files and writes the data into the store
> >
> > * Store
> >   Simple data store for triples (or quadruples) with a proper RDF API
> >   (like Soprano fx ;) for advanced queries and a simpler API to perform
> >   stuff like:
> >   - getAllProperties( uri resource )
> >   - setProperty( uri resource, uri property, value )
> >   and so forth which handle time stamping and meta-meta-data updating
> >   automatically.
> >   This store also indexes the data and provides a query API which can be
> >   used by search engines. This query API is low level and not intended for
> >   the end user (I would opt for SPARQL here but I think you know that ;)
> >
> > * Search client
> >   Creates queries to the store from user queries.
> >   (This is what has been described already in XesamQueryLanguage)
> >   "Final" search clients would then be using this service for queries.
> >   Thus, searching means three steps:
> >      user GUI -> search client service -> Store
> >
> > * File watch service
> >   Watches file systems for changes and updates the metadata accordingly.
> >
> > I think it is important to keep the data in one place here. There is no
> > reason to keep separate stores and indexes for data from file analysers
> > and from user input (like tags) or any other application that likes to
> > store something.
>
> agree totally (except for explicit exposure of rdf semantics/sparql)

This exposure is not forced upon users: only those users of the system who
care about it will use such an interface. So it's a non-issue from the user's
POV.

Certainly it makes sense to have a casual coder-friendly interface. One nice
approach is what Soprano does, but there can be others as well.

It's probably possible to use a similar approach for query construction.

I'd ask you to take a look at Soprano when you have time, to see how raw
rdf+sparql can be combined with concepts more familiar to coders without
really losing much (if any) power.

It is in fact rdf+sparql, but modelled in terms of the programming language
you are using to interface with Soprano, and to me it seems rather intuitive...

Something similar to this:

Xesam::Document doc;
doc.title("Job Application");
doc.text("sdifghsdfughsdfg");

Xesam::Document inherits (possibly indirectly) from the rdf::Resource class,
and rdf::Resource has low-level functions to set/remove/query arbitrary
triples, like getAllProperties() and so on. But you don't use those unless you
really know what you are doing and can't do it with the higher-level approach.

Basically you are working with objects of your programming language, with
inheritance, inferencing, etc. working in an intuitive way.
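
To make the layering a bit more concrete, here is a minimal sketch of how it
could look. Everything in it is an assumption for illustration only: it is not
Soprano's actual API, the property URIs are placeholders, and the "store" is
just an in-memory map. rdf::Resource carries the low-level set/remove/query
functions, and Xesam::Document only wraps them in typed accessors.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace rdf {

// Hypothetical low-level class: one subject URI plus its property/value pairs.
// Each property holds a single value here to keep the sketch short.
class Resource {
public:
    explicit Resource(std::string uri) : uri_(std::move(uri)) {}
    virtual ~Resource() = default;

    const std::string& uri() const { return uri_; }

    // Set (or overwrite) the value of one property.
    void setProperty(const std::string& name, const std::string& value) {
        properties_[name] = value;
    }

    // Remove a property; does nothing if it was never set.
    void removeProperty(const std::string& name) { properties_.erase(name); }

    // Return the value of a property, or an empty string if it is unset.
    std::string property(const std::string& name) const {
        auto it = properties_.find(name);
        return it == properties_.end() ? std::string() : it->second;
    }

    // Return every (property, value) pair known for this resource.
    std::vector<std::pair<std::string, std::string>> getAllProperties() const {
        return std::vector<std::pair<std::string, std::string>>(
            properties_.begin(), properties_.end());
    }

private:
    std::string uri_;
    std::map<std::string, std::string> properties_;
};

} // namespace rdf

namespace Xesam {

// Placeholder property URIs, not taken from any real ontology file.
const std::string kTitle = "xesam:title";
const std::string kText = "xesam:text";

// High-level class: typed accessors layered on top of the raw triples.
class Document : public rdf::Resource {
public:
    using rdf::Resource::Resource;

    void title(const std::string& t) { setProperty(kTitle, t); }
    std::string title() const { return property(kTitle); }

    void text(const std::string& t) { setProperty(kText, t); }
    std::string text() const { return property(kText); }
};

} // namespace Xesam
```

Unlike the snippet above, this sketch asks for a URI when the object is
constructed, e.g. Xesam::Document doc("file:///tmp/job.txt"). A casual coder
would only touch title() and text(); the functions inherited from
rdf::Resource stay available for whoever needs arbitrary triples.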

> in tracker we store user/app defined metadata in a separate db but
> sqlite allows you to construct a virtual database which amalgamates
> several sqlite db files to create a single virtual db. Where data is
> stored is an implementation detail but obviously one place (or virtual
> place) is more practical.
>
> It's a good idea to separate expendable metadata from the indexer and
> precious user/app defined ones to prevent any mishaps. Alternatively
> backing up and restoring precious data can also be used in addition to a
> primary store.
>
> Having everything in one physical place with no backup is probably a bit
> dangerous IMO

You're right. Makes sense to either keep them separate or have a synced backup 
of the important part of the data.
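
A rough sketch of the "separate but looks like one place" variant, under my
own assumptions: the class and function names are made up, both stores are
plain in-memory maps, and user data winning over indexer data on lookup is
just one possible policy. It is not how Tracker does it.

```cpp
#include <map>
#include <string>
#include <utility>

// Key is (resource URI, property URI); the mapped value is the object.
using Key = std::pair<std::string, std::string>;
using Store = std::map<Key, std::string>;

// Hypothetical wrapper: two physical stores behind one lookup interface.
class MetadataStore {
public:
    // Expendable data written by the indexer; can always be regenerated.
    void setIndexed(const std::string& res, const std::string& prop,
                    const std::string& value) {
        indexed_[Key(res, prop)] = value;
    }

    // Precious data set by the user or by applications (tags and so on).
    void setUser(const std::string& res, const std::string& prop,
                 const std::string& value) {
        user_[Key(res, prop)] = value;
    }

    // Single "virtual" view: check user data first, then indexer data.
    // Returns nullptr if neither store has the property.
    const std::string* find(const std::string& res,
                            const std::string& prop) const {
        const Key key(res, prop);
        auto u = user_.find(key);
        if (u != user_.end()) return &u->second;
        auto i = indexed_.find(key);
        if (i != indexed_.end()) return &i->second;
        return nullptr;
    }

    // Throwing away the index never touches the precious part.
    void dropIndexed() { indexed_.clear(); }

    // Only the precious part needs a backup.
    Store backupUser() const { return user_; }
    void restoreUser(Store backup) { user_ = std::move(backup); }

private:
    Store indexed_;
    Store user_;
};
```

With this split, a corrupted index is handled by dropIndexed() plus a
re-index, while the tags survive untouched or come back via restoreUser().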

-- Evgeny