[Xesam] Details of tag handling

Jamie McCracken jamie.mccrack at googlemail.com
Tue Aug 26 16:36:40 PDT 2008


On Tue, 2008-08-26 at 21:49 +0200, Mikkel Kamstrup Erlandsen wrote:
> 2008/7/2 Sebastian Trüg <strueg at mandriva.com>:
> > On Wednesday 02 July 2008 12:16:57 Mikkel Kamstrup Erlandsen wrote:
> >> 2008/7/2 Sebastian Trüg <strueg at mandriva.com>:
> >> > While this is easy to query it is not very clean semantically speaking
> >> > since you link tag objects and tagged resources via string literals. In
> >> > Nepomuk we use tag objects and relate tagged resources to the URI of this
> >> > object.
> >> >
> >> > - Querying objects on their tags is also simple: just get all resources
> >> > that are related to the tag via the tagging property.
> >> > - Getting a list of all tags is as easy as before: list all resources of
> >> > type xesam:Tag (or nao:Tag in Nepomuk)
> >> > - Querying all files without a tag is also simple (in sparql we use a
> >> > filter)
> >> > - listing all files with their associated tag's names is also not
> >> > a big deal (a sparql query can easily be extended or one fetches tags and
> >> > their labels for each file separately)
> >> > - renaming of tags is much simpler than with your method: only one
> >> >  property changes, the tag's label (in your version all tagged files
> >> >  have to be updated).
> >> > - deleting a tag is as simple as deleting the tag resource and all
> >> > triples referencing it.
> >> > - having tags without files is trivial: tag resources can live on their
> >> > own anyway.
> >> >
> >> > Hope this helps a bit. Keep in mind that with tag resources we leave the
> >> > world of two-dimensional metadata which can be handled by databases like
> >> > lucene.
> >>
> >> That is the problem. However for tags specifically it may not be a
> >> problem. If we can assume that users don't carry around tonnes of tags
> >> at least. The application can first fetch a list of all Tag objects
> >> and create a tag_name<->uri map. Then search for items with the
> >> particular uri when the user requests some tagged items. As said this
> >> does not scale very well.
> >>
> >> This could be remedied by having a convention on how to construct tag
> >> uris. For example tag://<tag_name> or something more elaborate.
> >>
> >> So; bottom line - I am not as such opposed to using proper uri
> >> relations for tags. We just need to agree on how to do it.
> >
> > The way I see it, the problem is: do we introduce triples into Xesam at this
> > level or not, right?
> > Because with triples it is easy (as already done in Nepomuk). But without
> > them, plain strings are way simpler. The question is: do we want some hybrid
> > solution which may be an overhead for both worlds?
> 
> (This thread is item 6 on http://xesam.org/main/XesamUpdates)
> 
> The more I think about this, the more I think the answer is "yes". I
> do not object to the term "hybrid", but I actually still think the
> approach is fairly clean (considering that Xesam utilizes a field
> based data model).
> 
> So to rehash:
> 
>  * xesam:userKeywords stores a list of _opaque_ tag:// uris
>  * Applications wanting to query tags first have to resolve tag-name
> -> tag-uri however they prefer (on-the-fly or precomputed map)
>  * This gives incredible flexibility compared to a flat list of tag
> names stuffed in xesam:userKeywords. Flexibility both for the server
> and client
> 
> That's my opinion anyway. Do chime in.

I would keep it simple, as a list of strings.

Nepomuk could then associate a tag name with a tag object and get
additional properties. I don't see why tag://mytag should not be stored
as mytag. The problem for us is we would have to special-case tags for
indexing purposes and remove the uri. So it's more pain and no gain for
all the non-Nepomuk implementations.
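
To make the comparison concrete, here is a rough Python sketch of the two
models. It is only an illustration under assumptions: the tag://<tag_name>
form is the convention floated earlier in the thread, not an agreed spec, and
every function name and the dict-based "index" are hypothetical, not taken
from Tracker, Nepomuk or any Xesam implementation.

```python
from urllib.parse import quote, unquote

# Assumed convention from earlier in the thread, not a spec.
TAG_SCHEME = "tag://"


def tag_uri(name):
    """Build a tag URI from a tag name (hypothetical helper)."""
    return TAG_SCHEME + quote(name, safe="")


def tag_name(uri):
    """Recover the tag name from a tag URI (hypothetical helper)."""
    if not uri.startswith(TAG_SCHEME):
        raise ValueError("not a tag uri: %r" % uri)
    return unquote(uri[len(TAG_SCHEME):])


def files_with_tag_plain(index, name):
    """Plain-string model: the keyword list holds tag names directly."""
    return sorted(path for path, tags in index.items() if name in tags)


def files_with_tag_uri(index, name):
    """URI model: the client resolves name -> uri first, then matches."""
    wanted = tag_uri(name)
    return sorted(path for path, uris in index.items() if wanted in uris)


def strip_for_indexing(uris):
    """The special case an indexer would need: drop the scheme before indexing."""
    return [tag_name(u) for u in uris]


# Toy data standing in for xesam:userKeywords values per file.
plain_index = {"/a.txt": ["work", "my tag"], "/b.txt": ["work"]}
uri_index = {
    path: [tag_uri(t) for t in tags] for path, tags in plain_index.items()
}
```

With the plain model a query for "work" is a direct match on the stored
strings; with the uri model both the client (tag_uri) and the indexer
(strip_for_indexing) need an extra step, which is the special-casing I mean.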

Tracker would use the tag name to look into an emblems folder for things
like icons, so it would not store anything other than the tag name.

A lot of these *extras* are making it considerably more difficult and
more time-consuming to implement basic Xesam stuff. The barrier to entry
for Xesam 1.0 should be kept as low as possible IMO.

jamie


