[XESAM] New meeting, date+time proposals please

Evgeny Egorochkin phreedom.stdin at gmail.com
Sun May 20 07:38:00 PDT 2007


On Sunday 20 May 2007 14:05:14 Mikkel Kamstrup Erlandsen wrote:
> 2007/5/19, Evgeny Egorochkin <phreedom.stdin at gmail.com>:
> > On Saturday 19 May 2007 14:37:57 you wrote:
> > > 2007/5/18, Evgeny Egorochkin <phreedom.stdin at gmail.com>:
> > > > On Friday 18 May 2007 12:27:30 Mikkel Kamstrup Erlandsen wrote:
> > > > > > >  * should we allow for multiple inheritance (ie multiple
> > > > > > > parents for fields)?
> > > > > >
> > > > > > I believe there were two issues intermixed: multiple parents for
> > > > > > fields and multiple types or as you say categories for files.
> > > > >
> > > > > True. That is two issues, but I got the impression that the
> > > > > Strigi/Nepomuk camp were in favor of both?
> > > > >
> > > > > As I consider multiple inheritance (both cats and/or fields) to be
> > > > > a somewhat big feature request it needs to be founded on solid
> > > > > reasoning if we should go with it.
> > > >
> > > > I don't consider multiple file types/categories a big feature.
> > > > Suppose a file has type/category Audio. This means it belongs to the
> > > > following categories: File, Media, Audio. So it already has multiple
> > > > types. The question is whether we allow these types to be outside of
> > > > strict hierarchy.
> > >
> > > It all boils down to whether or not we allow cycles in the ontology
> > > tree. It is a lot easier to parse/update a tree structure if there are
> > > no cycles. That is why I consider it a big feature.
> >
> > Cycles? Not sure what you are talking about. We should not allow any type
> > to be a parent of itself (indirectly), true, but this is possible even in
> > the single inheritance case if the onto is malformed. Or maybe you should
> > elaborate more?
>
> In my terminology this is a cycle:
>
> A   : parent = None
> B1 : parent = A
> B2 : parent = A
> C   : parent = B1, B2

Considering that all categories are likely to be derived from a single generic 
one, yes this is a typical case. And what exactly bothers you? Either way, 
you have a list of categories any particular file belongs to...
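
A minimal Python sketch of the point above, not taken from any xesam
implementation: with the A/B1/B2/C layout quoted earlier, walking the parents
of C yields a plain list of categories with A listed once, and the same walk
can flag a type that ends up being its own (indirect) parent. The PARENTS
table and both function names are assumptions made for illustration.

```python
# Hypothetical category table mirroring the A/B1/B2/C example quoted above.
PARENTS = {
    "A": [],
    "B1": ["A"],
    "B2": ["A"],
    "C": ["B1", "B2"],
}


def all_categories(name, parents=PARENTS):
    """Return name plus every direct and indirect parent, each listed once."""
    seen = []
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            # Reached again through a second path (A via B1 and B2),
            # or through a loop in a malformed table.
            continue
        seen.append(current)
        stack.extend(parents[current])
    return seen


def is_own_ancestor(name, parents=PARENTS):
    """True if name can be reached from its own parents (malformed ontology)."""
    return any(name in all_categories(p, parents) for p in parents[name])


if __name__ == "__main__":
    print(sorted(all_categories("C")))
    print(is_own_ancestor("C"))
```

Under these assumptions C shares the ancestor A through two paths, but no type
is its own parent, so the check only fires on a table that is really malformed.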

> > > > Multiple field inheritance too is, in my opinion, not a big feature
> > > > request if inheritance is implemented as such. It might be useful if
> > > > we link multiple external ontologies. If we stick with a relatively
> > > > simple core ontology, it may not be required. Time will tell.
> > >
> > > "We" in this context is *only* the nepomuk project mind you (correct me
> > > if I'm wrong please). There are no plans whatsoever for integrating
> > > with general ontologies in xesam. You can extend the xesam ontology with
> > > other xesam-compliant ontologies and that's it.
> >
> > Sorry? I was under the impression that Tracker and Beagle wanted to reuse
> > existing ontologies as much as possible? So I proposed to make a core
> > xesam-specific ontology with mappings to DC, EXIF etc, since it's
> > impossible to cleanly link
>
> I think we agree here :-) I might have been unclear. What I meant is that
> xesam should not necessarily be interoperable with any old ontology off the
> web. Priority ones like EXIF and such is another case that I do think we
> should expose/consume/embed/extend (/me don't consider DC an ontology). We
> should be interoperable with desktop-related widespread standards IMHO, but
> these should be nailed down before hand.
>
> > Xesam should of course not restrict Nepomuk from doing this.
> >
> > You're under the wrong impression that I'm lobbying for nepomuk-specific
> > features to make life for nepomuk easier. In fact, the simpler the xesam
> > onto is (no matter how badly screwed it is), the easier it will be for
> > nepomuk to map it. The reason is that the only mapping needed is
> > Nepomuk->Xesam and not vice versa. So Nepomuk doesn't have to decipher
> > and work around any Xesam onto simplifications/deficiencies (as compared
> > to Nepomuk).
>
> Oh, I was not aware of what the Nepomuk needs actually were, but if they
> only need a map Nep->Xes then our life is easier :-)

I can't really speak for nepomuk, this is my personal POV.

Xesam will be one of the interfaces to Nepomuk functionality. The more Nepomuk
functionality is exposed via Xesam, the better, just as for any indexer app.

It is even more important for indexers, since some of them would prefer xesam
to be the base of their app, and not just one of the interfaces. This is the
reason I'm trying to make xesam flexible and extensible, so as not to
intentionally cripple their ability to implement additional functionality.

Looking at the progress of xesam and the direction in which it is headed, I
have reasonable doubts that xesam-aware apps will be able to contribute in the
reverse direction. At least not with the .desktop ontology format,
over-simplification of the ontology etc. The hacks to make it work in the
reverse direction are likely to outweigh the effort to interface directly.

> > Actually, the easiest thing would be to claim that DC is the best and
> > all-encompassing onto and we don't need anything else since Nepomuk
> > already has a DC mapping.
>
> I don't think anybody wants this :-)
>
> > > > > Unfortunately we didn't really get to discuss any practical use
> > > > > cases in the IRC meeting.
> > > > >
> > > > > I have not been able to come up with a good use case (of multi
> > > > > inh.) myself, but maybe some one here can?
> > > >
> > > > Source code: It is a text document(contains text) and software(has
> > > > dependencies on other software).
> > >
> > > You mean that it might ref some .h files fx? If that is what you meant
> > > I can't see why a simple subclass SourceCode->TextFile (or something)
> > > isn't enough..?
> >
> > Software has dependencies, maintainer, project it belongs to.
> >
> > All multiple-inheritance issues can be resolved by moving offending
> > fields higher in the hierarchy. This doesn't hurt because they all are
> > optional. Also, you can eliminate single inheritance and file types as
> > such, without much fuss.
> >
> > The problem with this approach is that software no longer knows which
> > type a particular file is and consequently what fields to expect etc.
> >
> > The advantage of multi- vs single- inheritance is that you describe
> > aspects of a file with types, e.g. it's a text, software and network
> > resource. Software then knows what fields to expect and what it is
> > processing.
>
> I think you are confusing the matters here. One thing is if a category can
> have multiple parents in the spec. Another thing is if a specific file can
> belong to several categories...

Files definitely should be allowed to belong to several categories. However,
in this case the SourceCode file type/category might have additional specific
fields absent from both the Text and Software types/categories, e.g.
SourceCode:stats.commentCount
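
As a rough illustration only, here is how a SourceCode category with two
parents could collect the fields of Text and Software and then add its own.
The table layout, the function name and the field name "content" are
assumptions; "dependencies", "maintainer" and "project" come from the Software
description earlier in the thread, and stats.commentCount from the example
above.

```python
# Hypothetical parent and field tables; not an actual xesam ontology.
PARENTS = {
    "Text": [],
    "Software": [],
    "SourceCode": ["Text", "Software"],
}

OWN_FIELDS = {
    "Text": ["content"],  # assumed field name
    "Software": ["dependencies", "maintainer", "project"],
    "SourceCode": ["stats.commentCount"],
}


def expected_fields(category):
    """Collect inherited fields first, then the category's own, without duplicates."""
    fields = []
    for parent in PARENTS[category]:
        for field in expected_fields(parent):
            if field not in fields:
                fields.append(field)
    for field in OWN_FIELDS[category]:
        if field not in fields:
            fields.append(field)
    return fields


if __name__ == "__main__":
    print(expected_fields("SourceCode"))
```

Software reading a SourceCode file would then know to expect the Text fields,
the Software fields and the SourceCode-only field, which is the benefit claimed
for describing a file through several types.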

> > If the onto is quite generic, multiple-inheritance may not be needed. I
> > don't insist that we must use it. My point is that it's easy to implement
> > and it may be useful. Whether/when it will be useful, time will tell.
>
> Ok. I find it hard to get a clear view of the pros and cons on this with
> only the two of us arguing. My biggest problem is that I'm not clear on the
> implementation burden of a multi-inheritance system. Both ontology-parsing
> and the actual searching are affected by multi-inh and I don't know how well
> all backends (Lucene, Xapian, Tracker's custom SQLite-based one) handle
> this... Maybe a few words by the experts can shed some light on the matter;
> Jos, Joe, Jamie?

As to category and field multiple-inheritance, I believe the hardest part is 
making inheritance as such work. AFAIK most if not all of the mentioned 
backends don't support inheritance as such. So to them either inheritance 
mechanism looks like a list of field/category names to check against.
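
One way to picture this, as a sketch under stated assumptions rather than a
description of how Lucene, Xapian or Tracker work: a layer above the backend
expands the requested category into the flat list of names that inherit from
it, and the backend only matches against that list. The PARENTS table and the
expand_category name are hypothetical.

```python
# Hypothetical ontology table; SourceCode has two parents.
PARENTS = {
    "File": [],
    "Text": ["File"],
    "Software": ["File"],
    "SourceCode": ["Text", "Software"],
}


def expand_category(name, parents=PARENTS):
    """Return name plus every category inheriting from it, directly or indirectly."""
    matches = {name}
    changed = True
    while changed:
        changed = False
        for child, child_parents in parents.items():
            if child not in matches and any(p in matches for p in child_parents):
                matches.add(child)
                changed = True
    return sorted(matches)


if __name__ == "__main__":
    # A search for "Text" would be handed to the backend as this flat list.
    print(expand_category("Text"))
```

The expansion is the same loop whether a category has one parent or several,
which is why the backend sees no difference between the two mechanisms in this
sketch.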

I see no point in ruling out multi-inheritance from the get-go. Still, it is
hard for me to figure out whether we'll need it badly in the future or not. I
can't know for sure how the opinions of participants will influence the
ontology.

--Evgeny


