[XESAM] New meeting, date+time proposals please

Mon May 21 02:42:25 PDT 2007

On Sunday 20 May 2007 16:38:00 Evgeny Egorochkin wrote:
> On Sunday 20 May 2007 14:05:14 Mikkel Kamstrup Erlandsen wrote:
> > 2007/5/19, Evgeny Egorochkin <phreedom.stdin at gmail.com>:
> > > On Saturday 19 May 2007 14:37:57 you wrote:
> > > > 2007/5/18, Evgeny Egorochkin <phreedom.stdin at gmail.com>:
> > > > > On Friday 18 May 2007 12:27:30 Mikkel Kamstrup Erlandsen wrote:
> > > > > > > >  * should we allow for multiple inheritance (ie multiple
> > > > > > > > parents for fields)?
> > > > > > >
> > > > > > > I believe there were two issues intermixed: multiple parents
> > > > > > > for fields and multiple types or, as you say, categories for
> > > > > > > files.
> > > > > >
> > > > > > True. That is two issues, but I got the impression that the
> > > > > > Strigi/Nepomuk camp were in favor of both?
> > > > > >
> > > > > > As I consider multiple inheritance (both cats and/or fields) to
> > > > > > be a somewhat big feature request it needs to be founded on solid
> > > > > > reasoning if we should go with it.
> > > > >
> > > > > I don't consider multiple file types/categories a big feature.
> > > > > Suppose a file has type/category Audio. This means it belongs to
> > > > > the following categories: File, Media, Audio. So it already has
> > > > > multiple types. The question is whether we allow these types to be
> > > > > outside of strict hierarchy.
> > > >
> > > > It all boils down to whether or not we allow cycles in the ontology
> > > > tree. It is a lot easier to parse/update a tree structure if there
> > > > are no cycles. That is why I consider it a big feature.
> > >
> > > Cycles? Not sure what you are talking about. We should not allow any
> > > type to be a parent of itself (indirectly), true, but this is possible
> > > even in the single inheritance case if the onto is malformed. Or maybe
> > > you should elaborate more?
> >
> > In my terminology this is a cycle:
> >
> > A   : parent = None
> > B1 : parent = A
> > B2 : parent = A
> > C   : parent = B1, B2
>
> Considering that all categories are likely to be derived from a single
> generic one, yes this is a typical case. And what exactly bothers you?
> Either way, you have a list of categories any particular file belongs to...
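
To make the A/B1/B2/C example concrete, here is a minimal Python sketch,
assuming a hypothetical table of category parents (the names PARENTS and
ancestors are made up for illustration and are not part of xesam). It walks
upward from a category, lists each ancestor once, and only treats a category
that is indirectly its own parent as an error, so the diamond above is
accepted.

```python
# Hypothetical category table mirroring the A/B1/B2/C example quoted above.
PARENTS = {
    "A": [],
    "B1": ["A"],
    "B2": ["A"],
    "C": ["B1", "B2"],
}


def ancestors(category, parents=PARENTS, _path=()):
    """Return the category plus all of its ancestors, each listed once.

    Raises ValueError if a category is (indirectly) its own parent.
    """
    if category in _path:
        raise ValueError("cycle through %r" % (category,))
    result = [category]
    for parent in parents[category]:
        # Recurse with the current category added to the path being walked.
        for name in ancestors(parent, parents, _path + (category,)):
            if name not in result:
                result.append(name)
    return result
```

For C this yields C, B1, A and B2: a flat list of category names, which is
all a backend without inheritance support would have to check against.
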
>
> > > > > Multiple field inheritance too is, in my opinion, not a big feature
> > > > > request if inheritance is implemented as such. It might be useful
> > > > > if we link multiple external ontologies. If we stick with a
> > > > > relatively simple core ontology, it may not be required. Time will
> > > > > tell.
> > > >
> > > > "We" in this context is *only* the nepomuk project mind you (correct
> > > > me if I'm wrong please). There are no plans whatsoever for
> > > > integrating with general ontologies in xesam. You can extend the
> > > > xesam ontology with other xesam-compliant ontologies and that's it.
> > >
> > > Sorry? I was under impression that Tracker and Beagle wanted to reuse
> > > existing ontologies as much as possible? So I proposed to make a core
> > > xesam-specific ontology with mappings to DC, EXIF etc, since it's
> > > impossible to cleanly link
> >
> > I think we agree here :-) I might have been unclear. What I meant is that
> > xesam should not necessarily be interoperable with any old ontology off
> > the web. Priority ones like EXIF and such are another case that I do think
> > we should expose/consume/embed/extend (/me don't consider DC an
> > ontology). We should be interoperable with desktop-related widespread
> > standards IMHO, but these should be nailed down beforehand.
> >
> > > Xesam should of course not restrict Nepomuk from doing this.
> > >
> > > You're under the wrong impression that I'm lobbying for
> > > nepomuk-specific features to make life easier for nepomuk. In fact, the
> > > simpler the xesam onto is (no matter how badly screwed it is), the
> > > easier it will be for nepomuk to map it. The reason is that the only
> > > mapping needed is Nepomuk->Xesam and not vice versa.
> > > So Nepomuk doesn't have to decipher and work around any Xesam onto
> > > simplifications/deficiencies (as compared to Nepomuk).
> >
> > Oh, I was not aware of what the Nepomuk needs actually were, but if they
> > only need a map Nep->Xes then our life is easier :-)
>
> I can't really speak for nepomuk, this is my personal POV.

What I think is needed is a mapping that allows nepomuk to reuse xesam data.
Thus, we need a mapping from the xesam onto to the nepomuk ones, i.e. every
xesam field should be mapped to a specific nepomuk field. This allows some
freedom regarding xesam, but the closer the xesam onto is to the nepomuk style
(inheritance, types and so on), the easier the mapping is and the more data
can be reused. Even more so, if the mapping worked both ways (at least
partially), xesam-aware apps could benefit from nepomuk-only data and thus
get better search results.
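
As a rough illustration of such a mapping, here is a minimal Python sketch.
Everything in it is an assumption: the field names are placeholders invented
for the example and are not taken from the actual xesam or nepomuk
ontologies, and the helpers translate and invert are hypothetical. It maps
each xesam field to one nepomuk field and derives the partial reverse map by
keeping only the entries that are one-to-one.

```python
# Placeholder field names, invented for illustration only; they are not
# taken from the actual xesam or nepomuk ontologies.
XESAM_TO_NEPOMUK = {
    "xesam:title": "nepomuk:title",
    "xesam:author": "nepomuk:creator",
}


def invert(mapping):
    """Build the reverse map, keeping only unambiguous (one-to-one) entries."""
    counts = {}
    for target in mapping.values():
        counts[target] = counts.get(target, 0) + 1
    return {target: source for source, target in mapping.items()
            if counts[target] == 1}


def translate(record, mapping):
    """Split a record into translated fields and fields with no counterpart."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        if field in mapping:
            mapped[mapping[field]] = value
        else:
            unmapped[field] = value
    return mapped, unmapped
```

Fields that end up in the unmapped part are the data that cannot be reused
in that direction, which is one way to see how close the two ontologies are.
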

> Xesam will be one of interfaces to Nepomuk functionality. The more of
> Nepomuk functionality is exposed via Xesam, the better. Just like any
> indexer app.
>
> It is even more important for indexers, since some of them would prefer
> xesam to be the base of their app, and not just one of the interfaces. This
> is the reason I'm trying to make xesam flexible and extensible, so as not
> to intentionally cripple their ability to implement additional
> functionality.
>
> Looking at the progress of xesam and the direction where it is headed, I
> have my reasonable doubts that xesam-aware apps will be able to contribute
> in the reverse direction. At least not with .desktop ontology format,
> over-simplification of ontology etc. The hacks to make it work in the
> reverse direction are likely to outweigh the effort to interface directly.
>
> > > Actually, the easiest thing would be to claim that DC is the best and
> > > all-encompassing onto and we don't need anything else since Nepomuk
> > > already has a DC mapping.
> >
> > I don't think anybody wants this :-)
> >
> > > > > > Unfortunately we didn't really get to discuss any practical use
> > > > > > cases in the IRC meeting.
> > > > > >
> > > > > > I have not been able to come up with a good use case (of multi
> > > > > > inh.) myself, but maybe some one here can?
> > > > >
> > > > > Source code: It is a text document(contains text) and software(has
> > > > > dependencies on other software).
> > > >
> > > > You mean that it might ref some .h files, for example? If that is
> > > > what you meant I can't see why a simple subclass
> > > > SourceCode->TextFile (or something) isn't enough..?
> > >
> > > Software has dependencies, maintainer, project it belongs to.
> > >
> > > All multiple-inheritance issues can be resolved by moving offending
> > > fields higher in the hierarchy. This doesn't hurt because they all are
> > > optional. Also, you can eliminate single inheritance and file types as
> > > such, without much fuss.
> > >
> > > The problem with this approach is that software no longer knows which
> > > type a particular file is and consequently what fields to expect etc.
> > >
> > > The advantage of multi- vs single- inheritance is that you describe
> > > aspects of a file with types, e.g. it's a text, software and network
> > > resource. Software then knows what fields to expect and what it is
> > > processing.
> >
> > I think you are confusing the matters here. One thing is if a category
> > can have multiple parents in the spec. Another thing is if a specific
> > file can belong to several categories...
>
> Files definitely should be allowed to belong to several categories. However
> in this case SourceCode file type/category might have additional specific
> fields absent in both Text and Software types/categories e.g.
> SourceCode:stats.commentCount
>
> > > If the onto is quite generic, multiple-inheritance may not be needed. I
> > > don't insist that we must use it. My point is that it's easy to
> > > implement and it may be useful. Whether/when it will be useful, time
> > > will tell.
> >
> > Ok. I find it hard to get a clear view of the pros and cons on this with
> > only the two of us arguing. My biggest problem is that I'm not clear on
> > the implementation burden of a multi-inheritance system. Both
> > ontology-parsing and the actual searching are affected by multi-inh and I
> > don't know how well all backends (Lucene, Xapian, Tracker's custom
> > SQLite-based one) handle this... Maybe a few words by the experts can
> > shed some light on the matter; Jos, Joe, Jamie?
>
> As to category and field multiple-inheritance, I believe the hardest part
> is making inheritance as such work. AFAIK most if not all of the mentioned
> backends don't support inheritance as such. So to them either inheritance
> mechanism looks like a list of field/category names to check against.
>
> I see no point in ruling out multi-inheritance from the get-go. Still, it
> is hard for me to figure out whether we'll need it badly in the future or
> not. I can't know for sure how the opinion of participants will influence
> the ontology.
>
> --Evgeny