[XESAM] Ontology snapshot
Evgeny Egorochkin
phreedom.stdin at gmail.com
Wed Jun 6 08:47:03 PDT 2007
On Wednesday 06 June 2007 17:54:13 jamie wrote:
> On Wed, 2007-06-06 at 16:37 +0200, Mikkel Kamstrup Erlandsen wrote:
> > I've been bugging about trying to figure out how we can please
> > everyone with regards to categories and sources.
> >
> > There seem to be consensus on the following: Each object has two
> > designated *single valued* fields Category and Source. These two
> > fields imply what other fields makes sense on the object (as implied
> > by the purple arrows in Evgenys diagram).
> >
> > Important: There is a trade off made here. We basically have two
> > choices to avoid a lot of duplication/ambiguities in the onto: Either
> > we allow multiple inheritance (on categories is all that is needed) or
> > we have multiple values for the category field. I talked this over
> > with Evgeny and we ended up with the multiple-inheritance for cats.
> > The example here could be that a SourceCode cat derives from both
> > TextDocument and Software.
>
> such a scheme screws up our search results by category in tracker
>
> we have search by cat for Development Files and Text Files but we do not
> show Dev files under Text Files. Having a deep hierarchy will also cause
> lots of dupes in search results for different cats
>
> For practical reasons I prefer it as flat as possible
>
> Current tracker onto for File based cats is:
>
> All Files
> -> Music
> -> Documents
> -> Text
> -> Videos
> -> Images
> -> Development
> -> Folders
>
> As you can see there is no need for more than one level deep inheritance
> and absolutely no need for MI. Even if you put Dev files under Text,
> Text still inherits from All Files so a need for MI is not necessary.
>
> Text in tracker does not show Docs or Dev files (even if they are text
> based) as they have their own cat. I really dont like duplicating
> results in different cats
Actually, there is a need, and there are many real use cases. What you
describe is specific to your approach, but that is not a problem.
It is possible to provide a category that behaves as you describe, covering
text files but not source code files, especially with MI for categories (for
complex cases).
That is, we can have a TextFile category with children SourceCode,
TextDocument and Text (for files that are neither source code nor documents).
All text-related properties, such as line count, belong to TextFile. This
unifies both approaches and, on the surface, seems better, since there is in
fact a distinction between a plain-text file and a plain-text document file.
Software cannot tell the two apart with 100% certainty, but it can try.
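
A minimal sketch of that layout in Python, just to make the idea concrete. The
category names TextFile, SourceCode, TextDocument and Text are the ones above;
the "File" root, the property names and the sample files are assumptions made
up for illustration and are not part of any agreed ontology.

```python
# Hypothetical category tree: each category lists its direct parents.
PARENTS = {
    "File": [],
    "TextFile": ["File"],
    "SourceCode": ["TextFile"],
    "TextDocument": ["TextFile"],
    "Text": ["TextFile"],  # plain text that is neither source code nor a document
}

# Properties declared directly on each category (illustrative names only).
OWN_PROPERTIES = {
    "File": {"size"},
    "TextFile": {"lineCount"},
    "SourceCode": {"programmingLanguage"},
    "TextDocument": {"title"},
    "Text": set(),
}

# Each object carries exactly one Category value (made-up sample data).
FILES = {"main.c": "SourceCode", "report.txt": "TextDocument", "notes.txt": "Text"}


def ancestors(category):
    """Return the category itself plus every category it derives from."""
    seen = []
    stack = [category]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(PARENTS[current])
    return seen


def properties(category):
    """Collect the properties implied by a category through inheritance."""
    result = set()
    for current in ancestors(category):
        result |= OWN_PROPERTIES[current]
    return result


def search(category):
    """Return files whose category is, or derives from, the given one."""
    return sorted(name for name, cat in FILES.items() if category in ancestors(cat))
```

With this layout, search("Text") returns only notes.txt, which is the
tracker-style behaviour, while search("TextFile") returns all three files, and
every child still inherits lineCount from TextFile.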
As to the "flatness" of your approach, it is not flat per se.
If you try to represent your approach formally with an ontology, that is, to
provide a consistent and machine-"understandable" description of the rules
(which files get assigned or imply which properties, how properties are
related, how files are spread across categories), you will end up with many
abstract categories with MI.
And that is exactly what I am doing: a formal, unambiguous and
machine-"understandable" description.
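
To illustrate how abstract categories with MI appear once the rules are
written down, here is a rough sketch using Python classes. Text, Documents and
Development are taken from the tracker list quoted above; the abstract
TextBased and Software categories and all property names are hypothetical,
chosen only to show the shape of the result.

```python
class File:
    props = {"name", "size"}


class TextBased(File):
    """Abstract category: holds properties shared by all text-based files."""
    props = {"lineCount"}


class Software(File):
    """Abstract category: holds properties shared by software artifacts."""
    props = {"license"}


class Text(TextBased):
    props = set()


class Documents(TextBased):
    props = {"title"}


class Development(TextBased, Software):
    """Needs multiple inheritance to get both groups of properties."""
    props = {"programmingLanguage"}


def implied_properties(category):
    """Union of the properties declared on a category and all its ancestors."""
    result = set()
    for klass in category.__mro__:
        # Read only what each class declares itself, not what it inherits.
        result |= vars(klass).get("props", set())
    return result
```

The user-visible list can stay as short as before (Text, Documents,
Development), but stating which properties each one implies without
duplication is what brings in TextBased, Software and the MI on Development.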
This "flatness" is only possible if you give humans a generic description of
the ontology and rely on them to figure out the rest from their knowledge of
file formats and metadata, common sense, your source code, etc., and to make
their software understand the implications.
--Evgeny