[Xesam] Ontology snapshot

Mikkel Kamstrup Erlandsen mikkel.kamstrup at gmail.com
Sun Jun 10 02:20:48 PDT 2007


2007/6/9, Evgeny Egorochkin <phreedom.stdin at gmail.com>:
>
> Source attached. Cute picture:
>
> http://www.freedesktop.org/wiki/PhreedomDraft?action=AttachFile&do=view&target=viz.png



Great work. This is really starting to look like something.

--------------
> Design decisions proposed:
>
> Split ontology into Xesam Core, Xesam Convenience and Xesam Mappings
>
> Xesam Core expresses the full semantics of the ontology i.e. it is
> self-sufficient and describes all the useful information we plan indexing.
>
> Xesam Convenience contains semantically irrelevant fields which are
> subchildren of Xesam Core fields and provide nothing except more
> human-friendly names and descriptions.
>
> Xesam Mappings provides mapping for external standards like EXIF and
> vCard.
> For each such standard a base set of fields and categories capturing the
> most
> relevant features of the standard is provided in Xesam Core.
> The full standard or a more complete implementation is provided via Xesam
> Mappings.
> The reason: excessive complexity or multiple irrelevant features of the
> standard.
>
> Xesam Core is the primary goal for now. The rest will follow as the need
> arises/time allows


Ok, I think this (onto split proposal) is a good idea to avoid *trying* to
create the all-encompassing onto in the first take.

My only gripe is that I don't like the word "Convenience", how about
"Extended" instead?

So +1 from me if we call Xesam Convenience Xesam Extended instead :-)
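
A minimal sketch of how a Convenience/Extended field could resolve to the Core field it is a child of, assuming such fields add only a friendlier name. All field names here are hypothetical and not taken from the draft ontology:

```python
# Hypothetical field names, for illustration only.
CORE_FIELDS = {"creator", "title"}

# Each Extended (a.k.a. Convenience) field points at its parent field.
PARENT = {
    "author": "creator",
    "albumArtist": "creator",
}


def to_core(field_name):
    """Follow parent links until a Core field is reached."""
    seen = set()
    while field_name not in CORE_FIELDS:
        if field_name in seen or field_name not in PARENT:
            raise KeyError(f"{field_name!r} does not resolve to a Core field")
        seen.add(field_name)
        field_name = PARENT[field_name]
    return field_name
```

With this layout a backend that only understands Core can still answer a query on an Extended field by rewriting it with to_core() first.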


> VCard compatibility:
> It is not feasible to implement the full vcard functionality. The
> following
> simplifications are made:
> * name is a single field


I guess we can split the name up into subfields (forename, surname, middle name)
in Xesam Extended or something..?

> * postal addresses are single fields


This makes sense. It would take a lot of fields to model a
nationality-neutral postal address scheme.

> * some obscure features are dropped like modem phone number for the sake of
> simplicity


Good

> New design limitations:
> 1) Source and Content hierarchies are kept separate, that is no Class can
> inherit source and content at once
> 2) Each file is assigned at max one content and one source.


Good - as we all agreed on :-)
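
The two limitations above can be sketched as a small check, assuming categories are plain names. ArchiveItem and MailboxItem come from the draft; the other category names and the IndexedItem class are made up for illustration:

```python
from dataclasses import dataclass
from typing import Optional

# Two disjoint hierarchies, flattened to sets for this sketch.
# "Document", "Image", "Audio" and "File" are hypothetical entries.
CONTENT_CATEGORIES = {"Document", "Image", "Audio"}
SOURCE_CATEGORIES = {"File", "ArchiveItem", "MailboxItem"}


@dataclass
class IndexedItem:
    uri: str
    content: Optional[str] = None  # at most one Content category
    source: Optional[str] = None   # at most one Source category

    def __post_init__(self):
        # A category may only be used in the slot of its own hierarchy.
        if self.content is not None and self.content not in CONTENT_CATEGORIES:
            raise ValueError(f"{self.content!r} is not a Content category")
        if self.source is not None and self.source not in SOURCE_CATEGORIES:
            raise ValueError(f"{self.source!r} is not a Source category")
```

Because each slot holds a single value, "at max one content and one source" holds by construction.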

------------
> Issues:
>
> Maybe we need a better name for MailboxItem and ArchiveItem?


I think we should scrap the Item part of those words. This cat name does not
describe what the object *is* but where the object comes from. With the
Item suffix it sounds like the object with Source=ArchiveItem comes from
an item within an archive (e.g. a jpg in a pdf in a zip).

> Still not decided on how to PIM stuff.


Could we rename Todo to Task instead then? Sounds less nerdy :-)

Fields on the Task cat could be Summary, Priority, DueDate. Stuff like a
Summary and Description can be derived from fields in the Content cat.

> Need to revamp media ontology.
>
> Can we count on backends being able to figure out list lengths? i.e. if we
> have Software.depends relation, do we need Software.dependCount? I think
> no.
> Either we have a *count property for things we don't describe, or we have
> a
> list of things and no *count property.


Hmmm... This is a tricky case. The query language cannot handle this atm,
e.g. searching for all SourceCode items with more than 10 dependencies,
unless the number of deps is explicitly stored in a field.
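
A rough sketch of the problem, assuming a query can only compare stored field values. The field names follow the Software.depends / Software.dependCount example from the quoted text; the records and the helper are made up:

```python
# Hypothetical indexed records with an explicitly stored count.
items = [
    {"name": "foo", "depends": ["a", "b"], "dependCount": 2},
    {"name": "bar", "depends": [f"lib{i}" for i in range(12)], "dependCount": 12},
]


def more_deps_than(records, threshold):
    """Filter on the stored count, the only thing a plain
    field comparison can see; it never calls len() on the list."""
    return [r["name"] for r in records if r["dependCount"] > threshold]
```

Without the dependCount field the comparison has nothing to match against, even though the list itself is indexed.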

> Should we elaborate comment stats for SourceCode along the way of text stats
> or commentCharacterCount is sufficient?


It is sufficient for the only use case I can come up with: finding
under-documented stuff. I think we should keep it at this.

Questions:

Afaik PDFs (and other office docs) can be password protected. Perhaps
isPassWordPretected should be moved to contents?

Is there any general field that names the origin of a file? E.g. the URL of
a downloaded file?

You've added MediaList and AudioList. Would it not make sense to have a
generic List object? E.g. a series of images or documents might form a
slideshow. Perhaps a better metaphor would be Collection. E.g. most IDEs have
a project metaphor where a bunch of files is a part of a project. With the
collection metaphor we could model this.

While I'm at it - why not a Project cat? It could be a subcat of my suggested
Collection cat. Projects have names, versions, etc... I have several
programs that install project files.
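
Roughly what I have in mind, as a sketch only. Collection and Project are the cats suggested above; the `members` and `version` fields and the example values are assumptions:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Collection:
    """Generic grouping of items, e.g. a slideshow or a playlist."""
    name: str
    members: List[str] = field(default_factory=list)  # URIs of member items


@dataclass
class Project(Collection):
    """Subcat of Collection that adds project-specific fields."""
    version: str = ""


slideshow = Collection("holiday", ["file:///a.jpg", "file:///b.jpg"])
project = Project("xesam-tools", ["file:///src/main.c"], version="0.1")
```

MediaList and AudioList would then be further subcats of Collection rather than separate list types.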

Does Audio not have any fields or is it just trimmed down for display
purposes?

Cheers,
Mikkel