[XESAM] Ontology snapshot

Evgeny Egorochkin phreedom.stdin at gmail.com
Sun Jun 3 05:01:51 PDT 2007

On Sunday 03 June 2007 11:09:19 Fabrice Colin wrote:
> On 6/3/07, Evgeny Egorochkin <phreedom.stdin at gmail.com> wrote:
> > * video class is actually video+audio
> > * since it's impossible to properly inherit both image and audio
> > properties for audio+video, Image properties get inherited and the most
> > important audio properties get copied from audio class, but are not
> > directly related to their "prototypes".
> On first look, this doesn't seem so cumbersome.

It's not nice to the apps trying to work with audio stream regardless of its 

> > User Annotation:
> >
> > Need feedback on user annotation/user-provided metadata.
> > Everything I can come up with is:
> > *keywords
> > *title( file name can be considered title as well, since it's exactly
> > this: user's title for the content)
> > *description
> > *comment
> > *rating
> > *Usage intensity(possibly several fields like how many times opened,
> > total usage/edit/view time)
> >
> > Some of these are source-specific
> >
> > On top of this some apps have user annotation info stored in a separate
> > DB. i.e.  while file source is the file system, usage stats are stored
> > somewhere else.
> >
> > Another point to consider: there can be several users. Or can't?
> Yes, there can be, but I wouldn't worry about this. Who can edit which field
> is better left to the application.

I meant that several users can assign different keywords/comments etc. to the 
same shared file. Either we discard other users' tags (which is likely) or we 
have to somehow account for them.
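
A rough sketch of the two options, assuming annotations are keyed by user and 
file. AnnotationStore and its methods are hypothetical names for illustration 
only, not part of any Xesam draft:

```python
from collections import defaultdict


class AnnotationStore:
    """Per-user keywords for shared files (hypothetical illustration)."""

    def __init__(self):
        # (user, path) -> set of keywords assigned by that user
        self._keywords = defaultdict(set)

    def add_keyword(self, user, path, keyword):
        self._keywords[(user, path)].add(keyword)

    def keywords_for(self, user, path):
        # Option 1: discard other users' tags, return only this user's.
        return set(self._keywords.get((user, path), set()))

    def all_keywords(self, path):
        # Option 2: account for every user's tags on the file.
        merged = set()
        for (_, p), kws in self._keywords.items():
            if p == path:
                merged |= kws
        return merged


store = AnnotationStore()
store.add_keyword("alice", "/shared/report.odt", "draft")
store.add_keyword("bob", "/shared/report.odt", "urgent")
```

With option 1 a query by alice only ever sees "draft"; with option 2 the 
indexer has to store and expose who assigned what.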

> > >*** Need feedback BADLY on sources and user annotation. Wake up people!
> Sorry, I don't understand why some of the above are source specific.

For example, email attachments don't have keywords or editing-related usage 
stats.

> > Thumbnails:
> >
> > What to do with them?
> They would fall under Image. Isn't that good enough ?

I meant handling thumbnails of files. Strigi has a thumbnail field for files. 
Not sure about other indexers, not sure we need this in xesam. Ideas?
Might thumbnail retrieval be useful somehow?

> A few other comments follow :
> Aren't Message.primaryRecipient and Message.recipient actually the same
> thing ?

No. Message.recipient lists all recipients (primary and secondary), i.e. 
to+cc+bcc, or IRC chat participants.
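
To illustrate the difference, here is a minimal sketch. It assumes that 
primaryRecipient corresponds to the "To" header only, which is my reading and 
not something spelled out in the ontology; the class and attribute names are 
made up for the example:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    # Raw header values; names are hypothetical, not Xesam field names.
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)
    bcc: List[str] = field(default_factory=list)

    @property
    def primary_recipient(self) -> List[str]:
        # Assumption: "primary" means the To header only.
        return list(self.to)

    @property
    def recipient(self) -> List[str]:
        # All recipients, primary and secondary: to + cc + bcc.
        return self.to + self.cc + self.bcc


msg = Message(to=["a@example.org"], cc=["b@example.org"], bcc=["c@example.org"])
```

So recipient is always a superset of primaryRecipient, and the two only 
coincide when there are no cc/bcc entries.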

> Is Software for packages ? If so, I would prefer the name SoftwarePackage.
> Requires would be useful as property.

Initially it was intended to be used for source code/scripts as well. By the 
way, compiled libraries have dependencies too.
For dependencies there's already the Content.depends property.

> There's a typo in the comment of Source.md5Hash.


> I haven't followed the conversation very closely, so please excuse me if
> this has been answered already : if I understand correctly, a File can't
> also be a Document ?

DataObject can be both File and Document at once and often will be.

Document is a content type i.e. the sequence of bytes we are analyzing.

File is a source type. That is, the document is stored in the file system and 
is a file, as opposed to, say, being stored in an archive and being an archive 
item, or being stored in an email body as an attachment.
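
The content/source split can be sketched like this. The class layout below is 
only an illustration with hypothetical names (ArchiveItem, EmailAttachment and 
the two base classes are not taken from the ontology snapshot):

```python
from dataclasses import dataclass


class Content:
    """What the bytes are (the thing we analyze)."""


class Document(Content):
    pass


class Source:
    """Where the bytes are stored."""


class File(Source):
    pass


class ArchiveItem(Source):
    pass


class EmailAttachment(Source):
    pass


@dataclass
class DataObject:
    content: Content  # content type, e.g. Document
    source: Source    # source type, e.g. File

    def is_a(self, cls) -> bool:
        # A data object "is" both its content type and its source type.
        return isinstance(self.content, cls) or isinstance(self.source, cls)


on_disk = DataObject(content=Document(), source=File())
attached = DataObject(content=Document(), source=EmailAttachment())
```

Here on_disk is both a File and a Document, while attached is still a 
Document but not a File.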

We need to decide whether to name sources by container names (e.g. filesystem 
for files, email for attachments, mailbox for emails, archive for archive 
items) or by content names (e.g. file, attachment, archiveItem, etc.).

