2007/6/10, Evgeny Egorochkin <<a href="mailto:phreedom.stdin@gmail.com">phreedom.stdin@gmail.com</a>>:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Sunday 10 June 2007 12:20:48 Mikkel Kamstrup Erlandsen wrote:<br>> 2007/6/9, Evgeny Egorochkin <<a href="mailto:phreedom.stdin@gmail.com">phreedom.stdin@gmail.com</a>>:<br>> > Source attached. Cute picture:
<br>> ><br>> > <a href="http://www.freedesktop.org/wiki/PhreedomDraft?action=AttachFile&do=view&t">http://www.freedesktop.org/wiki/PhreedomDraft?action=AttachFile&do=view&t</a><br>> >arget=
viz.png<br>><br>> Great work. This is really starting to look like something.<br>><br>> --------------<br>><br>> > Design decisions proposed:<br>> ><br>> > Split ontology into Xesam Core, Xesam Convenience and Xesam Mappings
<br>> ><br>> > Xesam Core expresses the full semantics of the ontology i.e. it is<br>> > self-sufficient and describes all the useful information we plan<br>> > indexing.<br>> ><br>> > Xesam Convenience contains semantically irrelevant fields which are
<br>> > subchildren of Xesam Core fields and provide nothing except more<br>> > human-friendly names and descriptions.<br>> ><br>> > Xesam Mappings provides mapping for external standards like EXIF and
<br>> > vCard.<br>> > For each such standard a base set of fields and categories capturing the<br>> > most<br>> > relevant features of the standard is provided in Xesam Core.<br>> > The full standard or a more complete implementation is provided via Xesam
<br>> > Mappings.<br>> > The reason: excessive complexity or multiple irrelevant features of the<br>> > standard.<br>> ><br>> > Xesam Core is the primary goal for now. The rest will follow as the need
<br>> > arises/time allows<br>><br>> Ok, I think this (onto split proposal) is a good idea to avoid *trying* to<br>> create the all-encompassing onto in the first take.<br>><br>> My only gripe is that I don't like the word "Convenience", how about
<br>> "Extended" instead?<br>><br>> So +1 from me if we call Xesam Convenience Xesam Extended instead :-)<br><br>I called it convenience because it's "semantically irrelevant" that is you can
<br>do everything with Xesam core, and xesam convenience is nothing more than<br>novice-understandable mapping of Xesam core.</blockquote><div><br><br>Ok, let's just call it "convenience" for now. The exact wording is not central at this point.
<br><br>Is there anything from your current draft that will be punted to convenience or mappings?<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> VCard compatibility:<br>> > It is not feasible to implement the full vcard functionality. The<br>> > following<br>> > simplifications are made:<br>> > * name is a single field<br>><br>> I guess we can split the name up in subfields (forename, surname,
<br>> middlename) in Xesam Extended or something..?<br><br>Maybe Xesam Mappings is the right place for this. We capture the most<br>important features (name in this case) and full vCard is implemented in Xesam<br>Mappings.
<br><br>> * postal addresses are single fields<br>><br>><br>> This makes sense. It would take a lot of fields to model a<br>> nationality-neutral postal address scheme.<br>><br>> * some obscure features are dropped like modem phone number for the sake of
<br>><br>> > simplicity<br>><br>> Good<br><br>> New design limitations:<br>> > 1) Source and Content hierarchies are kept separate, that is no Class can<br>> > inherit source and content at once
<br>> > 2) Each file is assigned at max one content and one source.<br>><br>> Good - as we all agreed on :-)<br>><br>> ------------<br>><br>> > Issues:<br>> ><br>> > Maybe we need a better name for MailboxItem and ArchiveItem?
<br>><br>> I think we should scrap the Item part of those words. This cat name is not<br>> describing what the object *is* but what the object comes from. With the<br>> Item postfixes it sounds like the object with Source=ArchiveItem comes from
<br>> an item within an archive (e.g. a jpg in a pdf in a zip).<br><br>We need to agree on a consistent Source naming.<br>Source-Source Item examples:<br>Filesystem -File<br>Archive -ArchiveItem<br>Email -Attachment
<br><br>It seems reasonable to adopt either:<br>* this is contained in a [Filesystem,Archive,Email]<br>* this is a [file, archiveitem, attachment]<br><br>But not both at the same time.</blockquote><div><br>Right. This is tricky. I really think the "this comes from"-metaphor is the closest to the intention. The "this is a"-metaphor is already what categories imply.
<br><br>Because of this I also think that Mailbox is a better source name than Email. The Attachment case is more subtle because in some way it does make sense to say that "holiday1.jpg comes from an attachment". I can easily imagine several arguments against this metaphor, but it is really not a clear-cut case.
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> Still not decided on how to PIM stuff.<br>><br>><br>> Could we rename Todo to Task instead then? Sounds less nerdy :-)
<br><br>This is the first time in my life someone calls Todo nerdy. </blockquote><div><br>About time someone broke it to you then :-) "Todo" *is* a geek term - at least my wife never used it before she met me :-)
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>> Fields on the Task cat could be Summary, Priority, DueDate. Stuff like a
<br>> Summary and Description can be derived from fields in the Content cat.<br><br>There's a good reference: iCalendar. Have to strip many fields to make it<br>usable though.<br><br>The problem with these PIM things is like this: We have 6 fields and 5 PIM
<br>classes. Each one uses 5 fields out of 6, and each one uses a different set.</blockquote><div><br>Eeek. Good that I'm not the ontology maintainer ;-P <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> Need to revamp media ontology.<br>><br>> > Can we count on backends being able to figure out list lengths? i.e. if<br>> > we have Software.depends relation, do we need Software.dependCount? I<br>> > think no.
<br>> > Either we have a *count property for things we don't describe, or we have<br>> > a<br>> > list of things and no *count property.<br>><br>> Hmmm... This is a tricky case. The query language cannot handle this atm. -
<br>> I.e. searching for all SourceCode items with more than 10 dependencies, for example -<br>> unless the number of deps is explicitly stored in a field.<br><br>Still potentially every list field asks for an item count companion.
</blockquote><div><br><br>We will need some feedback from the various projects on this. I'm not sure it is even possible to query the length of a list if you intend to keep decent performance. But that is probably very implementation-specific.
<br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> Should we elaborate comment stats for SourceCode along the way of text
<br>> stats<br>><br>> > or commentCharacterCount is sufficient?<br>><br>> It is sufficient for the only use case I can come up with. Finding<br>> under-documented stuff. I think we should keep it at this.
<br><br>You should consider that Xesam or rather Xesam indexers will also double as a<br>meta-data extraction tool possibly via other APIs.<br><br>For me comments are useful to find user-documented stuff and evaluate just how
<br>much documentation there is. commentCharCount seems to be sufficient for<br>that.</blockquote><div><br><br>Ok, let's keep it at that for now then. <br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> Questions:<br>><br>> Afaik PDFs (and other office docs) can be password protected. Perhaps<br>> isPassWordPretected should be moved to contents?<br><br>These are two different things. In case of ArchiveItem password protection is
<br>external to the file, provided by archiver. In case of documents, the<br>password protection is internal.<br><br>ATM it seems like a good idea to implement it the same way as with keywords.<br>Will think more about it of course.
</blockquote><div><br><br>I'm afraid I can't see the problem here. There might be different implementations behind the different password protection mechanisms, but all that we are interested in is whether or not the file is protected.
<br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> Is there any general field that names the origin of a file? E.g. the URL of
<br>> a downloaded file?<br><br>I see you are trying to integrate here one of FDO recommendations. Seems like<br>a good idea especially if others use this extended attribute as well.<br><br>> You've added MediaList and AudioList. Would it not make sense to have a
<br>> generic List object? E.g. a series of images or documents might form a<br>> slideshow. Perhaps a better metaphor would be Collection. E.g. most IDEs have<br>> a project metaphor where a bunch of files is part of a project. With the
<br>> collection metaphor we could model this.<br>><br>> Now I'm at it - why not a Project cat? It could be a subcat of my suggested<br>> Collection cat. Projects have names, versions, etc... I have several
<br>> programs that install project files.<br><br>Basically Content already implements collection and container functionality.</blockquote><div><br><br>You mean via the content.contains, links, depends fields? It might still be useful to have some cats for this though - as I assume Content will be an abstract cat.
<br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">So the only thing we need to do is to add a tree of collections since<br>different collections imply different content types of things they link to.
<br>Can't think of any specific properties for most collections. Project can be<br>quite different though.<br><br>I added these sample collections to have people scream "so little! so limited!<br>we need, no we demand more!" and actually provide a useful list :)
</blockquote><div><br>Ok, I really don't think we should forget about a Project category though.<br></div><br><br>Cheers,<br>Mikkel<br></div>