[Xesam] abstract properties?
phreedom.stdin at gmail.com
Wed Feb 13 17:15:04 PST 2008
This is in response to the lengthy discussion on #xesam that happened while I
>(22:35:51) kamstrup: in other words a field is abstract if and only if it
>(22:36:17) jamiemcc: yes and is not used in searches
>(22:36:30) kamstrup: also meaning that third parties can not extend fields
which does not have any children in the Xesam onto
>(22:36:45) kamstrup: moreover I also think we agreed that you can not
assign any value to an abstract field
>(22:36:54) kamstrup: (maybe obvious)
>(22:36:57) jamiemcc: yes
>(22:37:12) kamstrup: good, I think we agree then
>(22:37:15) jamiemcc: abstract are like intermediate classes
>(22:37:22) kamstrup: yes
>(22:37:31) jamiemcc: they ar enot used directly but instead are always
>(22:37:36) kamstrup: only leaf nodes of the onto can contain values
The benefits of this approach:
>(22:54:46) kamstrup: and having this as a restriction in Xesam does not
>render us incompatible with Nepo
This renders xesam incompatible with most, if not all, RDFS-based approaches.
The xesam->rdfs_derivative mapping is OK, but it breaks in the opposite direction.
>(23:37:07) kamstrup: but it is even more likely that there are two good
>(23:37:44) kamstrup: my primary arg is simplicity
Certainly not the simplicity of the ontology, and not the simplicity (and
feasibility) of onto extensions.
>(23:38:03) kamstrup: query expansion will also be easier
So your argument is that having to expand (grandparent=value) into
((grandparent=value) or (parent=value) or (property=value)) if we get rid
of "abstract" properties somehow leads to too much complexity?
Query expansion will necessitate property tree traversal in both cases. The
only difference is what fields are included in the resulting list: all or
The DB engine will discard the grandparent and parent criteria as soon as
it sees that the corresponding tables are empty. No performance overhead here.
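To make the point concrete, here is a minimal sketch of query expansion in both
models. It is an illustration only, not Xesam code: the PARENT table, the
function names and the leaves_only flag are all made up for this example, and
the field names are the placeholders from the paragraph above. Both variants
walk the same property tree; they differ only in whether branch nodes end up in
the resulting list.

```python
# Hypothetical property tree, child -> parent. The names are the placeholder
# fields from the example above, not real Xesam fields.
PARENT = {"parent": "grandparent", "property": "parent"}


def children_of(field):
    """Return the direct children of a field in the hypothetical tree."""
    return [child for child, parent in PARENT.items() if parent == field]


def expand(field, leaves_only):
    """Return the fields a query on `field` has to be matched against.

    leaves_only=True models the "abstract fields" rule: branch nodes never
    hold values, so they are left out. The traversal itself is identical.
    """
    result = []
    stack = [field]
    while stack:
        current = stack.pop()
        kids = children_of(current)
        if not (leaves_only and kids):
            result.append(current)
        stack.extend(kids)
    return result


def to_query(field, value, leaves_only):
    """Render the expanded field list as an OR-ed query string."""
    return " or ".join(f"({f}={value})" for f in expand(field, leaves_only))


print(to_query("grandparent", "value", leaves_only=False))
print(to_query("grandparent", "value", leaves_only=True))
```

The only line that changes between the two models is the `if` inside the loop,
which is the whole of the "extra complexity" under discussion.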
>(23:38:35) kamstrup: the onto as such will be easier to compute on
This is a duplicate of the above.
>(23:38:48) kamstrup: it is a nice invariant that only leaf nodes can
>(23:39:02) kamstrup: and that you can not create sub-fields of leaf nodes
Is this a benefit at all? Especially considering that the xesam onto doesn't
usually define data format-specific stuff (except for email). That is, we don't
define ODFDocument, MSOOXMLDocument (sorry for cursing) or PDFDocument. We only
define a Document and have no way to know whether it will serve us well or will
need to be extended.
>(00:28:27) vandenoever: kamstrup: suppose you have a type 'giraffe' and you
label certain animals with it
>(00:28:52) vandenoever: then you find out that there are subspecies
>(00:28:52) vandenoever: what do you do?
>(00:28:53) vandenoever: change all the old data?
>(00:29:01) kamstrup: add a field "subSpecies"
>(00:30:23) kamstrup: if I did this it would be bad design be me in the
It's not about bad design. It's about (surprise!) simplicity. What you are
proposing is that for every type of data, we need to define an abstract class
and a specific class, e.g. AbstractTextDocument and SpecificTextDocument.
Otherwise you will be either unable to extend the SpecificTextDocument or
unable to actually create instances of AbstractTextDocument.
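A small sketch of the two rules as I understand them from the log (only leaf
nodes hold data; a node that already holds data cannot get children). The Onto
class and its methods are hypothetical and exist only for this illustration;
RichTextDocument and notes.txt are made-up names as well.

```python
class Onto:
    """Toy ontology enforcing the two rules discussed in the chat."""

    def __init__(self):
        self.parent = {}     # node -> parent node (None for roots)
        self.instances = {}  # node -> items assigned directly to that node

    def has_children(self, node):
        return any(p == node for p in self.parent.values())

    def add(self, node, parent=None):
        # Rule 2: a node that already holds data is "final" and cannot be extended.
        if parent is not None and self.instances.get(parent):
            raise ValueError(f"{parent} already holds data; cannot extend it")
        self.parent[node] = parent
        self.instances.setdefault(node, [])

    def assign(self, node, item):
        # Rule 1: a node with children is abstract and cannot hold data.
        if self.has_children(node):
            raise ValueError(f"{node} is abstract; cannot hold data")
        self.instances[node].append(item)


onto = Onto()
onto.add("AbstractTextDocument")
onto.add("SpecificTextDocument", parent="AbstractTextDocument")
onto.assign("SpecificTextDocument", "notes.txt")

for attempt in (
    lambda: onto.assign("AbstractTextDocument", "notes.txt"),
    lambda: onto.add("RichTextDocument", parent="SpecificTextDocument"),
):
    try:
        attempt()
    except ValueError as err:
        print("rejected:", err)
```

Both attempts are rejected, which is why every extensible type ends up needing
the abstract/specific pair.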
>(00:30:51) vandenoever: kamstrup: and humans will always make bad designs
>(00:30:55) kamstrup: if I was not sure that there was exactly one type of
giraffe, then I should have made a group for it
>(00:31:00) kamstrup: yes
>(00:31:04) kamstrup: you are right
>(00:31:04) vandenoever: it's not about perfection, it's about usability
>(00:31:20) kamstrup: and my solution with a subSpecies field is perfectly
ok in my book
>(00:32:47) kamstrup: It goes to show that you may not be able to solve the
problem "correctly", but that you can apply a perfectly fine workaround
With this workaround you're effectively throwing out semantics such as generic
inheritance and limitations on what properties can be assigned to a class.
Oh, and can't resist commenting on this wonderful article:
> (00:13:26) kamstrup: vandenoever:
> (00:13:29) kamstrup: it is a must read
> (00:13:33) kamstrup: for any programmer
> (00:13:48) vandenoever: Architecture Astronauts: they are way out there!
So this guy takes some marketing hype, puts it in the mouths of unrelated dev
guys and blames them for it?
(22:31:45) kamstrup: Phreedom: If we can get a word with jamiemcc we might
be able to clear up the thing about abstract fields...
(22:31:58) kamstrup: (both of you : consider the last line a ping)
(22:32:17) jamiemcc: pong!
(22:34:22) kamstrup: ouch
(22:34:33) kamstrup: now we just wait for a pong from Phreedom
(22:34:56) kamstrup: jamiemcc: anyway, the case is the following:
(22:35:30) kamstrup: I am of the belief that we agreed on that fields where
abstract if they had any children
(22:35:41) jamiemcc: yes
(22:35:51) kamstrup: in other words a field is abstract if and only if it
(22:36:17) jamiemcc: yes and is not used in searches
(22:36:30) kamstrup: also meaning that third parties can not extend fields
which does not have any children in the Xesam onto
(22:36:45) kamstrup: moreover I also think we agreed that you can not assign
any value to an abstract field
(22:36:54) kamstrup: (maybe obvious)
(22:36:57) jamiemcc: yes
(22:37:12) kamstrup: good, I think we agree then
(22:37:15) jamiemcc: abstract are like intermediate classes
(22:37:22) kamstrup: yes
(22:37:31) jamiemcc: they ar enot used directly but instead are always
(22:37:36) kamstrup: only leaf nodes of the onto can contain values
(22:37:53) jamiemcc: yes - its unlikely a leaf node would need a child in
(22:38:08) kamstrup: and it will not be able to get one
(22:38:13) kamstrup: without us breaking api
(22:38:18) jamiemcc: consider them final classes!
(22:38:23) kamstrup: yes
(22:38:25) kamstrup: now
(22:38:48) kamstrup: Phreedom is of the opinion that a field with children
is not necesarrily abstract
(22:39:10) kamstrup: s/sar/ssa
(22:40:15) kamstrup: I want to know what Jos expects
(22:40:29) kamstrup: unfortunately I can not find IRC logs about us
(22:40:39) kamstrup: but I am quite sure we agreed on what we just talked
(22:40:43) jamiemcc: Well EG take Media as a class
(22:40:51) jamiemcc: it has 2 children Video and Audio
(22:41:14) jamiemcc: no enetity will explicity be of class media
(22:41:22) jamiemcc: it will be either video or audio
(22:41:26) kamstrup: jamiemcc: Media is a content type not a field
(22:41:42) kamstrup: abstractness only makes sense on fields
(22:41:56) kamstrup: content and source types does not contain data
(22:41:58) jamiemcc: oh ok
(22:42:19) jamiemcc: oh yeah in tracker we use DC as abstract hierarchy
(22:42:26) kamstrup: otoh, an item can not be of an abstract content type
(22:42:42) jamiemcc: DC:Subject never has a value but its children
(22:43:14) kamstrup: right
(22:43:46) kamstrup: jamiemcc: in tracker - is it possible to retrieve the
contents of an abstract field?
(22:43:49) jamiemcc: so is there a counter example?
(22:43:52) kamstrup: (i would expect not)
(22:43:56) jamiemcc: no
(22:44:01) kamstrup: good
(22:44:02) jamiemcc: but you can search with it
(22:44:06) kamstrup: yes
(22:44:18) kamstrup: that is also how Xesam is modelled
(22:44:36) kamstrup: I am just trying to make sure that I understand the
(22:44:44) kamstrup: and that we all agree on what we agree on :-)
(22:46:45) jamiemcc: its unlikely i would agree to allow non leaf nodes to
have values without very good reasons
(22:54:17) kamstrup: I am quite sceptic about it too
(22:54:46) kamstrup: and having this as a restriction in Xesam does not
render us incompatible with Nepo
(23:24:41) # vandenoever joins channel #xesam
(23:24:45) vandenoever: ello
(23:24:48) kamstrup: yo
(23:25:06) kamstrup: I discussed the onto with Phreedom the other day
(23:25:17) kamstrup: and it turned out we did not agree on some of the
(23:25:22) kamstrup: and I wanted your opinion
(23:25:35) kamstrup: I have just had a small chat with jamiemcc about it
(23:25:45) kamstrup: Here's what:
(23:26:06) kamstrup: I believe we agreed on that an abstract field was one
(23:26:27) kamstrup: ie, that the only fields which can contain values was
the leaf nodes of the onto
(23:26:51) kamstrup: Phreedom thought otherwise
(23:26:53) vandenoever: kamstrup: i never heard that discussion
(23:26:59) kamstrup: ok
(23:27:11) vandenoever: it seems to me that also branch points can have
(23:27:25) vandenoever: but that was simply my assumption so far
(23:27:28) kamstrup: I was just sure that "we" (for a suitable definition
of "we") discussed this at some point
(23:27:35) vandenoever: are there arguments for or against?
(23:27:37) kamstrup: but I can not find an IRC log of it
(23:27:52) kamstrup: well
(23:27:55) kamstrup: I hae one for
(23:27:57) vandenoever: kamstrup: maybe while i was offline
(23:28:01) kamstrup: perhaps
(23:28:12) kamstrup: consider xesam:legal
(23:28:32) kamstrup: it has a few children xesam:copyright, disclaimer, etc
(23:28:50) vandenoever: also the notion of 'abstract field' is new to me,
(23:28:54) kamstrup: ok
(23:29:09) kamstrup: "abstract" will just mean that it can not have a value
(23:29:32) kamstrup: but you can search them
(23:29:46) vandenoever: so they are only aliases for groups
(23:29:52) kamstrup: right
(23:30:26) kamstrup: a group might be more spot on, because that is really
what an abstract field is
(23:30:36) kamstrup: a collection of related fields
(23:31:16) kamstrup: with this picture it does not make sense to return the
value of xesam:legal
(23:31:27) kamstrup: because - what value to return?
(23:31:44) kamstrup: (without the notion of a canonical child field)
(23:32:12) vandenoever: canonial child field?
(23:32:25) vandenoever: but fields have canonics, right?
(23:32:27) kamstrup: a given one to pick as default within the group
(23:32:36) vandenoever: e.g. we can say a field can only appear once
(23:32:37) kamstrup: (xesam will not have this)
(23:32:46) kamstrup: ?
(23:32:58) vandenoever: e.g. a picture can have only one width
(23:33:31) kamstrup: yes, but I don't see where you are going...
(23:33:36) vandenoever: kamstrup: you're saying xesam:legal can have only 1
(23:33:41) kamstrup: no
(23:33:45) vandenoever: kamstrup: never mind, you lead
(23:33:57) kamstrup: I was trying to say that xesam:legal can not have a
(23:34:06) kamstrup: because it would be ambiguous
(23:35:05) vandenoever: but it would not be if you had a limit on the # of
(23:36:13) vandenoever: i see the difficulty though
(23:36:18) kamstrup: sorry, I still don't get what you mean
(23:36:35) vandenoever: i'm sure there is a good argument somewhere in
literature that decides this question
(23:36:45) kamstrup: that is very likely
(23:36:49) vandenoever: certainly the nepomuk guys know this kind of stuff
(23:37:07) kamstrup: but it is even more likely that there are two good
(23:37:44) kamstrup: my primary arg is simplicity
(23:38:03) kamstrup: query expansion will also be easier
(23:38:35) kamstrup: the onto as such will be easier to compute on
(23:38:48) kamstrup: it is a nice invariant that only leaf nodes can contain
(23:39:02) kamstrup: and that you can not create sub-fields of leaf nodes
(23:39:21) vandenoever: i do not get the latter one
(23:39:33) vandenoever: ah, now i do
(23:39:42) kamstrup: sure?
(23:40:03) vandenoever: yeah, if you allow to assign a value to field X, you
cannot later fork it
(23:40:09) kamstrup: right
(23:40:11) vandenoever: because it would render X abstract
(23:40:15) kamstrup: yes
(23:40:19) kamstrup: an API break
(23:40:32) vandenoever: so this would severly limit the flexibility of your
(23:40:38) kamstrup: yes
(23:40:41) kamstrup: that is the purpose
(23:41:07) kamstrup: it gives a very clear cut image of the onto
(23:41:18) kamstrup: it consists of fields and "groups"
(23:41:27) kamstrup: as you put it
(23:41:33) kamstrup: very easily grokkable
(23:41:40) vandenoever: but it breaks compatibility with the mainstream
(23:42:03) kamstrup: huh? which?
(23:42:24) vandenoever: e.g. dublin
(23:42:53) kamstrup: afaik there is no grouping/hierarchy in dc
(23:43:43) vandenoever: goes to check
(23:44:41) kamstrup: also just checked
(23:45:23) vandenoever: 'Narrower Than:' means 'subcategory'
(23:45:38) vandenoever: e.g. Image and MovingImage
(23:45:48) kamstrup: ah, ok
(23:47:14) kamstrup: well
(23:47:31) kamstrup: I think it is up to interpretation whether or not you
can assign the type Image to something
(23:47:49) kamstrup: or if you have to use one of its children
(23:48:05) kamstrup: clearly any image is either Still or Moving
(23:48:36) kamstrup: so it would be odd to assign only Image and not
(23:49:17) vandenoever: the narrower/broader is only used once, so it's hard
(23:49:22) kamstrup: yes
(23:50:07) kamstrup: consider again xesam:legal
(23:50:08) vandenoever: question if a 'super-class' is instantiable
(23:50:13) kamstrup: what is the use?
(23:50:20) kamstrup: what should I assign to it?
(23:50:52) kamstrup: is it when I find legal info that does not fit one of
(23:50:59) kamstrup: that would be a bad idea at least
(23:51:23) kamstrup: because I might overwrite valuable info somebody else
(23:51:46) vandenoever: let's consider actor, director, painter, singer
(23:51:52) vandenoever: all of them are artists
(23:51:59) vandenoever: some are actor and painter
(23:52:26) vandenoever: if i have folder with movies and songs, i'd like to
have a column 'artist'
(23:52:41) vandenoever: and not 'director' and 'singer' in separate columns
(23:52:46) vandenoever: (not always at least)
(23:53:35) vandenoever: to me it seems these ontologies can be so broad in
(23:54:35) vandenoever: there may be computational advantages now, but
sooner or later practical problems will turn up
(23:54:50) kamstrup: in your case I think the app would detect the media
(23:55:00) vandenoever: let's consider the numbers
(23:55:05) kamstrup: and display the matching child of artist
(23:55:17) vandenoever: http://en.wikipedia.org/wiki/Number
(23:55:50) vandenoever: the set of rational numbers is part of the set of
(23:56:02) vandenoever: but not all real numbers fall into some other
(23:56:22) kamstrup: yes they do
(23:56:29) vandenoever: so if you make an ontology of numbers, you cannot
put pi somewhere
(23:56:33) kamstrup: either you are rational or irrational
(23:56:38) kamstrup: that goes for humans too ;-P
(23:56:46) vandenoever: damn, wrong example
(23:56:48) vandenoever: oops
(23:57:05) vandenoever: so we have the complex numbers and the uncomplex
(23:57:07) kamstrup: you are talking to a math student
(23:57:39) vandenoever: ok, so where do you place '1' ?
(23:57:51) kamstrup: on the real line
(23:57:58) kamstrup: I see
(23:58:05) kamstrup: that 1 is also a complex number
(23:59:17) vandenoever: wonders if there is an ontology of numbers
(23:59:24) kamstrup: well
(23:59:30) kamstrup: in some sense there is
(23:59:39) kamstrup: it is called number theory
(23:59:44) kamstrup: :-)
(23:59:47) vandenoever: is it computer parsable?
(23:59:52) kamstrup: nope
(00:00:05) kamstrup: it does not have finitely many elements
(00:00:18) kamstrup: infact I suspect it has uncountably many
(00:00:33) vandenoever: what, more than 100?
(00:00:38) vandenoever: wow!
(00:00:41) kamstrup: yes
(00:00:53) kamstrup: it is bigger than G_MAX_ULONG
(00:00:55) vandenoever: i'd say number theory is a good benchmark for
(00:01:01) kamstrup: hehe
(00:01:06) vandenoever: s/ontologies/ontology frameworks/
(00:02:24) kamstrup: Hmmm, maybe I should try and come up with something
that could model the set of number fields
(00:02:34) kamstrup: it is dangerous ground here
(00:02:55) kamstrup: because in math a "field" is actually a set of number
(00:02:59) kamstrup: numbers
(00:03:21) vandenoever: kamstrup:
(00:04:07) kamstrup: what about it?
(00:04:12) vandenoever: look at 'decimal' and 'integer'
(00:04:33) kamstrup: eeek
(00:04:53) vandenoever: actually, the whole hierarchy was meant as an
example, but i just noticed that one
(00:05:11) kamstrup: this is the exact thing I am trying to avoid for Xesam
(00:05:24) vandenoever: kamstrup: it is possible to make in instance of each
of those fields
(00:05:28) kamstrup: it looks like an example of twenty guys picking on
(00:05:57) kamstrup: I am more pragmatic
(00:05:58) vandenoever: kamstrup: it's also quite pervasive by now
(00:06:04) kamstrup: yes
(00:06:06) kamstrup: I know
(00:06:20) kamstrup: but that does *not* mean that it is good
(00:06:26) kamstrup: Java is also quite pervaive
(00:06:40) kamstrup: but almost everybody agree that it has a very messy api
(00:06:43) vandenoever: kamstrup: and java is also very nice
(00:06:51) kamstrup: yes
(00:07:09) kamstrup: but why does ByteArrayInoutStream throw an IOException
(00:07:10) vandenoever: so we continue in esperanto?
(00:07:23) vandenoever: ontologies are like languages and languages are
(00:07:26) kamstrup: sorry, crying kid, 2s
(00:07:52) vandenoever: perfect onto is impossible, same is true for perfect
(00:08:06) vandenoever: see GEB
(00:08:29) kamstrup: back
(00:09:45) kamstrup: vandenoever: I read a good quote by one of the MS
(00:10:15) vandenoever: kamstrup: funny
(00:10:21) kamstrup: "Make sure that the simple stays simple, but don't be
tempted to make the complex simple"
(00:10:36) kamstrup: (very paraphrased)
(00:10:38) vandenoever: prefers the einstein quote
(00:11:00) vandenoever: also it does not say anything
(00:11:24) kamstrup: my quote or the einstein quote?
(00:11:30) vandenoever: both :-)
(00:11:39) kamstrup: he
(00:11:51) vandenoever: einstein: "make everything as simple as possible,
but not simpler"
(00:12:00) kamstrup: that is not the same
(00:12:39) kamstrup: vandenoever: have you read Joel On Software,
about "Architecture Astronauts"?
(00:12:44) vandenoever: so how is the ms quote relevant?
(00:12:46) vandenoever: no
(00:12:58) kamstrup: that is one of my all time favorite blog posts
(00:13:26) vandenoever: i agree that we should KISS, but apparently whether
adding restrictions makes things simpler
(00:13:26) kamstrup: vandenoever:
(00:13:29) kamstrup: it is a must read
(00:13:33) kamstrup: for any programmer
(00:13:48) vandenoever: Architecture Astronauts: they are way out there!
(00:14:20) vandenoever: kamstrup: i agree with the post already, i prefer to
(00:16:43) vandenoever: kamstrup: he's mixing architecture and framework in
(00:16:57) kamstrup: yeah, he is not always 100% right
(00:17:06) kamstrup: but he has a lot of really good points
(00:17:27) vandenoever: so, back to the issue
(00:17:56) vandenoever: i'd prefer to keep the xesam onto similar to the
(00:18:13) vandenoever: and also similar to rdfs ontos
(00:18:34) vandenoever: because it would allow us to reuse other peoples
(00:18:44) vandenoever: and actually try to be compatible
(00:18:59) kamstrup: I want to be Nepo compat too
(00:19:02) kamstrup: fear not
(00:21:38) kamstrup: is digging in to the Nepomuk docs
(00:22:04) kamstrup: and he is very intimidated by the extent of the
(00:24:48) kamstrup: vandenoever: Consider
(00:25:11) kamstrup: Looking at the commentary on that makes me say that you
can not assign a value to it
(00:25:45) kamstrup: it does however have the data type of string
(00:25:52) kamstrup: which I must consider a bug
(00:25:59) kamstrup: or a bug in the comment string
(00:27:11) vandenoever: kamstrup: NIE is defined in NRL:
(00:27:20) kamstrup: I would consider it outright dangerous if applications
startedd to put stuff in nie:legal
(00:28:27) vandenoever: kamstrup: suppose you have a type 'giraffe' and you
label certain animals with it
(00:28:52) vandenoever: then you find out that there are subspecies
(00:28:52) vandenoever: what do you do?
(00:28:53) vandenoever: change all the old data?
(00:29:01) kamstrup: add a field "subSpecies"
(00:30:23) kamstrup: if I did this it would be bad design be me in the first
(00:30:51) vandenoever: kamstrup: and humans will always make bad designs
(00:30:55) kamstrup: if I was not sure that there was exactly one type of
giraffe, then I should have made a group for it
(00:31:00) kamstrup: yes
(00:31:04) kamstrup: you are right
(00:31:04) vandenoever: it's not about perfection, it's about usability
(00:31:20) kamstrup: and my solution with a subSpecies field is perfectly ok
in my book
(00:32:47) kamstrup: It goes to show that you may not be able to solve the
problem "correctly", but that you can apply a perfectly fine workaround
(00:34:15) vandenoever: that is incompatible with a solution some else might
(00:34:53) kamstrup: by "incompatible" you mean that there is no direct 1-1
map. But I can still create a script or something that does a clean 1-1 map
(00:35:04) kamstrup: after all we are describing the same data set
(00:35:17) kamstrup: and I will have all relevant info
(00:36:19) kamstrup: vandenoever: My take on it is like this
(00:36:40) kamstrup: using groups is like using interfaces in Java
(00:37:03) kamstrup: allowing fields with children to have values is like
using classes all the way
(00:37:18) kamstrup: ie, never use java interfaces, but do subclassing
(00:37:51) kamstrup: sloppy programming
(00:38:20) vandenoever: but i do not want to go mapping
(00:38:39) vandenoever: not too much anyway and not in a complex way like
(00:38:58) kamstrup: I don't see why we would have to map to anything
(00:40:49) vandenoever: kamstrup: i need some sleep, shall we ask Phreedom
to let the nepomuk guys talk a bit about this?
(00:40:57) vandenoever: kamstrup: are you coming to fosdem?
(00:41:46) kamstrup: vandenoever: No, unfortunately not
(00:42:07) vandenoever: ah, too bad, trueg is coming and also one of the
(00:42:28) kamstrup: I am thinking that we should use some conferencing
software at some oint
(00:42:39) vandenoever: anyway, i'll catch you later, it's good to discuss
stuff like this
(00:42:40) kamstrup: to talk "face to face" all of us
(00:42:48) kamstrup: sleep tight
(00:42:50) vandenoever: kamstrup: i've an n800 :-)
(00:42:56) kamstrup: bastard!
(00:43:03) kamstrup: show off!
(00:43:08) vandenoever: i look really ugly on it
(00:43:21) kamstrup: hehe, we can turn video off :-)
(00:43:35) vandenoever: ok zzz
(00:43:35) # vandenoever: Offline.
(00:43:35) # vandenoever leaves the channel (Quit: "Remote closed the
(00:53:34) # kamstrup: Offline.
(00:53:34) # kamstrup leaves the channel (Quit: "kornbluth.freenode.net
(00:53:36) # kamstrup joins channel #xesam
(01:00:31) kamstrup: jamiemcc: Have you started the Xesam impl? If not I
have a few things to discuss with you about xesam-glib
(01:01:04) kamstrup: I am thinking about (or more correctly, I am going to)
putting some code there for implementing a basic server
(01:02:02) kamstrup: if you have not started, I wanted to hear if you where
interested in using it (if not Tracker's need is to specialized for a general
(01:02:18) kamstrup: it is just that a lot of the session/search book
keeping can be generified
(01:09:38) # kamstrup: Offline.
(01:09:38) # kamstrup leaves the channel (Quit: ""Ex-Chat"" ).