extended attribute standardization
Claes at work
claesatwork at gmail.com
Fri Nov 24 01:25:59 EET 2006
On 11/18/06, Stephen Watson <stephen at kerofin.demon.co.uk> wrote:
> On the ROX side, ROX-Filer uses and can set user.mime_type, ROX-Lib contains
> functions for setting and reading arbitrary xattrs, the next release of
> ROX-CLib will contain similar functions and the next release of Fetch can
> set user.xdg.origin.url.
That is certainly good news! But before doing this, please read on.
I have made some additions to the wiki page. Regarding the earlier
suggestion to use Dublin Core, I think it makes sense for xdg to use
the same names and semantics for attributes as Dublin Core as far as
possible, but in a somewhat different context.
Basically I propose that the user.xdg namespace should borrow Dublin
Core semantics where applicable, and apply them to metadata that can be
derived from circumstances outside of the actual file contents: things
that the saving application knows but the file itself does not record.
The user.dublincore namespace should be used for things that are
already stored in the file, in situations where it is useful to
duplicate it. For example, a Dublin Core extractor could crawl files,
extract such metadata and add it to them as xattrs in the
user.dublincore namespace. A web server could then supply that
metadata quickly and easily if some standard for sending Dublin Core
metadata in HTTP headers emerges in the future. (Apache
mod_mime_xattr does something similar.)
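To make the split between the two namespaces concrete, here is a minimal
sketch in Python. It is only an illustration under assumptions: the
helper names build_xattrs and apply_xattrs, the key "title" under
user.dublincore, and the example URL are all hypothetical. Only
user.xdg.origin.url and the two namespace prefixes come from the
proposal itself.

```python
import os


def build_xattrs(context, embedded):
    """Map metadata to xattr names (hypothetical helper).

    context:  facts only the saving application knows -> user.xdg.*
    embedded: facts copied out of the file contents   -> user.dublincore.*
    """
    attrs = {}
    for key, value in context.items():
        attrs["user.xdg." + key] = value.encode("utf-8")
    for key, value in embedded.items():
        attrs["user.dublincore." + key] = value.encode("utf-8")
    return attrs


def apply_xattrs(path, attrs):
    """Write the attributes to a file.

    os.setxattr is available on Linux only, and the filesystem must
    have user extended attributes enabled.
    """
    for name, value in attrs.items():
        os.setxattr(path, name, value)


# Example values are made up for illustration.
attrs = build_xattrs(
    {"origin.url": "http://example.org/report.odt"},  # known to the downloader
    {"title": "Annual report"},                       # duplicated from the file
)
```

A downloader would only ever fill in the first argument, while an
extractor that crawls existing files would only fill in the second.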
An example to illustrate the distinction: Dublin Core defines the
element "Source" as "A Reference to a resource from which the present
resource is derived. The present resource may be derived from the
Source resource in whole or part. Recommended best practice is to
reference the resource by means of a string or number conforming to a
formal identification system." For an OpenDocument file, the value
could be something like "Image from page 54 of the 1922 edition of
Romeo and Juliet" (which is taken from "Using Dublin Core",
http://dublinCore.org/documents/usageguide/ ). However, if the file
was sent as an attachment in an email, it can also be said to be
derived from this email. I initially suggested using the
user.xdg.origin.* attributes for this scenario, but now I wonder
whether it should use user.xdg.source.* instead. I would like feedback
on this, especially if user.xdg.origin.* is already in use somewhere.
I also discovered that the usage of "Creator" differs between current
practice in ROX Contact Manager and Dublin Core. In the first case it
means the application that saved the file ("Contact") and in the
second case the person/organisation that created the content. Dublin
Core defines "Publisher" : "The entity responsible for making the
resource available. Examples of a Publisher include a person, an
organization, or a service. Typically, the name of a Publisher should
be used to indicate the entity." If we interpret the application as
being a service, then we could use user.xdg.publisher as the name of
the application that saved the file, rather than user.xdg.creator.
Language is a useful attribute for indexers, as was mentioned in
another thread. It can also be derived in many cases when a file is
downloaded using HTTP. Language is also defined in Dublin Core. I
added user.xdg.language for the benefit of indexers.
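As a rough sketch of how a downloader might derive this attribute, the
Python below takes the response headers as a plain dict and looks for
Content-Language. The function name language_xattr is hypothetical, and
storing the header value verbatim is my assumption; the proposal does
not yet say what format the value should have.

```python
def language_xattr(headers):
    """Return (xattr name, value) from HTTP response headers, or None.

    Assumption: the Content-Language value is stored unchanged.
    """
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-language" and value.strip():
            return ("user.xdg.language", value.strip().encode("utf-8"))
    return None
```

The returned pair can be passed to os.setxattr on Linux once the file
has been saved.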
Finally I added some ideas on how to tag files and directories to control
indexing or backup using xattrs, inspired by how it works on the web.