mime-type/application mapping spec, take #2

David Faure dfaure at trolltech.com
Mon Jul 7 20:09:49 EEST 2003

On Tuesday 01 July 2003 19:08, Dave Cridland [Home] wrote:
> > > You're after a relatively high degree of complexity here, and I'm not
> > > certain that's actually needed - just more thought on choice of media
> > > type.
> > I suppose you're referring to the inheritance idea? We need it for many
> > other reasons: to be able to rename a mimetype (and install an alias
> > for the older name), to have specialized folder types (like e.g. an SMB
> > host is almost like a directory, but with a special icon), etc. etc.
> > There are many cases of mimetype inheritance, this isn't just an idea
> > to "add complexity". We've been needing this for years.

I just found more cases for mimetype inheritance.
text/docbook is a special case of text/sgml, for instance.
If you have no application that can handle docbook, then you want
to see those that can handle sgml, since they'll work fine with a docbook file.
The same can be said for all sgml variants, including all xml variants, etc.
We have a real inheritance tree there.
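
To make the fallback concrete, here is a rough sketch in Python of the lookup I have in mind. The parent table, the handler table, and the application names are all made up for illustration; this is not the shared-mime-info format or any existing API, just one way the tree could be walked.

```python
# Hypothetical parent table: each mimetype names the type it inherits from.
PARENTS = {
    "text/docbook": "text/sgml",
    "text/xml": "text/sgml",
    "text/sgml": "text/plain",
}

# Hypothetical application registry, keyed by the exact mimetype handled.
HANDLERS = {
    "text/sgml": ["sgml-editor"],
    "text/plain": ["plain-editor"],
}


def ancestors(mimetype):
    """Yield the mimetype itself, then each parent up the tree."""
    seen = set()
    while mimetype is not None and mimetype not in seen:
        seen.add(mimetype)  # guard against a cycle in a broken table
        yield mimetype
        mimetype = PARENTS.get(mimetype)


def applications_for(mimetype):
    """Return the handlers of the nearest type in the chain that has any."""
    for candidate in ancestors(mimetype):
        apps = HANDLERS.get(candidate)
        if apps:
            return apps
    return []
```

With these tables, asking for text/docbook finds nothing registered for docbook itself and falls back to the text/sgml handlers, which is the behaviour described above.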

> Media types do need some form of canonicalization before any application
> does anything with them, hence my somewhat pie-in-the-sky suggestion of
> a formal registry at XDG for them. 

That is exactly what shared-mime-info provides: http://www.freedesktop.org/standards/shared-mime-info
(get the tarball to see it).

> As for SMB hosts being "like" a directory, you surely mean "can be
> presented to the user like", since the actual access methods used are
> wildly different.
No, not for us at least. You click on it, and the file manager enters it, appends
it to the URL, and asks the kioslave (KDE's VFS layer) to list it. To the file
manager it really is a directory.

> 1) Vorbis is the sound format of Ogg, Ogg does indeed have the stated
> aim of producing a video codec, and thus should be "application", since
> "audio" files can't contain anything but "audio".
> 2) This means that in this case, treating "application/ogg" as if it
> were "audio/ogg" could lead to annoying side-effects later.
> 3) For now, inheritance could help - but inheritance from what, and what
> attributes?
> 4) In the future, when Ogg files do indeed carry video as well, does
> this indicate we should provide for multiple inheritance?

Ok, it was a bad example (sorry, I don't know much about Ogg/Vorbis).
Better restart this discussion with the docbook vs sgml example.

> 5) Relying on anything beginning "x-" in the IETF world to stay stable
> is asking for trouble, sorry.  Hence my suggestion of a registry - at
> least we'd have some stability there.
See above - we have that already.

> But hang on... Given that there is no standard at the moment, isn't this
> going to happen anyway to an extent? (Minor point, incidentally, I'd
> suggested terminating the prefix with a dot, since that's how the
> current prefixing operates within IANA.)
... but not what the major environments do right now. Sorry for being conservative,
but any change here has a HUGE impact on all the existing software.

> Agreed, Apache may well tell us an object is of a certain media type
> which isn't a "formal" XDG type, but equally, the canonicalization
> should catch this. By specifying a prefix to the XDG
> standard-but-yet-not-standard media types, we can be reasonably certain
> that we're getting what we expect.
Or by all sticking to the list of mimetypes provided in the shared-mime-info "standard".

> [rather abstract stuff about Semantics snipped]

> B) Media types
> 1) We need some method for canonicalization of existing MIME media
> types, such that all XDG conformant environments agree on the same set
> of media types, modulo environment specific types.
See shared-mime-info.

> 2) Whatever method we use for canonicalization, it needs to cope with
> possible media type name changes, due to IANA registrations.
Yes, unfortunately.
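
An alias table is one way to cope with renames. A minimal sketch in Python follows; the table entries are invented examples, and the function only handles a bare type/subtype string (no parameters such as charset), so take it as an assumption about how this could work rather than what any environment does today.

```python
# Hypothetical alias table: old or unofficial name -> current canonical name.
ALIASES = {
    "application/x-ogg": "application/ogg",
    "text/x-docbook": "text/docbook",
}


def canonicalize(mimetype):
    """Map a media type name to its canonical form via the alias table."""
    # Type and subtype names are matched case-insensitively.
    name = mimetype.strip().lower()
    seen = set()
    # Follow chained aliases, stopping if the table loops back on itself.
    while name in ALIASES and name not in seen:
        seen.add(name)
        name = ALIASES[name]
    return name
```

When a name changes because of an IANA registration, only the table needs a new entry; applications that still use the old name keep resolving to the same type.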

> 3) The agreed set of non-IANA media types should be held within some
> form of registry.
>  - Which we may have, however, I'm not sure from the specification.
We do have one.

> 4) The agreed set of non-IANA media types should be prefixed to avoid
> potential collisions.
>  - I still like this idea. :-)
And I don't like the idea of breaking everything currently done by KDE, GNOME,
Apache and more (in fact almost everything). This makes no practical sense,
only theoretical sense.

> 5) Where a specific subtype is not known to the system, the system may
> choose a default based on the top level type, if one is defined.
I would prefer the much more fine-grained approach of mimetype inheritance,
as outlined above.
There's no guarantee that your "audio player" can handle audio/newtype,
so giving it audio/* doesn't sound too good (no pun intended :).
On the other hand, giving text/docbook to a text/sgml application, or to
a text/plain application (text/sgml inheriting from text/plain), is much safer.

> A "text/*" type with an unknown charset has to be treated as
> "application/octet-stream" by the system. RFC2046, 4.1.4.
OK. Hopefully this is a very very rare case though :) We know about a LOT
of charsets in Qt/KDE, at least. I've never seen this problem happen.
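
For what it's worth, the rule quoted above is simple to express. Here is a small Python sketch; it uses Python's own codec registry as a stand-in for whatever charset table the desktop has, and the function name is made up.

```python
import codecs


def effective_type(mimetype, charset=None):
    """Downgrade text/* to application/octet-stream if the charset is unknown."""
    # Only text/* types with an explicit charset are affected by the rule.
    if not mimetype.lower().startswith("text/") or charset is None:
        return mimetype
    try:
        codecs.lookup(charset)  # raises LookupError for an unknown charset
    except LookupError:
        return "application/octet-stream"
    return mimetype
```

A known charset leaves the type alone; only a charset name the lookup cannot resolve triggers the downgrade.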

David FAURE, faure at kde.org, sponsored by Trolltech to work on KDE,
Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).
Qtella users - stability patches at http://blackie.dk/~dfaure/qtella.html
