mime-type/application mapping spec, take #2

David Faure dfaure at trolltech.com
Tue Jul 1 15:34:29 EEST 2003

On Tuesday 01 July 2003 11:48, Dave Cridland [Home] wrote:
> One could equally argue that Perl, Shell, and indeed most computer
> language source files are "principally textual in form", and thus should
> be "text" type, depending on your reading of RFC2046.
> A rapid Google suggests I'm not alone in thinking this - BeOS apparently
> did this.
> My reading of RFC2046 is essentially that application type is used for
> things which require a specific application to make any use of the
> encapsulated data, whereas if it's reasonable to assume that humans exist
> who can understand it by displaying it in raw form on a terminal, it's
> "text" type.
> I'd actually humbly suggest that random computer source code is more
> likely to be "readable" in its raw form than random HTML or XML, both of
> which have "text" type. Well, one hopes.
> Treating program language source code as "text" also has the additional
> advantage that receiving a "text/x-ruby" source code file can be
> immediately used by a programmer whose system has no a priori knowledge
> of Ruby source files, whereas "application/x-ruby" would have to be
> treated as "application/octet-stream" if it were unrecognised.
> Moreover, a file containing Ruby source code that has had no specific
> MIME media type associated with it (perhaps obtained via /usr/bin/ftp or
> some such) will be recognised by /bin/file or libmagic or whatever as
> being "text" - most likely "text/plain". I think it's better to be
> consistent with this.
> This certainly seems to fit with the general intentions of the RFC, but
> others may disagree. That's fine, by the way, I'm always happy to be
> disagreed with, and indeed even happier to be proven wrong. I'm a
> perfectly gracious loser, having had so much practise. :-)

Whatever the arguments for calling it text/perl instead of application/perl,
please understand that we're not the ones who decide what the standard
names are. So if the IANA decides that it shall be application/perl, and if
we have settled on basing heuristics for default viewers on text/*,
then we'll have no way to fix the problem. That's why I call this inflexible.
Names are one thing (out of our hands); behaviour is another thing (in our
hands).

> > Basing such things on the mimetype name (text/foo) is very unflexible.
>                     ^ only (I assume)
> I'm happy to agree, if only I could think of an example which didn't
> simply suggest to me that the media type had been chosen incorrectly.
Or has been chosen to be so for other reasons.

> You're after a relatively high degree of complexity here, and I'm not
> certain that's actually needed - just more thought on choice of media
> type.
I suppose you're referring to the inheritance idea? We need it for many
other reasons: to be able to rename a mimetype (and install an alias
for the older name), to have specialized folder types (e.g. an SMB
host is almost like a directory, but with a special icon), and so on.
There are many cases of mimetype inheritance; this isn't just an idea
to "add complexity". We've been needing this for years.
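
To make the point concrete, here is a minimal Python sketch of aliases plus
inheritance. Everything in it is an assumption for illustration: the table
layout, the function names, and the two "x-example" types are hypothetical,
and this is not how KDE or any spec actually stores the data.

```python
# Hypothetical tables: names and layout are assumptions for illustration only.
ALIASES = {
    "application/x-ogg": "application/ogg",  # old name kept as an alias
}
PARENTS = {
    "inode/x-example-smb-host": "inode/directory",  # "almost like a directory"
    "application/x-example-perl": "text/plain",     # behaves like text
}


def canonical(mimetype):
    """Resolve an alias to its current name (unchanged if not an alias)."""
    name = mimetype.lower()
    return ALIASES.get(name, name)


def ancestors(mimetype):
    """Yield the type itself, then each parent, most specific first."""
    current = canonical(mimetype)
    seen = set()  # guards against accidental cycles in the table
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        parent = PARENTS.get(current)
        current = canonical(parent) if parent is not None else None


def is_a(mimetype, base):
    """True if mimetype is base or inherits from it, whatever it is named."""
    return canonical(base) in ancestors(mimetype)
```

With tables like these, a default text viewer would be chosen by asking
is_a(t, "text/plain") rather than by checking whether the name starts with
"text/", so an application/* name decided elsewhere does no harm.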

> I can, however, think of instances where an externally received file is
> tagged by some mechanism as "application/x-sh", which by the above
> argument could be considered incorrect, and also differs from your own
> example. In such cases, it would be useful to have an equivalence
> mapping which would change the media type to a more suitable one. (In my
> opinion, "text/x-sh", in yours, "application/x-shellscript")

Yes, the HTTP Content-Type header is another case where mimetype aliases 
are needed.
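
As a rough sketch of that case in Python: take the Content-Type value as
received, strip the parameters, and map the name through an alias table.
The table contents and the function name are hypothetical, and which of the
two names should be the preferred one is exactly what is still being argued.

```python
from email.message import Message

# Hypothetical alias table; the preferred name here is an assumption.
ALIASES = {"application/x-sh": "application/x-shellscript"}


def type_from_header(content_type):
    """Extract type/subtype from a Content-Type value and resolve aliases."""
    msg = Message()
    msg["Content-Type"] = content_type
    received = msg.get_content_type()  # lowercased, parameters stripped
    return ALIASES.get(received, received)
```

A server that sends "application/x-sh" and a local file detected as
"application/x-shellscript" would then end up with the same type, and so
with the same applications.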

> Of course, it needs to be somehow agreed which non-standard media types
> we're standardising on, if you see what I mean. 

This is the topic of another standard discussed on this list...

> On this front, there's 
> little stopping anyone from defining a namespace trick to avoid clashing
> "X-" subtypes - just define "our" subtypes to begin with "X-XDG." for
> instance, and setup a registry.

Oh no, no, no. Please no.
We have enough mimetype renaming already: when an x-foo mimetype gets
accepted by the IANA, we need to rename it to remove the x-. We just
had to do that with application/ogg... There: another example of strange mimetype
naming. It's not audio/ogg, it's application/ogg, IIRC because the format can be
applied to more than audio. However, all the ogg files I currently know of are audio...

So if we also have x-xdg-foo, we have another layer in there, with even more
mimetype renaming - and incompatibility with all existing systems, including the
current kde/gnome versions, and apache, and... everything else.

> Things I haven't mentioned:
>  - Media types have parameters, we do have to worry about these.
> Example: "text/plain; charset=foo" should, according to the spec, be
> treated as application/octet-stream. (ie, opaque data).
Why shouldn't it be treated like text/plain with charset=foo, when possible?
I'm not following.
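
What I have in mind is roughly the following Python sketch: honour the
charset when the system knows it, and only fall back to opaque data when it
doesn't. The function name is hypothetical, and the us-ascii default for a
missing charset parameter is my reading of RFC 2046, so treat both as
assumptions.

```python
import codecs
from email.message import Message


def interpret(content_type, payload):
    """Return (effective_type, text_or_bytes) for a received body."""
    msg = Message()
    msg["Content-Type"] = content_type
    mimetype = msg.get_content_type()
    if msg.get_content_maintype() != "text":
        return mimetype, payload  # not text: leave the bytes alone
    # Assumed default when no charset parameter is given.
    charset = msg.get_content_charset() or "us-ascii"
    try:
        codecs.lookup(charset)  # raises LookupError for an unknown charset
        return mimetype, payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset or undecodable bytes: treat as opaque data.
        return "application/octet-stream", payload
```

So "text/plain; charset=iso-8859-1" stays text/plain and gets decoded, while
"text/plain; charset=foo" degrades to application/octet-stream only because
no codec named foo exists here.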

>  - Can we replace existing XML based database with filesystem based one, 
> containing (links to) .desktop files?
?? The current move is rather the other way round, at least for the mimetype spec...

David FAURE, faure at kde.org, sponsored by Trolltech to work on KDE,
Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).
Qtella users - stability patches at http://blackie.dk/~dfaure/qtella.html
