mime-type/application mapping spec, take #2
Dave Cridland [Home]
dave at cridland.net
Tue Jul 1 12:48:52 EEST 2003
On Mon, 2003-06-30 at 12:20, David Faure wrote:
> On Monday 30 June 2003 12:57, Dave Cridland [Home] wrote:
> > On Sun, 2003-06-29 at 00:24, David Faure wrote:
> > > For the generic handler suggestion, I suggest another solution:
> > > mimetype inheritance. Has this been added to the mimetype spec? I remember
> > > we talked a bit about it. It would let all text-based mimetypes inherit
> > > from text/plain, which would also mean they inherit its handlers.
> >
> > Is this different from defining a catch-all text/* handler, or even */*
> > handler? (*/* handlers are useful for, for instance, "Mail this
> > content".)
>
> Yes - application/x-perl and application/x-shellscript (for instance)
> are text-based formats, so they could derive from text/plain.
One could equally argue that Perl, Shell, and indeed most computer
language source files are "principally textual in form", and thus should
be "text" type, depending on your reading of RFC2046.
A rapid Google suggests I'm not alone in thinking this - BeOS apparently
did this.
My reading of RFC2046 is essentially that application type is used for
things which require a specific application to make any use of the
encapsulated data, whereas if it's reasonable to assume that humans exist
who can understand it by displaying it in raw form on a terminal, it's
"text" type.
I'd actually humbly suggest that random computer source code is more
likely to be "readable" in its raw form than random HTML or XML, both of
which have "text" type. Well, one hopes.
Treating program language source code as "text" also has the additional
advantage that receiving a "text/x-ruby" source code file can be
immediately used by a programmer whose system has no a priori knowledge
of Ruby source files, whereas "application/x-ruby" would have to be
treated as "application/octet-stream" if it were unrecognised.
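A minimal sketch of that fallback rule, in Python and purely for illustration: the function name and the set of known types are invented for the example, parameters are ignored, and "multipart" (which has its own default) is left out.

```python
# Hypothetical set of media types this system has handlers for.
KNOWN_TYPES = {"text/plain", "text/html", "application/octet-stream"}


def effective_type(media_type, known=KNOWN_TYPES):
    """Return the type to dispatch on, applying the RFC 2046 defaults."""
    media_type = media_type.strip().lower()
    if media_type in known:
        return media_type
    top_level = media_type.partition("/")[0]
    # An unrecognised "text" subtype degrades to text/plain, so it can
    # still be shown to a human.
    if top_level == "text":
        return "text/plain"
    # Anything else unrecognised is treated as opaque data in this sketch.
    return "application/octet-stream"
```

With this, "text/x-ruby" still reaches a plain-text viewer, while "application/x-ruby" ends up as opaque data.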
Moreover, a file containing Ruby source code that has had no specific
MIME media type associated with it (perhaps obtained via /usr/bin/ftp or
some such) will be recognised by /bin/file or libmagic or whatever as
being "text" - most likely "text/plain". I think it's better to be
consistent with this.
This certainly seems to fit with the general intentions of the RFC, but
others may disagree. That's fine, by the way, I'm always happy to be
disagreed with, and indeed even happier to be proven wrong. I'm a
perfectly gracious loser, having had so much practice. :-)
As a wild aside, I have no idea whether it's "image/x-ascii-art" or
"text/x-ascii-art", although I suspect the latter. ;-)
> Basing such things on the mimetype name (text/foo) is very inflexible.
(Basing them only on the mimetype name, I assume.)
I'm happy to agree, if only I could think of an example which didn't
simply suggest to me that the media type had been chosen incorrectly.
You're after a relatively high degree of complexity here, and I'm not
certain that's actually needed - just more thought on choice of media
type.
I can, however, think of instances where an externally received file is
tagged by some mechanism as "application/x-sh", which by the above
argument could be considered incorrect, and also differs from your own
example. In such cases, it would be useful to have an equivalence
mapping which would change the media type to a more suitable one. (In my
opinion, "text/x-sh", in yours, "application/x-shellscript")
Of course, it needs to be somehow agreed which non-standard media types
we're standardising on, if you see what I mean. On this front, there's
little stopping anyone from defining a namespace trick to avoid clashing
"X-" subtypes - just define "our" subtypes to begin with "X-XDG." for
instance, and set up a registry. (Assuming we're "consenting systems", of
course. I love that phrase.)
This would mean that incoming "application/x-shellscript",
"application/x-sh", and "text/x-sh" all got translated into
"text/x-xdg.sh" for processing, for the sake of example. Perhaps better,
we could make it hierarchical, so you get "text/x-xdg.prog.sh", such that
we know it's a programming language. Possibly a waste of time.
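The equivalence mapping could be as small as a lookup table. A Python sketch, assuming the made-up "text/x-xdg.sh" name from the example above; the table contents and the function name are hypothetical, not part of any existing spec:

```python
# Hypothetical alias table; "text/x-xdg.sh" is only the example name
# used in this message, not a registered media type.
ALIASES = {
    "application/x-shellscript": "text/x-xdg.sh",
    "application/x-sh": "text/x-xdg.sh",
    "text/x-sh": "text/x-xdg.sh",
}


def canonical_type(media_type, aliases=ALIASES):
    """Map an incoming media type onto the agreed local name, if any."""
    key = media_type.strip().lower()
    # Types without an alias pass through unchanged (but normalised).
    return aliases.get(key, key)
```

Handlers would then be registered against the canonical name only, so all three incoming spellings find the same applications.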
Phew. I've written far more than I set out to, but I hope it's useful.
Things I haven't mentioned:
- Media types have parameters, and we do have to worry about these.
Example: "text/plain; charset=foo", where "foo" is a charset the
receiver doesn't recognise, should, according to the spec, be
treated as application/octet-stream (i.e. opaque data).
- Possibility of using URI syntax in "MimeType" field in .desktop for
better advertising of capability, eg:
action:text/plain;actions=view,edit;scheme=file,http;charset=US-ASCII,UTF-8
- Do actions inherit? Are all editors also viewers? Hmmm... Not with
HTML I suppose.
- Can we replace existing XML based database with filesystem based one,
containing (links to) .desktop files?
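To make the "action:" idea a little more concrete, here is a Python sketch that splits such an entry into a media type and its comma-separated parameter values. The syntax is only the one floated in the list above, not anything the .desktop spec defines, and the function name is invented for the example.

```python
def parse_capability(entry):
    """Split the proposed "action:" syntax into a type and its parameters."""
    prefix, _, rest = entry.partition(":")
    if prefix != "action" or not rest:
        raise ValueError("not an action: entry: %r" % entry)
    media_type, *params = rest.split(";")
    parsed = {}
    for param in params:
        if not param.strip():
            continue  # tolerate a stray trailing ";"
        name, _, value = param.partition("=")
        # Each parameter carries a comma-separated list of values.
        parsed[name.strip().lower()] = [v.strip() for v in value.split(",")]
    return media_type.strip().lower(), parsed
```

Fed the example entry, this yields "text/plain" plus a dictionary with "actions", "scheme" and "charset" keys, which a handler lookup could then match against an incoming type and its charset parameter.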
Dave.