multiple mime-types for the same file?

Dave Cridland [Home] dave at cridland.net
Tue May 25 02:27:11 EEST 2004


On Mon May 24 18:24:25 2004, David Faure wrote:
> On Monday 24 May 2004 18:39, Dave Cridland [Home] wrote:
> > There are ones listed without the 'x-' prefixed that are not listed
> > at IANA, such as 'application/illustrator', 'application/smil', etc.
> Submit bug reports, apparently that's the preferred way.
> 
> 
Okay... I parsed the application registry from IANA (after manually 
editing the truly vile HTML to make it parse. Yuk.):

IANA has 351. [All registered, obviously.]
XDG-0.14 has 198. [46 in registered format, I think - not x. or x-]
In common: 22
XDG-0.14 registered format, not in IANA: 24

I think this is accurate, but I'm not 100% yet.
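
For anyone wanting to repeat the count, the comparison amounts to a couple of set operations. This is only a minimal sketch in Python: the function names and the tiny sample sets are made up for illustration, and the real inputs would be the parsed IANA application registry and the XDG database.

```python
def is_registered_format(media_type):
    """True if the subtype has neither an 'x-' nor an 'x.' prefix."""
    subtype = media_type.lower().partition("/")[2]
    return not subtype.startswith(("x-", "x."))


def compare(iana, xdg):
    """Return the same figures as the tally above, for two sets of type names."""
    iana = {t.lower() for t in iana}
    xdg = {t.lower() for t in xdg}
    xdg_registered_format = {t for t in xdg if is_registered_format(t)}
    return {
        "iana": len(iana),
        "xdg": len(xdg),
        "xdg_registered_format": len(xdg_registered_format),
        "in_common": len(iana & xdg),
        # Looks registered by its name, but IANA does not list it.
        "registered_format_not_in_iana": len(xdg_registered_format - iana),
    }


# Made-up samples for illustration only, not the real lists.
iana_sample = {"application/pdf", "application/postscript"}
xdg_sample = {"application/pdf", "application/illustrator", "application/x-tar"}
result = compare(iana_sample, xdg_sample)
```

The last two figures should always add up to the registered-format count, which is a quick sanity check on the parse.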

I'll check out CVS in the morning and start on the others.

Then I'll start going through the bogus types finding matches, and 
the x-prefixed types looking for registered variants - in that order.

I didn't think it was quite this bad, to be honest.


> > I think most of the vendor tree ones are made up
> > 'application/vnd.corel-draw' is, for instance, but would, I think, be
> > registered as 'application/vnd.corel.draw' anyway, unless it became
> > an 'image' type, of course. I don't expect to see any of these in the
> > database at all, it's ridiculous that they're there.
> Why? We do need to identify all types of files.
> 
> 
Point being it's not in IANA, so it needs to be an x-prefixed type. 
The fact a media type for Corel Draw! is there at all is fine, it 
just has a media type specified which doesn't exist, and requires 
registration. Since writing that, I did notice that many of the 
earlier vnd. registrations are in that format, so much of that 
paragraph was incorrect, though.
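
To make the naming point concrete, here is a rough sketch (Python, with a hypothetical function name) of what the shape of a name claims about its registration tree. The prefixes follow the registration trees as I understand them from RFC 2048. Note that the name alone only tells you what is being claimed: 'application/vnd.corel-draw' claims the vendor tree, which is exactly why it needs a registration to back it up.

```python
def claimed_tree(media_type):
    """Return the registration tree that a media type name claims to be in."""
    subtype = media_type.lower().partition("/")[2]
    if subtype.startswith(("x-", "x.")):
        return "unregistered"  # private or experimental, no registration needed
    if subtype.startswith("vnd."):
        return "vendor"  # requires registration
    if subtype.startswith("prs."):
        return "personal"  # requires registration
    return "standards"  # requires registration


claims = {
    name: claimed_tree(name)
    for name in (
        "application/vnd.corel-draw",
        "application/illustrator",
        "application/x-java",
    )
}
```

Anything this reports as other than "unregistered" still has to be looked up in the IANA list before it can be trusted.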


> > A shared content/media/mime type database is a fantastic idea, don't
> > get me wrong, it's just that there already is one, maintained by IANA.
> Without translations, without icons, without any control from us to
> quickly add what we need, and without a whole bunch of mimetypes that
> we need.

I agree the IANA database has no translations, I specifically stated 
that in the immediately following sentence, which you cut. Hence I 
agree to additional information, although in many ways I'd prefer it 
if the IANA database were expanded, as I also said.

Otherwise, the data is all at IANA where the types are registered, 
although not generally in computer readable form, which I also 
expressed regret over.


> And it's just not as easy as pointing them to our .xml file, you need to
> register each and every mimetype separately, with details about what
> this file type is about, etc. If you feel like it, feel free to go ahead.
> 
> 
Damn them IANA people, why can't they adhere to our standards. ;-)

But anyway:

1) You can register MIME types via email, multiply if you so choose. 
There's an online form to make single ones easier.

2) Many of the details for registered types are actually pretty 
sketchy, but it would be nice if we had good information, and I think 
they're stricter about it than they were.

3) It's not a quick process, no, nor is it intended to be, since it's 
a definitive list, and mistakes would be bad.

It'd be fantastic to register them all, of course, but I very much 
doubt I've the knowledge of the formats to do them justice. I'll take 
a look and see which ones I can do, and contact the various projects 
about doing so.


> > It gets silly to then end up contradicting the IANA one, and
> > apparently failing to understand what some of the existing content
> > types actually are.
> I agree - when this happens, it has to be fixed. And simply ranting about it
> in general every N months doesn't really fix anything :)
> 
> 
Sure this is a rant. I know that. :-)

But it's a rant which hasn't been repeated every N months, I've only 
mentioned it once before, and everyone else but you ignored me 
totally that time, too.

> > But unless you can be certain that all incoming email is being sent
> > by applications using the same version of that database, then an
> > x-prefixed type is equivalent to application/octet-stream by
> > definition.
> For all particular purposes, though, if you receive e.g. a .java file and
> your mailer has no idea what to do with it, what happens? You save it,
> and your desktop recognizes the extension (or content) and the file is
> meaningful again. What I'm saying here, is that it's not such a big deal
> that mimetypes don't match on the environments at two ends of a mail.

The filename is a suggested one. It might not even be present. 
Moreover, it may be incorrect, and/or even deliberately misleading, 
and although your example is innocuous enough, some active content 
may well have very different security characteristics between email 
and filesystem. Ignoring the content type not only loses information, 
because, for instance, character set information might be lost, but 
it also loses security knowledge. I can be pretty certain that if you 
send me a cryptographically signed message with some content in it, 
that that content is safe. I cannot be so certain if you didn't sign 
the message, and if it came from someone I'd never heard of, it has 
to be treated with heavy suspicion. All that information gets lost 
the moment I stick it on a filesystem.

In summary, desktops treat a file on the filesystem as being 
relatively safe, whereas a file inbound via email may not be.

So the filename needs to be abandoned - all the more so if the email 
client cannot determine the type with confidence - and if the content 
type cannot be determined, then no fully automated launch should be 
possible and it needs to be left to the user to take some active 
action to make the file 'work' again. To do otherwise tends to open 
the field for virus/worm writers, as happened on the Windows platform 
as a result.
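
The policy described above can be sketched with Python's standard email package. This is an illustration under stated assumptions, not how any particular mail client works: `effective_type`, `may_auto_launch`, the `known_types` set and the `sender_trusted` flag (standing in for something like a verified signature from a known sender) are all hypothetical names.

```python
from email import message_from_string

OCTET_STREAM = "application/octet-stream"


def effective_type(part, known_types):
    """Type the client may act on; the suggested filename is never consulted."""
    declared = part.get_content_type()  # lower-cased "maintype/subtype"
    subtype = declared.partition("/")[2]
    # An x-prefixed or unrecognised type carries no shared meaning.
    if subtype.startswith(("x-", "x.")) or declared not in known_types:
        return OCTET_STREAM
    return declared


def may_auto_launch(part, known_types, sender_trusted):
    """Automatic handling needs a recognised type and a trusted sender."""
    return sender_trusted and effective_type(part, known_types) != OCTET_STREAM


raw = (
    'Content-Type: application/x-java; name="Hello.java"\n'
    'Content-Disposition: attachment; filename="Hello.java"\n'
    "\n"
    "class Hello {}\n"
)
part = message_from_string(raw)
known = {"text/plain", "application/pdf"}
launch = may_auto_launch(part, known, sender_trusted=True)  # False here
```

Even with a trusted sender the .java attachment is left for the user to open by hand, because its declared type is x-prefixed and the filename is not allowed to rescue it.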

Of course, a content-disposition value of 'evil', to complement the 
evil bit in the IPv4 header, would solve all this.


> This, I don't agree with. The point of standardizing which x-foo mimetypes
> we use is "how to make the free environments work similarly and make the
> transition to that as painless as possible". Renaming everything defeats
> that purpose. If we take the effort to rename things, we might as well start
> by registering proper mimetypes for things.

True enough, if it weren't that as you said earlier, nobody wants the 
hassle of IANA.

My suggestion then as now is to build a lighter weight registry, 
since we need all the aliasing capability anyway.
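
The aliasing part of such a registry could be as small as a lookup table that maps old names to a canonical one. A minimal sketch in Python, where the table contents and the `canonical` function are purely hypothetical and do not describe any existing database:

```python
# Hypothetical alias table: each old name points at its replacement.
ALIASES = {
    "application/x-example-older": "application/x-example-old",
    "application/x-example-old": "application/x-example",
}


def canonical(media_type, aliases):
    """Follow alias links until a name with no further alias is reached."""
    current = media_type.lower()
    seen = set()
    while current in aliases:
        if current in seen:
            raise ValueError("alias loop involving " + current)
        seen.add(current)
        current = aliases[current]
    return current


resolved = canonical("application/x-example-older", ALIASES)
```

Following chains rather than a single hop means a type can be renamed more than once, for instance from an x-prefixed name to a registered one, without breaking older senders.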

But hey, we have what we have.

Dave.



