MIME info spec: Handling containers/multiple MIME types per glob pattern

Christian Neumair chris at gnome-de.org
Wed Nov 16 10:38:20 PST 2005

On Wed, 2005-11-16 at 09:41 -0500, Matthias Clasen wrote:
> On Wed, 2005-11-16 at 15:15 +0100, Christian Neumair wrote:
> > Matthias Clasen wrote:
> > >The patch that I posted a few weeks ago only sniffs if there
> > >are multiple identical globs that match, which is a fairly rare
> > >case in the current shared-mime-info data (only .pot and .pcf,
> > >if I remember correctly).
> > >  
> > Sounds like excellent material. It is available under [1], just for
> > reference.
> > XDG would return XDG_MIME_TYPE_UNKNOWN if the fopen fails, and multiple 
> > MIME types specify identical matching glob patterns, right?
> Hmm, you are right. It might be better to return the first matching
> glob's mimetype in that case.

No, that's not what I wanted to say. I think it is perfectly OK, if not
required, to return XDG_MIME_TYPE_UNKNOWN in that case. It is simply not
possible to tell the actual MIME type from two identical globs, and
predetermining one is just broken.
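
To make the behaviour I mean concrete, here is a rough sketch, not
xdgmime code. The function name, the sniffer callback type and the
value used for the unknown type are all assumptions for illustration.
If exactly one type matches the glob it is returned; if several types
share the glob, the contents decide, and if the file cannot be opened
the result is "unknown" instead of an arbitrary pick.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for XDG_MIME_TYPE_UNKNOWN; the value is an assumption. */
static const char *const MIME_UNKNOWN = "application/octet-stream";

/* Hypothetical content sniffer: returns a MIME type or NULL. */
typedef const char *(*sniff_fn)(FILE *fp);

/* Resolve the MIME types whose glob matched `path`.
 * One candidate: return it. Several candidates (identical globs):
 * sniff the contents, and return MIME_UNKNOWN if that is not possible. */
static const char *resolve_glob_matches(const char *path,
                                        const char *const *candidates,
                                        size_t n_candidates,
                                        sniff_fn sniff)
{
    assert(path != NULL);
    assert(candidates != NULL || n_candidates == 0);

    if (n_candidates == 0)
        return MIME_UNKNOWN;
    if (n_candidates == 1)
        return candidates[0];

    /* Ambiguous glob match: only the contents can tell the types apart. */
    if (sniff == NULL)
        return MIME_UNKNOWN;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return MIME_UNKNOWN;   /* ambiguous and unreadable: do not guess */

    const char *sniffed = sniff(fp);
    fclose(fp);
    return sniffed != NULL ? sniffed : MIME_UNKNOWN;
}
```

The point of the design is that the caller can always tell "we do not
know" apart from a real answer.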

> As you said in an earlier mail, it might also be valuable to expose the
> multiple glob matches via the api in some way, so applications can
> decide on their own how to handle this case.

Yeah, a function that returns up to a given maximum number of matches,
with -1 meaning all matches, would be appropriate. It should be exported
via xdgmime.h. It would be a wrapper around the caches involved, of
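
Roughly what I have in mind for the interface, as a sketch only. The
function name, the signature and the tiny in-memory table are
hypothetical; a real version would consult the glob caches and not a
static array, and the MIME types listed are just illustrative entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the glob data: suffix patterns only. */
struct glob_entry {
    const char *suffix;
    const char *mime_type;
};

static const struct glob_entry GLOB_TABLE[] = {
    { ".pot", "text/x-gettext-translation-template" },
    { ".pot", "application/vnd.ms-powerpoint" },
    { ".txt", "text/plain" },
};
#define GLOB_TABLE_LEN (sizeof GLOB_TABLE / sizeof GLOB_TABLE[0])

static int name_has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + (n - s), suffix) == 0;
}

/* Store up to max_matches matching MIME types in `out` and return how
 * many were stored. max_matches == -1 means "all matches"; the caller
 * must then provide room for GLOB_TABLE_LEN entries. */
static int get_mime_types_from_file_name(const char *file_name,
                                         const char **out,
                                         int max_matches)
{
    assert(file_name != NULL && out != NULL);

    int found = 0;
    for (size_t i = 0; i < GLOB_TABLE_LEN; i++) {
        if (max_matches >= 0 && found >= max_matches)
            break;
        if (name_has_suffix(file_name, GLOB_TABLE[i].suffix))
            out[found++] = GLOB_TABLE[i].mime_type;
    }
    return found;
}
```

An application that only cares whether the match is ambiguous could
call this with a maximum of 2 and check whether it gets 2 back.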

> > While this still wouldn't work for the more complex matches "README*", 
> > "*EAD*", "*DME", it takes care that container formats (ogg, avi) or glob 
> > patterns used by multiple MIME types ("*.pot" and friends) play nicely with 
> > Nautilus. Since KDE AFAIK currently does the same (glob matching, 
> > contents for particular container MIME types only), it would also be 
> > useful for them.
> I think only exactly identical globs should be considered as duplicates
> (ie if both *.gz and *.tar.gz match, the longer one is better), and
> anything but suffix patterns is too rare to worry about (what mimetype
> could be interested in matching *EAD*?)

I agree with you, it is probably not worth the pain: handling arbitrary
patterns could cost us orders of magnitude in performance, taking into
account that we use a pretty fast algorithm for looking up simple globs
and literals.
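
The rule for suffix patterns can be sketched as below. This is an
illustration under my own assumptions (a plain array of suffix globs, a
hypothetical function name, illustrative MIME types), not the lookup
structure xdgmime actually uses. The longest matching suffix wins, and
only entries with exactly the same suffix count as duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct suffix_glob {
    const char *suffix;      /* the part after '*', e.g. ".tar.gz" */
    const char *mime_type;
};

static int ends_with(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + (n - s), suffix) == 0;
}

/* Find the longest suffix glob matching file_name.
 * Returns the length of that suffix (0 if nothing matched), stores the
 * first MIME type using it in *mime_type and the number of entries
 * sharing it in *n_ties. Two matching suffixes of equal length are
 * necessarily the same string, so n_ties > 1 means identical globs. */
static size_t longest_suffix_match(const struct suffix_glob *globs,
                                   size_t n_globs,
                                   const char *file_name,
                                   const char **mime_type,
                                   size_t *n_ties)
{
    assert(file_name != NULL && mime_type != NULL && n_ties != NULL);

    size_t best_len = 0;
    *mime_type = NULL;
    *n_ties = 0;

    for (size_t i = 0; i < n_globs; i++) {
        size_t len = strlen(globs[i].suffix);
        if (len == 0 || !ends_with(file_name, globs[i].suffix))
            continue;
        if (len > best_len) {
            best_len = len;              /* a longer glob beats a shorter one */
            *mime_type = globs[i].mime_type;
            *n_ties = 1;
        } else if (len == best_len) {
            (*n_ties)++;                 /* exactly identical glob: duplicate */
        }
    }
    return best_len;
}
```

With this, "backup.tar.gz" resolves to the *.tar.gz entry alone, while
"messages.pot" reports two ties and is left to sniffing or to the
application.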

> > It would be nice to get this in, and to get some feedback from potential 
> > XDG MIME API client projects. Are you willing to write a spec 
> > patch yourself, or should I tackle this?
> > 
> If you write a spec patch, I would be happy to review it. I think the
> sections about recommended matching order and the cache file format
> descriptions need modifications.

I'm attaching a patch that just adds a paragraph on the case of multiple
MIME types for one pattern, and adds comments about the acronym changes
I made some weeks ago.

Having a new shared-mime-info release within 4 weeks would be great.
That should be enough time for final cosmetic changes and translation
updates. We're < 1.0, so why not release early and often? The last
release is from 2004-03. A huge number of bugs have been fixed since then.

Any major TODOs you'd like to point out?

Christian Neumair <chris at gnome-de.org>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shared-mime-info-spec.xml.diff
Type: text/x-patch
Size: 2923 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/xdg/attachments/20051116/03770cce/attachment.bin 
