Shared-mime checking order

Alexander Larsson alexl at redhat.com
Tue Oct 16 00:59:06 PDT 2007


On Mon, 2007-10-15 at 16:59 +0200, David Faure wrote:
> Hi,
> 
> Wow it's been a month already... been too busy to be able to answer before, sorry about that.

No problems. I've been pretty busy too. :)
> 
> > I think this sounds fine to me. There is only one more thing that I
> > think needs to be resolved. What mimetype do we pick on a glob conflict
> > if we only know the name (i.e. if we can't sniff). Should we add a
> > priority thing? Use the order in the files?
> 
> Good point. A priority would be the best way to ensure consistent results. E.g. we can
> probably all agree that ftp://bar/foo.doc should have a msword icon, because it's
> just more common than "text files named .doc".
> It's going to be confusing reading the xml spec though, if it has both priorities for
> magic and completely unrelated priorities for extensions, and worse, the extensions'
> priority is only used in case of conflicts between extensions...
> Any thoughts? I would be ok with <glob pattern="*.doc" conflictPriority="10"/>

Yes, that sounds good to me. 

> > If we add priority to the glob tag (with some default if it's not set) we
> > might be able to handle this in a backwards compat way by having the
> > priority affect the sort order in "globs" and "mime.cache".
> 
> This assumes that implementations read globs linearly and stop at the first match.
> But in KDE I parse this into a hash (that is globally shared among processes, via a file on disk)...
> Hmm, OK, even then I could do this correctly by making it a multihash
> with meaningful order in the list of values for a given key... This does get tricky,
> but I see no other way to handle conflicting globs without magic indeed.

The approach I was thinking of is to hash each extension to a list of
mimetypes. These would then just be ordered by conflict priority.
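
A rough sketch of that idea in Python, purely for illustration. The
function names, the default priority of 50, and the priority values in
the sample data are all assumptions made up for this example; nothing
here is taken from an existing implementation or from the spec.

```python
from collections import defaultdict

# Assumed default for globs that carry no conflict priority (hypothetical value).
DEFAULT_PRIORITY = 50


def build_glob_table(globs):
    """Map each extension to its mimetypes, highest conflict priority first.

    `globs` is an iterable of (extension, mimetype, priority_or_None).
    """
    table = defaultdict(list)
    for ext, mime, prio in globs:
        table[ext].append((DEFAULT_PRIORITY if prio is None else prio, mime))
    # sorted() is stable, so entries with equal priority keep their file order.
    return {
        ext: [mime for _, mime in sorted(entries, key=lambda e: -e[0])]
        for ext, entries in table.items()
    }


def lookup(table, filename):
    """Return the preferred mimetype for a name we cannot sniff, or None."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    candidates = table.get(ext, [])
    return candidates[0] if candidates else None


# Hypothetical data: two mimetypes claim "*.doc".
globs = [
    ("doc", "text/plain", 40),
    ("doc", "application/msword", 60),
    ("txt", "text/plain", None),
]
table = build_glob_table(globs)
print(lookup(table, "ftp://bar/foo.doc"))
```

An implementation that only wants one answer takes the first entry of
the list; one that wants to show alternatives can walk the rest.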

> > > My problem is that I can't test the subclass case, README* is the only
> > > case of a glob match that has a * but not as the first character, so
> > > it's the only one that can give conflicts...
> > > So after implementing "take longest match", I see no way of testing
> > > "take subclass", since in the case of README.txt it is the longest
> > > match anyway... I could use canned data, but this also
> > > means that we might not have a use case for it at the moment :)
> > 
> > I can't think of any case where it's needed either, so maybe we should
> > drop that to lower complexity.
> 
> Agreed.
> However I just had a case where "take subclass" might be needed:
> when the *magic* conflicts. Try "<!--foo--><html>bar</html>": this should
> be detected as text/html, but it's detected as application/xml here because
> both mimetypes have <match value="&lt;!--" type="string" offset="0"/>
> and they have the same priority for that magic rule! (50)
> I believe this is a bug in freedesktop.org.xml, the rules for html should have higher
> priority than the rules for xml, to fix this. But I guess we could also say
> "the xml is fine, we just need to pick the subclass when conflicting magic rules
> match". I wouldn't like that though, since it would make the implementation
> more complex.
> 
> Do you agree with making the magic rules for xml priority 40?

Yes, that sounds good to me.
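
To make the effect of that change concrete, here is a minimal Python
sketch of priority-based magic matching. The rule tuples and the sniff()
helper are invented for this example and are far simpler than real magic
rules (no ranges, masks, or nested matches); only the "<!--" string at
offset 0 and the priorities 50 and 40 come from the discussion above.

```python
def sniff(data, rules):
    """Return the mimetype of the highest-priority rule that matches `data`.

    Each rule is (mimetype, priority, offset, value). On equal priorities
    max() keeps the first match, so the result depends on rule order,
    which is exactly the ambiguity being discussed.
    """
    matches = [
        (prio, mime)
        for mime, prio, offset, value in rules
        if data[offset:offset + len(value)] == value
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0])[1]


data = b"<!--foo--><html>bar</html>"

# Both rules at priority 50: the winner is decided by list order alone.
rules_tied = [
    ("application/xml", 50, 0, b"<!--"),
    ("text/html", 50, 0, b"<!--"),
]

# XML rule lowered to priority 40: text/html wins regardless of order.
rules_fixed = [
    ("application/xml", 40, 0, b"<!--"),
    ("text/html", 50, 0, b"<!--"),
]

print(sniff(data, rules_tied))
print(sniff(data, rules_fixed))
```

With the priorities separated, no subclass lookup is needed to break
the tie, which keeps the matching code simple.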



