Case insensitive mimetype matching edge case

David Faure faure at kde.org
Wed Aug 19 12:53:21 PDT 2009


On Wednesday 19 August 2009, Alexander Larsson wrote:
> On Wed, 2009-08-19 at 10:02 +0200, David Faure wrote:
> > On Wednesday 19 August 2009, Alexander Larsson wrote:
> > > Ugh. Additionally we have to extend the mime.cache format more. Maybe
> > > we can solve this with a hack. What about this:
> > >
> > > All case insensitive globs are converted to lower case in the globs
> > > file. Glob lookup is done by first matching the real filename against
> > > the globs, then (on failure) convert the name to lower case and try
> > > again. This will result in a case insensitive match except for things
> > > marked as case sensitive that have at least one uppercase character.
> > >
> > > We can't do case-sensitive matching of only-lowercase globs, but we
> > > don't currently have any example of this in the databases.
> >
> > But I do want to do one of those, to solve bug 22634: I want
> >    <glob pattern="core"/> to be case-sensitive="true".
> >
> > How about a different hack:
> > for a case-sensitive glob, we generate two lines in globs2:
> > 50:text/x-c++src:*.C
> > 50:text/x-c++src:*.C:cs
> > Old parsers will create an entry for "*.C:cs", which will probably never
> > match any real file, so no big deal, while new parsers will take the
> > second line as an indication that the *.C glob (parsed one line above)
> > should be understood to be case sensitive.
>
> Hmmm. I like this one. Sounds good to me. But let's make it extensible
> when we're doing it, i.e. have a comma-separated list of flags with
> "cs" being one known one. Unknown flags are ignored, anything after
> another : is ignored.

Good idea.
I made the changes in the spec, in the definition of the two mimetypes,
and in update-mime-database.c (for parsing, and globs2 generation).
Please find patch attached (I can commit if you're ok with it).
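
As an illustration of the flag syntax discussed above (this is not the
attached patch), here is a minimal Python sketch of how a new-style parser
might read one globs2 line. The function name parse_globs2_line and the
returned dict layout are hypothetical, and the sketch assumes glob patterns
never contain a ":" themselves.

```python
def parse_globs2_line(line):
    """Parse 'weight:mimetype:pattern[:flags[:ignored...]]'.

    Returns a dict, or None for blank lines, comments, and malformed lines.
    Hypothetical helper: the names and the dict layout are illustrative only.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    fields = line.split(":")
    if len(fields) < 3:
        return None  # not enough fields for weight, mimetype and pattern

    # The fourth field, if present, is a comma-separated list of flags.
    # Anything after a further ':' is ignored, as proposed in the thread.
    flags = set(fields[3].split(",")) if len(fields) > 3 else set()

    return {
        "weight": int(fields[0]),
        "mimetype": fields[1],
        "pattern": fields[2],
        # "cs" is the only known flag; unknown flags are silently ignored.
        "case_sensitive": "cs" in flags,
    }
```

For example, parse_globs2_line("50:text/x-c++src:*.C:cs") yields an entry
with pattern "*.C" and case_sensitive set to True, while the same line
without ":cs" yields case_sensitive set to False.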

I included a suggested format change for the mime.cache file, but I'll
have to leave that part to you, since I don't know all the details about the
suffix tree etc. The same goes for the xdgmime implementation.

I like that this is going to improve performance, too: no need to do the
two-step glob matching anymore (case insensitive + case sensitive); it will
now be one -or- the other, for a given glob.
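
To make the "one or the other" idea concrete, here is a rough Python sketch
of single-step matching. It is not the xdgmime code: glob_matches is a
hypothetical name, and fnmatchcase from the Python standard library stands
in for whatever glob matcher a real implementation uses.

```python
from fnmatch import fnmatchcase


def glob_matches(filename, pattern, case_sensitive):
    """Match a filename against one glob, using exactly one comparison mode.

    Hypothetical helper: each glob carries its own case-sensitivity flag,
    so no second "retry in lower case" pass is needed.
    """
    if case_sensitive:
        # Compare the name exactly as given.
        return fnmatchcase(filename, pattern)
    # Case-insensitive globs: fold both sides to lower case, then compare.
    return fnmatchcase(filename.lower(), pattern.lower())
```

With this, a case-sensitive "*.C" glob matches "main.C" but not "main.c",
and a case-sensitive "core" glob matches "core" but not "Core", while a
case-insensitive "*.txt" glob still matches "README.TXT".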

-- 
David Faure, faure at kde.org, sponsored by Qt Software @ Nokia to work on KDE,
Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shared-mime-info-case-sensitive.diff
Type: text/x-patch
Size: 7399 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/xdg/attachments/20090819/6fe80150/attachment.bin 

