file(1) / magic(5) database vs shared-mime-info database

Kip Warner kip at thevertigo.com
Sun Feb 25 22:13:19 UTC 2018


On Sat, 2018-02-24 at 14:17 +0100, Bastien Nocera wrote:
> No, it wouldn't make sense. They have very different use cases and
> restrictions. In shared-mime-info:
> - mime-types don't have to have a magic associated

True, but I believe file(1)'s magic(5) database doesn't need to either.
It can "detect" a MIME type based on just a file extension if that's
what the rule writer provided.

> - and mime-types have globs associated
> - the magic length is limited to avoid seeking through huge files

magic(5) can also handle this, since a rule can name the exact file
offsets to check.
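
As a rough illustration of offset-based matching, here is a minimal
sketch in Python. It is not how file(1) is implemented. The rule table
and the sniff() name are hypothetical, and the signatures are ones I
believe are well known rather than copied from magic(5). The point is
only that each rule reads a handful of bytes at a fixed offset, so the
size of the file doesn't matter.

```python
import io

# Hypothetical rule table: (offset, expected bytes, MIME type).
# "DBPF" at offset 0 is the signature discussed in this thread; the PDF
# and tar signatures are assumed for illustration only.
RULES = [
    (0, b"DBPF", "application/x-maxis-dbpf"),
    (0, b"%PDF-", "application/pdf"),
    (257, b"ustar", "application/x-tar"),
]


def sniff(stream):
    """Return the MIME type of the first matching rule, or None."""
    for offset, expected, mime in RULES:
        # Jump straight to the offset the rule names and read only as
        # many bytes as the signature needs.
        stream.seek(offset)
        if stream.read(len(expected)) == expected:
            return mime
    return None
```

For example, sniff(io.BytesIO(b"DBPF" + bytes(92))) matches the first
rule, and a stream with no matching signature yields None.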

> - descriptions are translated, acronyms can be split-off and expanded

Yes, that's a good point.

> - mime-types have inheritance

Another good point.

> There's probably others, but that's already a good chunk of the
> problems we'd encounter if we used file's database.

One capability file(1)'s magic(5) method has that nobody has mentioned
is the ability to identify not only the MIME type, but also a more
descriptive comment on the file's contents. As an example, consider the
magic I wrote to detect Maxis Database Packed Files.

    https://github.com/file/file/blob/master/magic/Magdir/dbpf

If I just want to know the MIME type:

    $ file --mime-type SimCity_Audio_Banks.package
    SimCity_Audio_Banks.package: application/x-maxis-dbpf

But if I want to see a more descriptive comment:

    $ file SimCity_Audio_Banks.package
    SimCity_Audio_Banks.package: Maxis Database Packed File, version: 3.0, files: 83
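
To show how one rule can yield both answers, here is a small Python
sketch. It is an assumption-laden illustration, not the actual magic:
the describe() name is hypothetical, and the header layout (little-endian
32-bit major version at offset 4, minor version at offset 8, file count
at offset 36) is my assumption and should be checked against the dbpf
magic linked above.

```python
import struct

MIME = "application/x-maxis-dbpf"  # as reported by file --mime-type above


def describe(header, mime_only=False):
    """Return the MIME type or a description, or None if not DBPF."""
    # Need the 4-byte signature plus every field read below.
    if len(header) < 40 or header[:4] != b"DBPF":
        return None
    if mime_only:
        return MIME
    # Assumed layout: unsigned 32-bit little-endian fields.
    major, minor = struct.unpack_from("<II", header, 4)
    (files,) = struct.unpack_from("<I", header, 36)
    return (
        f"Maxis Database Packed File, version: {major}.{minor}, "
        f"files: {files}"
    )
```

The same 40 bytes answer both questions: mime_only=True stops at the
signature, while the default goes on to decode the fields that make up
the descriptive comment.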

What I've started to figure out is that file(1) / magic(5) are meant to
be used directly by users, as well as by other applications through the
API for magic(5). shared-mime-info, by contrast, is designed to be used
primarily by other applications. It's rare for users to try to identify
a file with 'gio info foo'.

That might justify keeping the two mechanisms as distinct projects, but
I think that, at the very least, the redundant magic itself should be
consolidated.

-- 
Kip Warner | Senior Software Engineer
OpenPGP signed/encrypted mail preferred
https://www.cartesiantheatre.com