extended attribute standardization

Sun Nov 19 17:55:25 EET 2006

Jos van den Oever wrote:
> 2006/11/19, Michael Burschik <Michael.Burschik at gmx.de>:
>> Doesn't this require some kind of parser plugin for each and every type
>> of file you intend to index? So instead of duplicating metadata, you are
>> duplicating parsers? Or are you able to reuse the parser of the
>> application that actually wrote the file? If you want to query a large
>> number of files, then speed will be of considerable importance. Parsing
>> complex files will certainly take a lot longer than reading extended
>> attributes.
> Yes it does require such parsers. But since many apps share file
> formats this is less of a problem than it seems. Sharing code between
> app and parser would be great but is not required.
> If you want to store the extracted information, it's better to store
> it in an index instead of extended attributes. Storing in an index
> enables searching and allows for easier management of the data taken
> up by the duplicated data.
Do you mean some kind of relational database? If not, I'm not so sure
that searching the index would be faster than searching extended
attributes. After all, the file system is already a very efficient
database, even though it is not optimized for the kind of access
patterns we are talking about.
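
To make the two access patterns concrete, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not a benchmark:
a plain dictionary stands in for per-file extended attributes, an
in-memory SQLite table stands in for the index, and every name and data
value below is hypothetical.

```python
import sqlite3

# Hypothetical metadata, standing in for attributes stored with each file.
FILES = {
    "/music/a.mp3": {"artist": "Alice", "year": "2004"},
    "/music/b.mp3": {"artist": "Bob", "year": "2006"},
    "/photos/c.jpg": {"camera": "X100", "year": "2006"},
}


def scan_attributes(files, key, value):
    """Per-file lookup: visit every file and inspect its attributes."""
    return sorted(p for p, attrs in files.items() if attrs.get(key) == value)


def build_index(files):
    """Central index: one row per (path, key, value), indexed on (key, value)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE meta (path TEXT, key TEXT, value TEXT)")
    db.execute("CREATE INDEX meta_kv ON meta (key, value)")
    db.executemany(
        "INSERT INTO meta VALUES (?, ?, ?)",
        [(p, k, v) for p, attrs in files.items() for k, v in attrs.items()],
    )
    return db


def query_index(db, key, value):
    """Index lookup: the database finds matching rows without visiting files."""
    rows = db.execute(
        "SELECT path FROM meta WHERE key = ? AND value = ? ORDER BY path",
        (key, value),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(scan_attributes(FILES, "year", "2006"))
    print(query_index(build_index(FILES), "year", "2006"))
```

Both calls return the same paths. The difference is that the scan has to
touch every file for each query, while the index answers from one
structure that must be kept in sync with the files.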
>
>> In an ideal world, the application would save the metadata in some
>> easily accessible format and place whenever the file is modified. In
>> this context, I would hesitate to call ID3 or EXIF tags easily
>> accessible, for example.
> There are plenty of libs to get that data out. Also the specs are 
> rather simple.
>
> Cheers,
> Jos
>
The individual specs may be simple, but the sheer number of specs will 
certainly introduce complexity.

Regards,

Michael Burschik