Xesam meta-meta-data spec needs attention.

Evgeny Egorochkin phreedom.stdin at gmail.com
Tue May 1 20:29:46 EEST 2007


On Tuesday 01 May 2007 17:55:26 Mikkel Kamstrup Erlandsen wrote:
> > > > $MIN_CARDINALITY
> > > >   Minimum cardinality. Minimum number of properties of this type you
> > > >   must set for a given file.
> > > >   Lets specify mandatory properties. Default is 0.
> > >
> > > Is there any example of a mandatory property? Does it even make sense?
> >
> > File name or URI?
>
> I don't see why they have to be mandatory. Not everything comes from a
> file.
>
> In the search API, global identifiers for objects (such as a mandatory URI)
> are specifically avoided. My opinion is that we shouldn't *force* URIs or
> any mandatory property onto any object.

The intent was to make life easier for apps by guaranteeing the existence
of some basic properties. However, I do agree that the list would be
extremely short, if not empty.
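
To make the semantics concrete, here is a minimal sketch of what a minimum-cardinality check could look like. The function name, the property names, and the choice of which property gets a minimum of 1 are all hypothetical; the thread does not settle whether any property should be mandatory.

```python
from collections import Counter

# Hypothetical declarations: property name -> minimum cardinality.
# The draft's default is 0, i.e. nothing is mandatory unless stated.
MIN_CARDINALITY = {"title": 0, "uri": 1}  # illustrative values only


def missing_properties(properties, min_cardinality):
    """Return the names of properties set fewer times than their minimum.

    `properties` is a list of (name, value) pairs for one object.
    """
    counts = Counter(name for name, _value in properties)
    return sorted(
        name
        for name, minimum in min_cardinality.items()
        if counts[name] < minimum
    )
```

With every minimum left at the default of 0, the check can never fail, which matches the observation that the list of mandatory properties may well be empty.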

> > > > $INDEXING
> > > >   Values=fulltext, atomic, none, TBD. Default = TBD.
> > >
> > > Can you please describe what these values mean? I think I get it, but
> > > let's be sure :-)
> >
> > I'm not sure myself :)
> > Fulltext is supposed to be indexed for full-text search.
>
> So fulltext is text that should be tokenized, filtered for stop words,
> stemmed, and indexed, right?
>
> > Atomic is for numbers and enum-like fields (controlled vocabulary).
>
> Atomics should be indexed as one searchable chunk. No stemming, splitting,
> or filtering. Just stick it in the index?

Exactly.

> What about TBD?

TBD = To Be Determined. It is an internal doc artefact; I meant that the
list of values might need extending.

> > None is self-explanatory.
>
> Yes - don't index this field. But I take it you can still retrieve its
> value... Otherwise there is no reason to define the field in the first
> place...

Right.
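
As a rough illustration of the three values agreed on above, the following sketch turns one property value into index terms. The stop-word list and the "stemmer" are deliberately crude stand-ins, and `index_terms` is a hypothetical helper, not part of any Xesam draft.

```python
import re

STOP_WORDS = {"a", "an", "and", "of", "the"}  # tiny illustrative list


def naive_stem(token):
    """Crude stand-in for a real stemmer: strip one trailing 's'."""
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


def index_terms(value, indexing):
    """Return the terms that would be put in the index for one value."""
    if indexing == "fulltext":
        # Tokenize, drop stop words, stem.
        tokens = re.findall(r"\w+", value.lower())
        return [naive_stem(t) for t in tokens if t not in STOP_WORDS]
    if indexing == "atomic":
        # One searchable chunk, stored exactly as given.
        return [value]
    if indexing == "none":
        # Not searchable; the value itself is still stored elsewhere
        # and can be retrieved.
        return []
    raise ValueError(f"unknown indexing mode: {indexing!r}")
```

Note that "none" only affects what reaches the index; retrieving the stored value is a separate concern that this sketch leaves out.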

> > I'd appreciate feedback on this. Is it always possible to derive this from
> > field type or not?
>
> I don't think you can always derive it. Think of some app that stores some
> unique string ID alongside all objects. It might want to be able to search
> for these IDs, but it surely doesn't want them tokenized just because they
> might contain a space. In this case the app would want to use
> INDEXING=atomic.

Reasonable. I propose to make atomic the default.
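
The ID case can be shown with a toy index. This is only a sketch: `FieldSpec`, `build_index`, the field name `app:id`, and the sample IDs are made up for illustration, and "fulltext" is reduced to whitespace splitting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    indexing: str = "atomic"  # assumption: the default proposed in this thread


def build_index(values, spec):
    """Map each index term to the set of object keys whose value produced it."""
    index = {}
    for key, value in values.items():
        if spec.indexing == "fulltext":
            terms = value.split()  # whitespace tokenization only
        elif spec.indexing == "atomic":
            terms = [value]        # the whole value is one term
        else:
            terms = []             # "none": nothing is indexed
        for term in terms:
            index.setdefault(term, set()).add(key)
    return index


# Hypothetical IDs that happen to contain a space.
ids = {"doc1": "INV 0042", "doc2": "INV 0043"}
atomic_index = build_index(ids, FieldSpec("app:id"))
fulltext_index = build_index(ids, FieldSpec("app:id", indexing="fulltext"))
```

Under atomic indexing the exact ID is a key in the index and identifies one object. Under the tokenizing variant the full ID is no longer a term at all, and the shared prefix `INV` matches both objects.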

> > I've intentionally omitted field properties IsWritable and
> > IsIntrinsic(Embedded). The reason for this is we never know in advance.
> > These values should be queried for specific files.
>
> Yes. We discussed this on IRC and ended up agreeing that it belongs as
> methods in a Metadata service API.
>
> This reminds me that we could use an IRC channel somewhere...

You can always come to #strigi to discuss or I'd gladly accept an 
invitation :)

Cheers,
Evgeny


