Xesam meta-meta-data spec needs attention.
Evgeny Egorochkin
phreedom.stdin at gmail.com
Tue May 1 20:29:46 EEST 2007
On Tuesday 01 May 2007 17:55:26 Mikkel Kamstrup Erlandsen wrote:
> > > $MIN_CARDINALITY
> > >
> > > > Minimum cardinality. Minimum number of properties of this type you
> > > > must set for a given file.
> > > > Lets you specify mandatory properties. Default is 0.
> > >
> > > Is there any example of a mandatory property? Does it even make sense?
> >
> > File name or URI?
>
> I don't see why they have to be mandatory. Not everything comes from a
> file.
>
> In the search API it is specifically avoided to use global identifiers for
> objects - as, for example, a mandatory URI would be. My opinion is that we shouldn't
> *force* URIs or any mandatory property onto any object.
The intent was to make life easier for apps by guaranteeing the existence of
some basic properties. However, I do agree that the list would be extremely
short, if not empty.
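For illustration only, a minimal sketch of how an app could check a
$MIN_CARDINALITY constraint. The property names, the table layout and the
helper name are hypothetical and not part of any Xesam draft:

```python
from collections import Counter

# Hypothetical field definitions: property name -> minimum cardinality.
# Names and values are illustrative, not taken from the Xesam spec.
MIN_CARDINALITY = {
    "title": 0,  # optional, which is the proposed default
    "uri": 1,    # example of a property declared mandatory
}


def missing_properties(properties, min_cardinality=MIN_CARDINALITY):
    """Return the names of properties set fewer times than required.

    `properties` is a list of (name, value) pairs describing one object.
    """
    counts = Counter(name for name, _ in properties)
    return sorted(
        name
        for name, minimum in min_cardinality.items()
        if counts[name] < minimum
    )
```

With every minimum left at the default of 0, the check never reports
anything, which matches the "no mandatory properties" position above.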
> > > > $INDEXING
> > > > Values=fulltext, atomic, none, TBD. Default = TBD.
> > >
> > > Can you please describe what these values mean? I think I get it, but
> > > let's be sure :-)
> >
> > I'm not sure myself :)
> > Fulltext is supposed to be indexed for full-text search.
>
> So fulltext is text that should be tokenized, filtered for stop words,
> stemmed, and indexed, right?
>
> > Atomic is for numbers, enum-like fields(controlled vocabulary)
>
>
> Atomics should be indexed as one searchable chunk. No stemming, splitting,
> filtering. Just stick it in the index..?
Exactly.
> What about TBD?
TBD = To Be Determined. Internal doc artefact. I meant the list might need
extension.
> None is self-explanatory.
>
> Yes - don't index this field. But I take it you can still retrieve the
> value of it... Or else there is no reason defining the field in the first
> place...
Right.
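To pin down the three values discussed so far, here is a rough sketch of
what an indexer might do with each one. The function name and the stop-word
list are made up for the example, and stemming is left out to keep it short:

```python
import re

# Illustrative stop-word list; a real indexer would also stem the tokens.
STOP_WORDS = {"a", "an", "the", "of", "and"}


def index_terms(value, indexing):
    """Return the terms that would go into the index for one field value."""
    if indexing == "fulltext":
        # Tokenize, lowercase and drop stop words.
        tokens = re.findall(r"\w+", value.lower())
        return [t for t in tokens if t not in STOP_WORDS]
    if indexing == "atomic":
        # One searchable chunk: no splitting, stemming or filtering.
        return [value]
    if indexing == "none":
        # Not searchable; the value is still stored and can be retrieved.
        return []
    raise ValueError(f"unknown indexing mode: {indexing!r}")
```

For "none" the value is simply kept out of the index. Storing it for
retrieval is a separate concern that this sketch does not cover.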
> > I'd appreciate feedback on this. Is it always possible to derive this from
> > field type or not?
>
> I don't think you can derive it always. Think of some app that stores some
> unique string ID alongside all objects. It might want to be able to search
> for these IDs, but it surely doesn't want them tokenized just because they
> might contain a space. In this case the app would want to use
> INDEXING=atomic.
Reasonable. I propose to make atomic the default.
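As a sketch of what that default would mean in practice, assuming field
definitions are read into plain key/value mappings (the mapping layout and
the helper name are hypothetical):

```python
# Default proposed in this thread; not yet part of any agreed spec.
DEFAULT_INDEXING = "atomic"


def indexing_for(field_definition):
    """Return the indexing mode of a field, falling back to the default."""
    return field_definition.get("INDEXING", DEFAULT_INDEXING)
```

A field that says nothing about indexing, such as an app-specific string
ID, would then be indexed as one chunk even if it contains a space.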
> > I've intentionally omitted field properties IsWritable and
> > IsIntrinsic(Embedded). The reason for this is we never know in advance.
> > These values should be queried for specific files.
>
> Yes. We discussed this on IRC and ended up agreeing that it belongs as
> methods in a Metadata service API.
>
> This reminds me that we could use an IRC channel somewhere...
You can always come to #strigi to discuss or I'd gladly accept an
invitation :)
Cheers,
Evgeny