[Xesam] Wrapping up for Xesam Search Spec RC3

Michael Albinus michael.albinus at gmx.de
Wed Aug 20 08:24:44 PDT 2008


"Mikkel Kamstrup Erlandsen" <mikkel.kamstrup at gmail.com> writes:

Hi,

> Let me first say that I/we are planning a workshop to review and
> document the ontology at the hackfest coming up in September. This is
> post RC3 however, so your comments should definitely be handled
> before that. My general opinion is that all descriptions need to be a
> lot more elaborate than just a few words, like they are now.

Maybe you can tell me when you plan this review session? If you don't
object, I could participate in this session (I live in Berlin :-)

>> - Sometimes, xesam:summary or xesam:snippet return "highlighted" text
>>  (hits enclosed by ..., for example). Is it possible to get an
>>  indication for this? It influences how the summary (or snippet) is
>>  visualized by the Xesam client.
>
> xesam:summary will contain a pregenerated summary of the text. Either
> by extracting it from a metadata field inside the file or by
> extracting it from some chunk of text inside the file. I don't know
> if we should set any standard for the contents of this. Plain UTF-8
> probably. xesam:snippet is another matter. It is always generated on
> the fly, and highlights the matching search terms if the engine knows
> how to do that.

I've taken the xesam:summary example from your xesam-yahoo-service
script (of xesam-tools). The summary marks hits with HTML's bold
tag, so it isn't plain UTF-8 but marked-up text. That is something a
client should know.
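
To illustrate what a client has to do when it gets such a summary, here
is a minimal Python sketch. It assumes the hits are wrapped in plain
<b>...</b> tags, as in the example above; the names HighlightExtractor
and split_summary are made up for this illustration and are not part of
xesam-tools or the spec.

```python
from html.parser import HTMLParser


class HighlightExtractor(HTMLParser):
    """Collect the plain text and the terms wrapped in <b>...</b>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_parts = []  # all text, markup removed
        self.hits = []        # text found inside <b> tags
        self._depth = 0       # nesting depth of open <b> tags

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        self.text_parts.append(data)
        if self._depth > 0:
            self.hits.append(data)


def split_summary(summary):
    """Return (plain_text, highlighted_terms) for a summary string.

    Assumption: highlighting uses HTML <b> tags only.
    """
    parser = HighlightExtractor()
    parser.feed(summary)
    parser.close()
    return "".join(parser.text_parts), parser.hits
```

A client that cannot render markup could show only the plain text, while
one that can may use the list of hits to apply its own highlighting.
Without an indication from the engine, the client can only guess whether
a "<b>" in the summary is markup or literal text.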

>> - Definitely for post-1.0-release: I miss attributes describing hits
>>  in a bug database, like Debian BTS, Bugzilla, ...
>
> What do we really miss apart from a xesam:Bug content category?

I have written a small search engine accessing Debian's BTS. I use
xesam:keyword, xesam:owner, xesam:title, xesam:url, xesam:mimeType,
and xesam:sourceModified. Additionally, I have my own ontology with
these fields:

debbugs:foundDate -- date when the bug has been found
dateTime; maybe xesam:sourceCreated could be used.

debbugs:fixedDate -- date when the bug has been closed
dateTime

debbugs:foundVersions -- software versions the bug has been reported for
list of strings

debbugs:fixedVersions -- software versions the bug is fixed in
list of strings

debbugs:package -- name of the software package
string (like "xesam", "emacs")

debbugs:status -- status of bug fixing
string (like "pending", "fixed", "moreinfo", "notabug",
"unreproducible", "wontfix")

debbugs:originator -- name and email address of submitter
string

debbugs:severity -- severity of the bug
enumeration (important, normal, minor, wishlist); alternatively, it
could also be an integer

I won't claim that it is a complete bug ontology. These are just the
fields I found useful in the given case.
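
For illustration, the fields above could be modelled on the client side
roughly like this. It is only a sketch: the class names DebbugsHit and
Severity, the to_fields helper, and the choice to pass dateTime values as
ISO 8601 strings are assumptions made for the example, not something my
engine or the Xesam spec prescribes.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class Severity(Enum):
    """The severity values listed above, modelled as an enumeration."""
    IMPORTANT = "important"
    NORMAL = "normal"
    MINOR = "minor"
    WISHLIST = "wishlist"


@dataclass
class DebbugsHit:
    """One bug report hit carrying the debbugs: fields."""
    package: str                      # e.g. "xesam", "emacs"
    status: str                       # e.g. "pending", "fixed", "wontfix"
    originator: str                   # name and email address of submitter
    severity: Severity
    found_date: Optional[datetime] = None
    fixed_date: Optional[datetime] = None
    found_versions: List[str] = field(default_factory=list)
    fixed_versions: List[str] = field(default_factory=list)

    def to_fields(self):
        """Map the hit to debbugs: field names, skipping unset dates.

        Assumption: dateTime values are passed as ISO 8601 strings.
        """
        fields = {
            "debbugs:package": self.package,
            "debbugs:status": self.status,
            "debbugs:originator": self.originator,
            "debbugs:severity": self.severity.value,
            "debbugs:foundVersions": list(self.found_versions),
            "debbugs:fixedVersions": list(self.fixed_versions),
        }
        if self.found_date is not None:
            fields["debbugs:foundDate"] = self.found_date.isoformat()
        if self.fixed_date is not None:
            fields["debbugs:fixedDate"] = self.fixed_date.isoformat()
        return fields
```

Keeping the severity as an enumeration rather than an integer makes the
values self-describing; an integer would make ordering comparisons
easier, which is the trade-off mentioned above.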

Best regards, Michael.


