[systemd-devel] RFC: filter and search journalctl

Mon Aug 17 08:34:10 PDT 2015

----- Original Message -----
> From: "Zbigniew Jędrzejewski-Szmek" <zbyszek at in.waw.pl>
> To: "Anne Mulhern" <amulhern at redhat.com>
> Cc: systemd-devel at lists.freedesktop.org, "Sebastian Schindler" <sebastian.schindler at travelping.com>
> Sent: Monday, August 17, 2015 10:45:11 AM
> Subject: Re: [systemd-devel] RFC: filter and search journalctl
> 
> On Mon, Aug 17, 2015 at 10:24:22AM -0400, Anne Mulhern wrote:
> > 
> > ----- Original Message -----
> > > From: "Zbigniew Jędrzejewski-Szmek" <zbyszek at in.waw.pl>
> > > To: "Sebastian Schindler" <sebastian.schindler at travelping.com>
> > > Cc: systemd-devel at lists.freedesktop.org
> > > Sent: Saturday, August 8, 2015 3:48:30 PM
> > > Subject: Re: [systemd-devel] RFC: filter and search journalctl
> > > 
> > > On Fri, Aug 07, 2015 at 11:53:13AM +0200, Sebastian Schindler wrote:
> > > > Grep-ing seems to be the only solution to find log entries if you don't
> > > > fully know what you're looking for. For example: You want to see all
> > > > entries containing a certain MESSAGE that gets enriched with additional
> > > > information during the logging process:
> > > > 
> > > > MESSAGE=host <HOST> has closed connection <CONNECTION_ID>
> > > This is a bit contentious, but at least I would like to see some
> > > grep functionality implemented directly in journalctl.
> > > 
> > 
> > I am late to the party, but I think it is obvious that the "right" way for
> > this to be achieved, in a perfect world, is that this log entry be
> > accompanied by a MESSAGE_ID, and HOST and CONNECTION_ID keys, and a catalog
> > entry that, combined with the keys, generates the above message so that
> > grepping is entirely unnecessary.
> > 
> > It is true that this perfect world is not just around the corner, or
> > anything like that, but it is technically possible.
> > 
> > I agree that grepping would be handy for me, right now, for just the
> > reasons stated in the original message.
> > 
> > I wonder if it would be reasonable for journalctl to supply the
> > (additional) fields that are guaranteed to be associated with a MESSAGE_ID
> And what would happen when the entry is "malformed", i.e. missing some fields?
> Would journald reject the message? I don't think this would be useful to
> anyone at all. Instead the readers of the message should gracefully adapt
> to missing fields.
> 

I think it would be wrong for journald to reject a message that does not supply
all the declared fields. It would also be wrong for journalctl to crash when given the
--catalog flag if the fields are missing. I don't know what it does right
now, because it is not that easy a situation to engineer, AFAICT. I guess the
best thing would be to supply a special catalog message indicating that an
error had occurred when trying to construct a catalog message. Something
that named the missing fields that caused the error would be nice,
just so long as that didn't turn into an infinite loop, somehow. If somebody
knows what journalctl actually does in this situation, please pass that information along.
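
To make the idea above concrete, here is a minimal sketch of a reader that
expands a catalog-style template and degrades gracefully when fields are
missing. It assumes catalog text uses @FIELD@ placeholders; the function name
render_catalog_text and the wording of the appended note are hypothetical, and
this is not a description of what journalctl itself does.

```python
import re

# Catalog-style placeholders are assumed to look like @FIELD_NAME@.
PLACEHOLDER = re.compile(r"@([A-Z0-9_]+)@")


def render_catalog_text(template, entry):
    """Expand @KEY@ placeholders from entry without failing on missing keys.

    Returns (text, missing). Placeholders whose key is absent are left
    untouched, and a short note naming them is appended to the text.
    """
    missing = []

    def substitute(match):
        key = match.group(1)
        if key in entry:
            return str(entry[key])
        if key not in missing:
            missing.append(key)
        return match.group(0)  # keep the placeholder visible

    # re.sub makes a single pass and never rescans substituted values,
    # so a field value that itself contains "@X@" cannot cause a loop.
    text = PLACEHOLDER.sub(substitute, template)
    if missing:
        text += "\n(catalog note: entry lacks fields: " + ", ".join(missing) + ")"
    return text, missing


if __name__ == "__main__":
    template = "host @HOST@ has closed connection @CONNECTION_ID@"
    print(render_catalog_text(template, {"HOST": "alpha"})[0])
```

The note is plain text rather than another catalog lookup, which is one way
to keep the error report from triggering further catalog processing.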

Other consumers of log entries should behave in whatever manner seems best to them
if a declared field is missing.

What I'm looking for here is the best way for an application that wants to use the
journaling facilities to publish useful information about its log entry API. The
advantage of publishing it in the manner I've suggested is that journalctl could
be very helpful about telling consumers of the journal what keys they should expect
to see. Something like a (currently nonexistent) option:
    journalctl --list-keys <MESSAGE_ID>
and maybe even a journal API for programmatic access to this information would be
very nice.
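
Until something like that exists, a consumer can only approximate the key list
from entries that have already been logged. The sketch below does that from
JSON-formatted entries, one object per line; the function name
keys_for_message_id is hypothetical. Note that it reports observed keys, not
declared ones, which is exactly the gap a published API would close.

```python
import json


def keys_for_message_id(json_lines, message_id):
    """Summarize which keys accompany a given MESSAGE_ID.

    json_lines is an iterable of strings, each holding one JSON object.
    Returns (always, sometimes): sorted lists of the keys present in every
    matching entry and of the keys present in only some of them.
    """
    always = None
    seen = set()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("MESSAGE_ID") != message_id:
            continue
        keys = set(entry)
        seen |= keys
        always = keys if always is None else always & keys
    always = always or set()
    return sorted(always), sorted(seen - always)


if __name__ == "__main__":
    import sys

    # Usage: <json entries on stdin> | python list_keys.py <MESSAGE_ID>
    always, sometimes = keys_for_message_id(sys.stdin, sys.argv[1])
    print("always:   ", " ".join(always))
    print("sometimes:", " ".join(sometimes))
```

The input could come from "journalctl -o json MESSAGE_ID=<id>". Fields that
journald adds itself (those beginning with an underscore) will show up in the
result as well.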

Of course, there are other ways for an application to publish its log entry API.
But, it does seem odd for it to do this outside the structures that systemd has
already set up, when it is an API for journal entries.

Since this really is an API, with all the usual issues about versioning and so forth,
it is essential that the information be published somewhere, not laboriously extracted
from a scan of the code by potential log entry consumers.

> ...
> > Is it reasonable to preface any MESSAGE_ID specific keys with the
> > MESSAGE_ID, e.g., "9bb33380-fbfa-4d5b-88b5-6e6bb8a39124:KEY"? Or perhaps a
> > double underscore, e.g., "__KEY" would do the trick?
> MESSAGE_ID is a contract between the writers of the message and the readers
> of the message. The first say: messages with this ID mean ... and have the
> fields ... . There is no need to mark the fields in any other way,
> except by documentation or custom.
> 
> Zbyszek
> 

The reason this seems important to me is the problem of a shared namespace.
These MESSAGE_ID UUIDs are "globally" registered, since there is a high enough probability
that every UUID is different that they are, to all intents and purposes, unique.
But the keys do not have this advantage. In this shared namespace, it would
be easy enough for journald to "steal" a key that was already in use by another
application. This would generate all the obvious and usual problems, most probably
forcing the non-journald application to change its log entry API in response.
I'm wondering if there is a way to just avoid this problem by establishing some sort
of convention at the start. Either suggestion that I made above would have a chance of
working, although incorporating the message id value into the key seems more robust.
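
As an illustration of the message-id variant, here is a minimal sketch of how
a writer might derive such keys. The naming scheme (a letter prefix, the id as
uppercase hex, an underscore, then the key) is purely hypothetical. It also
assumes that journal field names are limited to A-Z, 0-9 and underscore, may
not start with an underscore or a digit, and are at most 64 characters long;
if that assumption holds, the literal "uuid:KEY" and "__KEY" spellings would
need adjusting along these lines.

```python
import re
import uuid

# Assumed field name rules: A-Z, 0-9 and underscore only, first character
# a letter, at most 64 characters in total.
VALID_FIELD = re.compile(r"[A-Z][A-Z0-9_]{0,63}")


def namespaced_key(message_id, key, prefix="M"):
    """Build a field name that embeds the MESSAGE_ID (hypothetical scheme).

    message_id may be written with or without hyphens. The letter prefix
    keeps the name from starting with a digit.
    """
    hex_id = uuid.UUID(message_id).hex.upper()
    name = f"{prefix}{hex_id}_{key.upper()}"
    if not VALID_FIELD.fullmatch(name):
        raise ValueError(f"not a valid field name: {name!r}")
    return name


if __name__ == "__main__":
    print(namespaced_key("9bb33380-fbfa-4d5b-88b5-6e6bb8a39124", "host"))
```

The 32 hex digits use up half of the assumed 64-character budget, which is
the main cost of this variant compared with a short fixed prefix.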

- mulhern