[systemd-devel] [ANNOUNCE] Journal File Format Documentation

Lennart Poettering lennart at poettering.net
Tue Oct 23 11:40:54 PDT 2012


On Tue, 23.10.12 19:11, Ciprian Dorin Craciun (ciprian.craciun at gmail.com) wrote:

> 
> On Sun, Oct 21, 2012 at 1:05 AM, Lennart Poettering
> <lennart at poettering.net> wrote:
> > Heya,
> >
> > I have now found the time to document the journal file format:
> >
> > http://www.freedesktop.org/wiki/Software/systemd/journal-files
> >
> > Comments welcome!
> 
> 
>     (Replying directly to this as I want to start another "sub-thread"...)
> 
>     I'm currently searching for a logging system that has the
> following feature, which I'm guessing could also be beneficial for
> systemd on larger systems:
>     * I have multiple processes that I want to log individually; by
> multiple I mean about 100+ in total (not necessarily on the same
> system);
>     * moreover these processes are quite dynamic (as in spawn /
> terminate) hourly or daily;
>     * I need to control the retention policy per process not per entire system;
>     * if needed I want to be able to archive these logs in a
> per-process (or process type) basis;
>     * as bonus I would like to be able to "migrate" the logs for a
> particular process to another system;
>     (In case anyone is wondering what I'm describing, it is a PaaS
> logging system similar to Heroku's logplex, etc.)

The journal currently cannot do this for you, but what it already can do
is split up the journal per user. This is done by default only for login
users (i.e. actual human users), but via the SplitMode= setting in
journald.conf it can be enabled for system users as well, or turned off
entirely. We could extend this switch to allow other split-up schemes.

But note that the price you pay for interleaving files on display grows
with how much you split things up (O(n), with n being the number of
files to interleave). Hence we are a bit conservative here: we don't
want to push people towards splitting things up too much unless they
have a really good reason to.
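To illustrate where that O(n) comes from: showing n journal files in one
interleaved view means picking, for every entry displayed, the oldest
entry among the current heads of all n files. A toy sketch of that idea
(this is illustrative pseudocode in Python, not the journal's actual
implementation):

```python
def interleave(files):
    """Merge n lists of (timestamp, message) tuples, each already sorted.

    For every entry emitted we scan the current head of all n files to
    find the oldest candidate -- O(n) work per displayed entry, which is
    why splitting the journal into many files makes interleaved display
    more expensive.
    """
    heads = [0] * len(files)          # per-file read cursor
    merged = []
    while True:
        # find the file whose next unread entry has the smallest timestamp
        best = None
        for i, f in enumerate(files):
            if heads[i] < len(f):
                if best is None or f[heads[i]][0] < files[best][heads[best]][0]:
                    best = i
        if best is None:
            break                     # every file is exhausted
        merged.append(files[best][heads[best]])
        heads[best] += 1
    return merged

logs = [
    [(1, "a1"), (4, "a2")],           # file A
    [(2, "b1"), (3, "b2")],           # file B
]
print(interleave(logs))  # entries in global timestamp order
```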

BTW, are you sure you actually need to split up by processes? Wouldn't
services be more appropriate?

>     The parallel with systemd:
>     * think instead of my processes, of user-sessions and services; (I
> want to keep some service's (like `sshd`) logs for more time than I
> want it for DHCP, etc.);
>     * then think about having a journal collecting journals from
> multiple machines in a central repository;
> 
>     As such, wouldn't a "clustering" key (like service type, or
> service type + pid, etc.) make sense? This would imply:
>     * splitting storage based on this "clustering key"; (not
> necessarily one per file, but maybe using some consistent hashing
> technique, etc.)
>     * having the clustering key as a parameter for querying to
> restrict index search, etc.

Not sure I grok this.

>     Of course all what I've described in the beginning could be
> "emulated" with the current journal, either by introducing a special
> field, or by using the journal library with multiple files (which I
> haven't checked if it is possible).

In general our recommendation is to write as much as possible into the
journal as payload, and do filtering afterwards rather than before;
i.e. the journal should be the centralization point for things, where
different views enable different uses.
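As a sketch of that "filter afterwards" idea: entries carry structured
fields (modeled here on journal metadata fields like _SYSTEMD_UNIT), and
a "view" is just a filter over the centralized stream. The code below is
a hypothetical illustration, not the journal library's API:

```python
# One centralized stream of entries; each entry is a dict of
# field name -> value, loosely modeled on journal fields.
entries = [
    {"_SYSTEMD_UNIT": "sshd.service",   "MESSAGE": "session opened"},
    {"_SYSTEMD_UNIT": "dhcpcd.service", "MESSAGE": "lease renewed"},
    {"_SYSTEMD_UNIT": "sshd.service",   "MESSAGE": "session closed"},
]

def view(entries, **match):
    """Return only entries whose fields equal all given key=value pairs."""
    return [e for e in entries
            if all(e.get(k) == v for k, v in match.items())]

# A per-service "view" carved out of the central store after the fact:
sshd_view = view(entries, _SYSTEMD_UNIT="sshd.service")
print(sshd_view)
```

The point of the sketch: retention or archival policy could then be
applied to such views rather than baked into how the data is written.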

The main reason the current per-user split logic exists is access
control: by splitting things up into files we can simply use FS ACLs
for this, instead of introducing a centralized arbitration engine that
enforces access rights.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

