[systemd-devel] [RFC/PATCH] journal over the network

Mon Nov 19 17:21:54 PST 2012

On Mon, 19.11.12 01:21, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:

Heya,

I like your work!

> The program (called systemd-journal-remoted now, but I'd be happy to
> hear suggestions for a better name) listens on sockets (either from

Since this is also useful when run on the command line I'd really prefer
to drop the "d" suffix, i.e. "systemd-journal-remote" sounds like a good
name for it.

> socket activation, or specified on the command line with --listen=),
> or reads stdin (if given --stdin), or uses curl to receive events from
> a systemd-journal-gatewayd instance (with --url=). So it can be used
> a server, or as a standalone binary.

What precisely does --listen= speak?

My intention was to speak only HTTP for all of this, so that we can
nicely work through firewalls.

> Messages must be in the export format. They are parsed and stored
> into a journal file. The journal file is /var/log/journal/external-*.journal
> by default, but this can be overridden by commandline options
> (--output).

Sounds good!

I think it would make sense to drop things into
/var/log/journal/<hostname>/*.journal by default. The hostname would
have to be determined from the URL the user specified on the command
line. Ideally we'd use the machine ID here, but since the machine ID is
hardly something the user should specify on the command line (and we
cannot just take the machine ID supplied from the other side, because we
probably should not trust that and hence allow it to tell us to
overwrite another host's data), the hostname is the next best
thing. Currently libsystemd-journal will ignore directories that are
not machine IDs when browsing, but we could easily drop that limitation.
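
To make the directory idea a bit more concrete, here is a rough sketch
of how the default output directory could be derived. The helper name
journal_dir_for_host() is made up for illustration and is not part of
the patch. It assumes the hostname was already taken from what the user
typed on the command line, and it refuses anything that could escape
/var/log/journal:

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: build the default output
 * directory for a remote host. The name must come from the command
 * line, never from data sent by the remote side. */
static int journal_dir_for_host(const char *host, char *buf, size_t len) {
        const char *p;
        int n;

        /* Reject empty names and names starting with a dot ("." and "..") */
        if (!host || host[0] == '\0' || host[0] == '.')
                return -EINVAL;

        /* Allow only characters that are valid in a DNS hostname, so the
         * name cannot contain '/' and point outside /var/log/journal */
        for (p = host; *p; p++)
                if (!isalnum((unsigned char) *p) && *p != '-' && *p != '.')
                        return -EINVAL;

        n = snprintf(buf, len, "/var/log/journal/%s", host);
        if (n < 0 || (size_t) n >= len)
                return -ENAMETOOLONG;

        return 0;
}
```

Note that this simple character check would also reject IPv6 literals,
so a real implementation would need a separate rule for those.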

> Push mode is not implemented... (but it would be a separate program
> anyway).

My intention was actually to keep this in the same tool. So that we'd
have for input and output:

A) HTTP GET
B) HTTP POST
C) SSH PULL (would invoke "journalctl -o export" via ssh)
D) SSH PUSH (would invoke systemd-journal-remote via ssh)
E) A directory for direct read access (which would allow us to merge multiple files into one with this tool)
F) A directory for direct write access (which is of course the default)

We should always require that either E or F is used, but in any
combination with any of the others.
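
The rule above is small enough to write down directly. This is only a
sketch; the enum and the function name are invented here, and I read
"either E or F" as inclusive, since copying from one directory to
another uses both:

```c
#include <stdbool.h>

/* Hypothetical names for the endpoint kinds from the list above */
typedef enum EndpointKind {
        ENDPOINT_HTTP,        /* A as a source, B as a destination */
        ENDPOINT_SSH,         /* C as a source, D as a destination */
        ENDPOINT_DIRECTORY    /* E as a source, F as a destination */
} EndpointKind;

/* At least one side must be a local journal directory */
static bool combination_is_valid(EndpointKind source, EndpointKind dest) {
        return source == ENDPOINT_DIRECTORY || dest == ENDPOINT_DIRECTORY;
}
```

With this, HTTP GET into a directory or a directory pushed out via SSH
passes, while relaying HTTP straight to SSH is refused.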

> Examples:
>   journalctl -o export | systemd-journal-remoted --stdin -o /tmp/dir/

Sounds pretty cool. Pretty close to what I'd have in mind.

To make this even shorter I'd suggest though that we take two normal
args for source and dest, and that "-" is used as stdin/stdout
respectively, and the dest can be omitted:

Hence:
        journalctl -o export | systemd-journal-remote - /tmp/dir
Or:
        systemd-journal-remote http://some.host:19531/entries?boot
Or:
        systemd-journal-remote http://some.host:19531/entries?boot /tmp/dir
Or:
        systemd-journal-remote /var/log/journal /tmp/dir

And so on...
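
Roughly, the positional handling for the invocations above could look
like the following. This is a sketch only: the Endpoints struct and
parse_endpoints() are names I made up, and the function expects just
the positional arguments that remain after option parsing:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for the two positional arguments */
typedef struct Endpoints {
        const char *source;     /* never NULL after a successful parse */
        const char *dest;       /* NULL means "use the default directory" */
        bool source_is_stdin;   /* source was given as "-" */
        bool dest_is_stdout;    /* dest was given as "-" */
} Endpoints;

/* n_args/args are the positional arguments left after option parsing */
static int parse_endpoints(int n_args, char **args, Endpoints *e) {
        /* The source is mandatory, the dest is optional */
        if (n_args < 1 || n_args > 2)
                return -EINVAL;

        e->source = args[0];
        e->dest = n_args == 2 ? args[1] : NULL;
        e->source_is_stdin = strcmp(e->source, "-") == 0;
        e->dest_is_stdout = e->dest && strcmp(e->dest, "-") == 0;

        return 0;
}
```

A NULL dest is left for the caller to resolve, so the default directory
logic stays in one place.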

>   remote-127.0.0.1~2000.journal
>   remote-multiple.journal
>   remote-stdin.journal
>   remote-http~~~some~host~19531~entries.journal
> 
> The goal was to have names containing the port number, so that it is
> possible to run multiple instances without conflict.

I'd always try to separate the "base name" out of a host spec. I.e. the
actual hostname of it. So that people can swap protocols as they
wish.

For example, I'd envision that people often begin with just pulling
things via SSH, but later on end up using HTTP more frequently, and
hence this should write to the same dir in /var/log/journal by default:

systemd-journal-remote lennart@somehost
systemd-journal-remote http://somehost:19531/entries?boot

Hmm, also, thinking about it I think we should only use the "base" URL
for the HTTP transport, and let the "/entries?boot" stuff be an
implementation detail we implicitly append.
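
Something along these lines is what I have in mind for the two points
above. It is a sketch with invented helper names: base_host() pulls the
bare hostname out of either a user@host spec or an HTTP URL, and
entries_url() appends the implicit "/entries?boot" to a base URL. IPv6
literals in brackets are not handled here:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the bare hostname from a host spec, so
 * that the SSH and HTTP forms of the same host give the same name. */
static int base_host(const char *spec, char *buf, size_t len) {
        const char *p, *at;
        size_t n;

        /* Skip a "scheme://" prefix if there is one */
        p = strstr(spec, "://");
        p = p ? p + 3 : spec;

        /* Skip a "user@" prefix, looking only before the first '/' */
        n = strcspn(p, "/");
        at = memchr(p, '@', n);
        if (at)
                p = at + 1;

        /* The host ends at the port separator, the path, or the end */
        n = strcspn(p, ":/");
        if (n == 0 || n >= len)
                return -EINVAL;

        memcpy(buf, p, n);
        buf[n] = '\0';
        return 0;
}

/* Hypothetical helper: turn a base URL into the URL actually requested */
static int entries_url(const char *base, char *buf, size_t len) {
        size_t l = strlen(base);
        int n;

        /* Drop trailing slashes so we never emit "//entries" */
        while (l > 0 && base[l - 1] == '/')
                l--;

        n = snprintf(buf, len, "%.*s/entries?boot", (int) l, base);
        if (n < 0 || (size_t) n >= len)
                return -ENAMETOOLONG;

        return 0;
}
```

Both the SSH spec and the HTTP URL for somehost yield "somehost" here,
which is what lets people swap protocols without changing the target
directory.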

> static int spawn_curl(char* url) {
>         int r;
>         char argv0[] = "curl";
>         char argv1[] = "-HAccept: application/vnd.fdo.journal";
>         char argv2[] = "--silent";
>         char argv3[] = "--show-error";
>         char* argv[] = {argv0, argv1, argv2, argv3, url, NULL};
> 
>         r = spawn_child("curl", argv);
>         if (r < 0)
>                 log_error("Failed to spawn curl: %m");
>         return r;
> }

My intention here was to use libneon, which is quite OK as HTTP client
library, and includes proxy support, and TLS and whatnot.

I am a bit conservative about pulling curl into this low level tool
(after all it includes a full gopher client!). I also want to be very
careful to only support HTTP, SSH and "file" as transports, and not any
random FTP or whatnot people might want to throw at this.
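
To illustrate the restriction, the source spec could be classified up
front, before any transport code runs. The enum and function below are
made-up names, and accepting https:// alongside http:// is my
assumption, as is treating any user@host spec as SSH:

```c
#include <string.h>

/* Hypothetical transport classification, not part of the patch */
typedef enum Transport {
        TRANSPORT_INVALID = -1,
        TRANSPORT_HTTP,
        TRANSPORT_SSH,
        TRANSPORT_FILE
} Transport;

static Transport transport_from_spec(const char *spec) {
        /* HTTP is the only URL scheme we accept (https is an assumption) */
        if (strncmp(spec, "http://", 7) == 0 ||
            strncmp(spec, "https://", 8) == 0)
                return TRANSPORT_HTTP;

        /* Any other URL scheme (ftp://, gopher://, ...) is refused */
        if (strstr(spec, "://"))
                return TRANSPORT_INVALID;

        /* Absolute paths are local journal directories */
        if (spec[0] == '/')
                return TRANSPORT_FILE;

        /* "user@host" is taken to mean SSH */
        if (strchr(spec, '@'))
                return TRANSPORT_SSH;

        return TRANSPORT_INVALID;
}
```

Whatever HTTP library ends up underneath, a whitelist like this keeps
the set of supported transports explicit.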

Otherwise looks pretty OK! Good work!

Lennart

-- 
Lennart Poettering - Red Hat, Inc.