[systemd-devel] RFC: filter and search journalctl
Goffredo Baroncelli
kreijack at libero.it
Sat Aug 8 12:13:42 PDT 2015
On 2015-08-07 11:53, Sebastian Schindler wrote:
> Hi all.
>
> The journal format offers powerful filter capabilities. Unfortunately this power
> is lost, if you have to use grep to find certain information.
> Example given (unscientific benchmark), count the number of entries for a
> (known) executable:
>
>
> $> journalctl --disk-usage
> Archived and active journals take up 344.1M on disk.
>
> $> time (journalctl _EXE=/usr/sbin/dhclient -o verbose | \
> grep -F _EXE=/usr/sbin/dhclient | wc -l)
> 1233
>
> real 0m0.111s
> user 0m0.007s
> sys 0m0.091s
>
>
> $> time (journalctl -o verbose | grep -F _EXE=/usr/sbin/dhclient | wc -l)
> 1233
>
> real 0m7.515s
> user 0m5.088s
> sys 0m6.896s
>
>
> This shows that using grep-piping is magnitudes slower than journalctl.
This is because the journal file is structured like a database: all fields are fully indexed, so journalctl is fast for queries of the form KEY=VALUE.
For other kinds of search (by regexp, or by value only), journalctl cannot use the indexes, so it is much slower because it has to process the whole journal.
I am curious: if you run
$ time ( grep -F _EXE=/usr/sbin/dhclient /var/log/journal/*/* | wc -l )
what is the resulting time?
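To illustrate the difference described above between an indexed KEY=VALUE match and a search that has to walk every entry, here is a minimal toy sketch. It is only a model of the idea: the in-memory list, the dictionary index and the function names are all assumptions made for illustration, and none of it reflects the actual on-disk journal format or journalctl's implementation.

```python
from collections import defaultdict

# Toy in-memory "journal": each entry is a dict of FIELD -> value.
# The data is made up for illustration.
entries = [
    {"_EXE": "/usr/sbin/dhclient", "MESSAGE": "DHCPREQUEST on eth0"},
    {"_EXE": "/usr/sbin/sshd", "MESSAGE": "Accepted publickey for root"},
    {"_EXE": "/usr/sbin/dhclient", "MESSAGE": "DHCPACK from 192.0.2.1"},
    {"_EXE": "/usr/sbin/cron", "MESSAGE": "session opened"},
]


def build_index(entries):
    """Map each exact 'FIELD=value' string to the positions of its entries."""
    index = defaultdict(list)
    for pos, entry in enumerate(entries):
        for field, value in entry.items():
            index[f"{field}={value}"].append(pos)
    return index


def indexed_match(index, entries, match):
    """Exact FIELD=value lookup: only the matching entries are touched."""
    return [entries[pos] for pos in index.get(match, [])]


def scan_substring(entries, needle):
    """Value-only search: every field of every entry has to be inspected."""
    return [e for e in entries if any(needle in v for v in e.values())]


index = build_index(entries)
by_index = indexed_match(index, entries, "_EXE=/usr/sbin/dhclient")
by_scan = scan_substring(entries, "dhclient")
print(len(by_index), len(by_scan))
```

Both calls return the same two entries, but the first one does a single dictionary lookup while the second one has to visit all entries. That is the same asymmetry, at toy scale, as in the timings quoted above.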
>
> Grep-ing seems to be the only solution to find log entries if you don't fully
> know what you're looking for. For example: You want to see all entries
> containing a certain MESSAGE that gets enriched with additional information
> during the logging process:
>
> MESSAGE=host <HOST> has closed connection <CONNECTION_ID>
>
> At the moment you have no option to look for this kind of information unless
> someone has set something like MESSAGE_ID you can filter for. There are several
> use cases using this pattern of thinking:
>
> * there's no option to show all set FIELD keys in the current journal, although
> this information is encoded in the header of each journal file
> * there's no support for negated filtering, you can't easily hide output of a
> certain unit which is creating too much noise
> * there's no support for regular expressions (except for the --unit option),
> this is especially problematic when you're looking for certain MESSAGEs
> * there's no option to show all entries containing a certain field
> * logical expressions are somewhat hard to read/write because parenthesis can't
> be used to enforce certain logical expressions
>
> What do you think about a query language for journalctl that allows more
> powerful search options? This could be introduced without ignoring the
> capabilities the journal file format has to offer. Are there maybe already plans
> to introduce something alike into journalctl? Do some people here have
> experience with query languages for such a use case? Things come to mind like
> PCAP filter, SPARQL, Lucene or the SPHINX Query Language.
>
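Some of the wishes quoted above (regular expressions on MESSAGE, negated matches, "entry has this field") can be approximated by post-filtering the output of `journalctl -o json`, which prints one JSON object per entry. The following is only a sketch under that assumption: the function name `filter_entries`, its parameters and the sample entries are hypothetical, and since it reads every entry it is a full scan and gains nothing from the journal indexes.

```python
import json
import re


def _text(entry, field):
    """Return the field value if it is a plain string, else None."""
    value = entry.get(field)
    return value if isinstance(value, str) else None


def filter_entries(lines, include=None, exclude=None, has_field=None):
    """Yield parsed entries from lines shaped like `journalctl -o json` output.

    include / exclude: dicts mapping a field name to a regular expression.
    has_field: a field name that must be present in the entry.
    """
    include = {k: re.compile(v) for k, v in (include or {}).items()}
    exclude = {k: re.compile(v) for k, v in (exclude or {}).items()}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if has_field is not None and has_field not in entry:
            continue
        # Every "include" pattern must match its (string) field.
        if any(
            _text(entry, f) is None or not rx.search(_text(entry, f))
            for f, rx in include.items()
        ):
            continue
        # No "exclude" pattern may match its field.
        if any(
            _text(entry, f) is not None and rx.search(_text(entry, f))
            for f, rx in exclude.items()
        ):
            continue
        yield entry


# Made-up sample lines standing in for real journal output.
sample = [
    '{"_SYSTEMD_UNIT": "myapp.service", "MESSAGE": "host alpha has closed connection 17"}',
    '{"_SYSTEMD_UNIT": "noisy.service", "MESSAGE": "host beta has closed connection 18"}',
    '{"_SYSTEMD_UNIT": "myapp.service", "MESSAGE": "started"}',
]

matches = list(
    filter_entries(
        sample,
        include={"MESSAGE": r"^host \S+ has closed connection \d+$"},
        exclude={"_SYSTEMD_UNIT": r"^noisy\.service$"},
    )
)
for entry in matches:
    print(entry["MESSAGE"])
```

In real use the `lines` argument could be `sys.stdin` with the journal piped in. An exact KEY=VALUE match given to journalctl itself would still narrow the input first, so the slow regular-expression pass only runs over the entries that survive the indexed match.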
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5