[systemd-devel] Python journal reader

Steven Hiscocks steven-systemd at hiscocks.me.uk
Sun Apr 14 07:28:31 PDT 2013


On 14/04/13 03:36, David Strauss wrote:
> I keep writing lengthy emails about how we can use this as an
> opportunity to reduce redundancy and improve consistency, but I should
> probably ping you on #systemd IRC to hash it out. I can't think of
> anything elegant that doesn't involve altering the existing journal.py
> or _reader.c code.
>
> What's your handle? Mine's davidstrauss.
>
> As a starter, I want to enumerate the places where
> __REALTIME_TIMESTAMP has special handling:
>   (1) Existing code: A C-based member function in _reader.c.
>   (2) Existing code: A native Python conversion function in journal.py.
>   (3) Existing code: An entry in DEFAULT_CONVERTERS.
>   (4) Added in your patch: A key setting in get_next().
>   (5) Added in your patch: A Python-based member function in journal.py
> that overrides (1).
>   (6) Added in your patch: A condition in get() that invokes (1) when
> the specific field is requested.
>
> While I want to fix this bug, I also don't want six scattered special
> cases for __REALTIME_TIMESTAMP and similar fields.
>

Thanks for the chat earlier on IRC David, especially given your 
localtime :-).

I thought about the points discussed: keeping things similar to 
journalctl -o json output (i.e. include special fields); get, get_next, 
and get_realtime, etc methods create duplication and multiple ways to 
get the same/some values (could cause confusion); issue with `next` 
clashing with python2 __iter__ next; trying to stabilise the interface 
(note: not declared stable yet); performance issues with unnecessary 
dictionary field conversions.

I've therefore created another patch for consideration:
https://gist.github.com/kwirk/5382783

I've removed the get_next and get_previous methods from _Reader, as 
these didn't align with the C API. This is replaced with:`_get_all` 
private method (effectively a wrapper around C API macro 
sd_journal_{enumerate,restart}_data/SD_JOURNAL_FOREACH_DATA); renamed 
`next` method to `_next`, such to avoid name clash (and I believe should 
be private); `_previous` method for consistency.

The `get_next` and `get_previous` methods are now in Reader only. These 
call `_next` method, followed by `_get_all` and then fetch the special 
fields, before passing whole dictionary to converters. This makes all 
the traversal, and getting standard and special fields from the journal 
transparent to the user.

With the changes of removing `get_next` in _Reader, the iter methods 
don't function as you'd expect. To that end, I've moved them to Reader 
with get_next as iter next call as before.

I understand the points about performance, but don't think the 
performance hit is that bad, or critical in most use cases. These 
changes should allow a custom class to inherit _Reader, and the create a 
more optimised version if that is required. (One option if this is a 
major issue, is a custom dictionary class could be returned by 
`get_next`, which uses the __get__ method to lazy convert the fields on 
access. This is starting to get complicated thought...)

Hopefully I've covered everything. Feedback welcomed :-)
-- 
Steven Hiscocks


More information about the systemd-devel mailing list