[systemd-devel] Fragile journal interleaving

Uoti Urpala uoti.urpala at pp1.inet.fi
Tue Dec 12 19:00:31 UTC 2017


On Tue, 2017-12-12 at 17:09 +0100, Lennart Poettering wrote:
> On Mo, 11.12.17 00:36, Uoti Urpala (uoti.urpala at pp1.inet.fi) wrote:
> > consider a clear bug: there's code in next_beyond_location() which
> > skips the next entry in a file if it's not in the expected direction
> > from the previous globally iterated entry, and this can discard valid
> > entries. A comment there says it's meant to discard duplicate entries
> > which were somehow recorded in multiple journal files (which I'd assume
> > to compare equal), but it also discards non-duplicate entries which
> > compare backwards from the previously shown one.
> 
> Note that two entries will only compare as fully identical if their
> "xor_hash" is equal too. The xor_hash is the XOR combination of the
> hashes of all of the entry's fields. That means realistically only
> records that actually are identical should be considered as such.

I assume that would be suitable for handling the case of actual
duplicates? How do such duplicates end up in multiple journal files in
the first place anyway?
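
For the sake of discussion, this is roughly how I read "fully
identical" - an illustrative sketch with made-up names and a
simplified entry location struct, not the actual sd-journal internals:

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct EntryLocation {
        uint8_t  seqnum_id[16];  /* per-stream sequence number ID */
        uint64_t seqnum;
        uint8_t  boot_id[16];
        uint64_t monotonic;
        uint64_t realtime;
        uint64_t xor_hash;       /* XOR of the hashes of all the entry's fields */
} EntryLocation;

/* Treat two entries as the same record only if their boot ID, both
 * timestamps and the xor_hash over all fields all match. */
static bool entry_fully_identical(const EntryLocation *a, const EntryLocation *b) {
        return memcmp(a->boot_id, b->boot_id, 16) == 0 &&
               a->monotonic == b->monotonic &&
               a->realtime == b->realtime &&
               a->xor_hash == b->xor_hash;
}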

 
> > code searching for earlier entries first found the early system
> > journal, then moved to the user journal because it had smaller seqnums,
> > and finally moved to some other file that started 5 days before the end
> > (earlier than the user journal and with a different seqnum id - there
> > didn't happen to be other earlier files later in the iteration order).
> > After printing one entry from there, the next_beyond_location()
> > "duplicate" code then discarded all the earlier valid entries.
> > 
> > I'm not sure about the best way to fix these issues - are there
> > supposed to be any guarantees about interleaving? At the very least the
> > "duplicate" check should be fixed to not throw away arbitrary amounts
> > of valid entries. Any other code relying on assumptions about valid
> > ordering? Is the interleaving order supposed to be stable independent
> > of things like in which order the files are iterated over?
> 
> So, the current "duplicate" check is not just that actually, it's also
> a check that guarantees that we progress monotonically, and never go
> backwards in time, even when journal files are added or removed while
> we iterate through the files.

If journal files can be added while iterating, I assume that could be
handled by special code run when such an event happens. IMO it's
important to have a basic guarantee that all entries in the journal
files will be shown at least in _some_ order when you try to iterate
through them all, such as with plain "journalctl". And thus the code
should make sure not to discard any entry unless it's confirmed to be a
full duplicate.
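
In other words, the skip decision in next_beyond_location() would
reduce to something like this - a simplified sketch reusing the
hypothetical entry_fully_identical() from above, not a patch against
the real code:

/* Drop a candidate entry only when it is a confirmed full duplicate of
 * the entry we last returned, never merely because it compares
 * "backwards" against it. */
static bool should_skip_candidate(const EntryLocation *candidate,
                                  const EntryLocation *last_returned,
                                  bool have_last_returned) {
        if (!have_last_returned)
                return false;    /* nothing returned yet: always keep */
        return entry_fully_identical(candidate, last_returned);
}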


> I am not entirely sure what we can do here. Maybe this can work: beef
> up the comparison logic so that it returns more than
> smaller/equal/larger but also a special value "ambiguous". And when
> that is returned we don't enforce monotonicity strictly but instead go
> record-by-record, if you follow what I mean?

I don't see how that would help, at least not without some extra
assumptions or changes. In my example problem case above, the ambiguous
comparisons happen when deciding which file to get the first entry
from. There's no natural default "first file", so even if you only know
the comparison is ambiguous, you have to pick one anyway. If you pick
the one the current code picks, the subsequent discard check is not
ambiguous - it's discarding entries with an earlier realtime and
non-comparable other values. Or do you mean that if an ambiguous
comparison was EVER seen, monotonicity would be permanently disabled?
I don't really see an advantage in that over just not enforcing
monotonicity at all, and handling any added-file special cases
separately.
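
To make the "ambiguous" idea concrete, this is how I understand the
proposed comparison - again just an illustrative sketch using the
struct from above:

typedef enum CompareResult {
        COMPARE_SMALLER,
        COMPARE_EQUAL,
        COMPARE_LARGER,
        COMPARE_AMBIGUOUS,  /* different seqnum stream and different boot */
} CompareResult;

static CompareResult compare_entries(const EntryLocation *a, const EntryLocation *b) {
        /* Same sequence number stream: seqnums give a definite order. */
        if (memcmp(a->seqnum_id, b->seqnum_id, 16) == 0)
                return a->seqnum < b->seqnum ? COMPARE_SMALLER :
                       a->seqnum > b->seqnum ? COMPARE_LARGER : COMPARE_EQUAL;

        /* Same boot: monotonic timestamps give a definite order. */
        if (memcmp(a->boot_id, b->boot_id, 16) == 0)
                return a->monotonic < b->monotonic ? COMPARE_SMALLER :
                       a->monotonic > b->monotonic ? COMPARE_LARGER : COMPARE_EQUAL;

        /* Only wall-clock time is left, and it may have jumped:
         * no reliable order, so report the comparison as ambiguous. */
        return COMPARE_AMBIGUOUS;
}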


