[systemd-devel] Fragile journal interleaving

Uoti Urpala uoti.urpala at pp1.inet.fi
Tue Dec 12 22:43:53 UTC 2017


On Tue, 2017-12-12 at 21:38 +0100, Lennart Poettering wrote:
> Maybe the approach needs to be that we immediately increase the read
> record ptr of a specific file by one when we read it, so that we know
> we monotonically progress through the file. And then change the logic
> that looks for the next entry across all files to honour that, and
> then simply skip over fully identical entries, but not insist on
> monotonic timers otherwise.

Isn't this pretty much what the code already does (except for the
"fully identical entries" part)? The next_beyond_location() function
already gives the next entry from a given file, and seems to use the
internal order of the file when doing that. So the only change would be
making the duplicate check only discard actually identical entries. And
now that I checked the surrounding code, it looks like, even in the
new-file-added case you mentioned, next_beyond_location() would call
find_location_with_matches(), which should seek the new file to the
correct position; so I don't see why the monotonicity discard would be
needed in that case either.
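To make the per-file cursor model concrete, here is a minimal sketch
(purely illustrative; the names mirror sd-journal's
next_beyond_location(), but the behavior modeled here is my reading of
it, not the real code): each file is advanced monotonically through its
own internal order by a per-file cursor, the global step picks the
smallest candidate across files, and only fully identical entries are
skipped.

```python
def interleave(files):
    """Merge entries from several 'journal files', each read strictly in
    its internal order via a per-file cursor, discarding only entries
    fully identical to the one just emitted."""
    cursors = [0] * len(files)
    out, last = [], None
    while True:
        # Candidate = the next unread entry of each file, in file order
        # (this models what next_beyond_location() hands back per file).
        cands = [(f[cursors[i]], i)
                 for i, f in enumerate(files) if cursors[i] < len(f)]
        if not cands:
            return out
        entry, i = min(cands)   # globally next entry across all files
        cursors[i] += 1         # the file's read pointer only moves forward
        if entry == last:       # skip fully identical duplicates only
            continue
        out.append(entry)
        last = entry

# Entries duplicated across files appear once in the merged output:
# interleave([[1, 2, 4], [2, 3]]) -> [1, 2, 3, 4]
```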


> With that approach we can be sure that we never enter a loop, and
> we'll mostly filter out duplicates (well, except if duplicate entries
> show up in multiple files in different orders, but that's OK then,
> dunno).

You could probably make the code loop by removing and re-adding files
with the right timing, but that's not a very realistic case. So unless
there's some other part of the code I've missed, it looks like just
changing the discard to check for equality rather than non-monotonicity
would be an improvement. That would still leave issues such as the
order being "non-deterministic" (it depends, for example, on the
directory read order of the files).
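The difference between the two discard policies only shows up when a
file is internally non-monotonic (e.g. the clock jumped backwards). A
hedged sketch of both policies side by side (again a toy model, not the
actual journal code; "monotonic" drops anything not newer than the last
emitted entry, "equal" drops only exact duplicates):

```python
def merge(files, discard):
    """Toy interleaver with a pluggable discard policy:
    'monotonic' - drop entries <= the last emitted one (current behavior)
    'equal'     - drop only entries identical to the last emitted one."""
    cursors = [0] * len(files)
    out, last = [], None
    while True:
        cands = [(f[cursors[i]], i)
                 for i, f in enumerate(files) if cursors[i] < len(f)]
        if not cands:
            return out
        entry, i = min(cands)
        cursors[i] += 1
        if discard == "monotonic" and last is not None and entry <= last:
            continue            # non-monotonic entry silently dropped
        if discard == "equal" and entry == last:
            continue            # only true duplicates dropped
        out.append(entry)
        last = entry

# File A's timestamps go backwards after its first entry:
# merge([[5, 2, 3], [1]], "monotonic") -> [1, 5]        (entries lost)
# merge([[5, 2, 3], [1]], "equal")     -> [1, 5, 2, 3]  (all preserved)
```

The monotonicity discard silently loses real entries here, which is the
improvement the equality-only check would bring.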
