[systemd-devel] [PATCH v2] journalctl: Improve boot ID lookup

Jan Janssen medhefgo at web.de
Sat Apr 25 06:51:09 PDT 2015


On 2015-04-08 16:14, Jan Janssen wrote:
>
>
> On 2015-04-08 14:39, Lennart Poettering wrote:
>> On Thu, 02.04.15 17:08, Jan Janssen (medhefgo at web.de) wrote:
>>
>>> This method should greatly improve offset based lookup. We now don't
>>> have to aggregate the full boot listing just so we can jump to a
>>> specific position, which can be a real pain on big journals just for
>>> a mere "-b -1" case.
>>>
>>> As an additional benefit --list-boots should improve slightly too,
>>> because we now need to do less seeking.
>>>
>>> Note that there can be a change in boot order with this lookup method
>>> because it will use the order of boots in the journal, not the
>>> realtime stamp stored in them. That's arguably better, though.
>>>
>>> https://bugs.freedesktop.org/show_bug.cgi?id=72601
>>> ---
>>> Hi,
>>>
>>> today I realized that it would be nice if we could do without the
>>> cursor seeking. Turns out we can! I could swear I had tested that
>>> sd_journal_flush_matches() resets our position in the journal. But it
>>> seems that sd_journal_next/previous will advance just fine from the
>>> last position we were in, even after a flush.
>>>
>>> Though, I would still like someone with better knowledge of the
>>> journal internals to confirm that this is how it's supposed to work.
>>>
>>> Some testing/timing from people other than me would be nice too.
>>
>> Hmm, the patch is hard to read, can you explain what precisely the new
>> algorithm is you propose?
>>
>> Lennart
>>
>
> Yeah, patches like these always do end up looking messy. It's much
> easier to read after applying it.
>
> Well, it jumps from one boot to the next boot using _BOOT_ID matches. It
> starts at the journal head to get the boot ID, adds a _BOOT_ID match
> and then comes from the opposite journal direction (tail) to get the end
> of that boot. It then flushes the matches and advances the journal from
> that exact position one entry further (which gives us the start and ID
> of the next boot). Rinse and repeat.
> Note, v1 differs in that it assumes sd_journal_flush_matches() will also
> reset the position we are at in the journal at that moment. That version
> worked around that by saving a cursor and seeking to it after flushing.
> Hence why I wonder if this behavior of flush_matches is expected/desired
> or not.
>
> This is much faster for relative boot ID lookups, for the very reason
> that you don't have to look at all boots. Though, it does assume that
> boots (IDs) do not interleave (constellations like "A B A C" cannot
> happen), which afaik would be satisfied on single host machines.
>
> Later, after sending this patch, I realized that it could probably break
> on journals with more than one machine ID, since then boot IDs can
> interleave due to the machines running in parallel, breaking an important
> assumption. Though, I *should* be able to fix that by adding some
> _MACHINE_ID matches in the mix.
>
> Adding machine ID matches would make --list-boots behavior differ quite
> a lot. For one, with this approach, there isn't any global ordering of
> boots across machine IDs. Personally, I find such a global ordering
> (although you can define it as *a* valid ordering) to be useless. Doing
> a "journalctl -b bootID-1" match, for example, should use that bootID's
> machine ID to get to the previous boot (of that machine). Right now it
> can get you any bootID from any other machine, so long as it was booted
> right before it.
>
> So yeah, I will make this patch work for journals with more than one
> machine ID if this approach is desired.
>
> Jan

I gave this another look today. Since journalctl uses
SD_JOURNAL_LOCAL_ONLY by default, the new algorithm cannot trip up on
interleaving boot IDs (since they shouldn't be interleaving in that
case, per the above assumption). Same goes for --machine mode. The
--file, --directory and --merge modes, on the other hand, do confuse the
new algorithm.
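
To make the interleaving problem concrete, here is a toy Python model of
the jump algorithm described above. It is only a sketch under stated
assumptions: the journal is reduced to a plain list of boot IDs in
journal order, list lookups stand in for the _BOOT_ID match plus the
seek to tail/head, and the function names are made up for illustration.
It is not the code from the patch.

```python
def boot_ids_in_journal_order(entries):
    """Walk forward from the head, visiting one boot per iteration.

    `entries` is a hypothetical stand-in for the journal: a list holding
    the boot ID of every entry, in journal order.
    """
    boots = []
    pos = 0
    while pos < len(entries):
        boot_id = entries[pos]
        # Stands in for: add a _BOOT_ID match, seek to the tail and step
        # back once, which lands on the last entry carrying this boot ID.
        last = len(entries) - 1 - entries[::-1].index(boot_id)
        boots.append(boot_id)
        # Stands in for: flush the matches and advance one entry further.
        pos = last + 1
    return boots


def previous_boot(entries, steps):
    """Walk backward from the tail; steps=1 models the "-b -1" case.

    Only steps + 1 boots are visited, not the whole boot listing.
    Returns None if the journal holds fewer boots than requested.
    """
    pos = len(entries) - 1
    boot_id = None
    for _ in range(steps + 1):
        if pos < 0:
            return None
        boot_id = entries[pos]
        # Mirror image of the forward walk: find the first entry of this
        # boot, then step to the entry just before it.
        pos = entries.index(boot_id) - 1
    return boot_id


# Boots that do not interleave: every boot is found.
print(boot_ids_in_journal_order(["A", "A", "B", "B", "C"]))  # ['A', 'B', 'C']
print(previous_boot(["A", "A", "B", "B", "C"], 1))           # B

# The "A B A C" constellation: boot B is skipped entirely.
print(boot_ids_in_journal_order(["A", "B", "A", "C"]))       # ['A', 'C']
```

In the interleaved case the jump to the last "A" entry lands past the
only "B" entry, so B is never seen. That is the kind of confusion the
merged modes can cause.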

But I think it might be worth it to go with my suggestion above (adding
_MACHINE_ID matches) if that'll be accepted. Alternatively, we could
either refuse --boot and --list-boots in those cases, or ship the old
algorithm along with the new one and use the old one in those cases
where the faster one gets confused.
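
As a rough illustration of the machine ID idea, here is one possible
reading of it as a toy Python model. Everything in it is an assumption
made for illustration: the journal is a list of (machine ID, boot ID)
pairs in journal order, the function name is invented, and skipping
entries of other machines stands in for keeping a _MACHINE_ID match
applied while walking. It is not what the patch currently does.

```python
def boots_for_machine(entries, machine_id):
    """List the boots of one machine from a merged journal.

    `entries` is a hypothetical stand-in for the journal: a list of
    (machine_id, boot_id) pairs, in journal order.
    """
    boots = []
    pos = 0
    while pos < len(entries):
        machine, boot_id = entries[pos]
        if machine != machine_id:
            # Models the _MACHINE_ID match: entries from other machines
            # are simply not visible to the walk.
            pos += 1
            continue
        # Models: add a _BOOT_ID match as well and come in from the tail
        # to find the last entry of this boot on this machine.
        last = max(i for i, e in enumerate(entries) if e == (machine, boot_id))
        boots.append(boot_id)
        # Models: drop the _BOOT_ID match and advance one entry further.
        pos = last + 1
    return boots


# Two machines running in parallel, so their boots interleave.
merged = [("m1", "A"), ("m2", "X"), ("m1", "A"),
          ("m2", "X"), ("m1", "B"), ("m2", "Y")]
print(boots_for_machine(merged, "m1"))  # ['A', 'B']
print(boots_for_machine(merged, "m2"))  # ['X', 'Y']
```

Each machine gets its own boot ordering here, and there is no global
ordering across machines, which is the --list-boots behavior change
discussed above.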

Or we stick with the status quo and don't improve on the algorithm
at all. I'd like to know which option to go with, to ease my mind...

Jan

