[systemd-devel] Hashmap iteration is too costly

Mon Jul 17 17:33:33 UTC 2017

On Mon, Jul 17, 2017 at 10:12:12AM +0200, Lennart Poettering wrote:
> On Fri, 14.07.17 03:18, vcaputo at pengaru.com (vcaputo at pengaru.com) wrote:
> 
> > The current hashmap iteration incurs at least one function call per
> > iteration, and in my observations using gcc 6.3 w/-g -O2, it's two:
> > 
> >  internal_hashmap_iterate()
> >  hashmap_iterate_entry()
> > 
> > for every element in the hashmap.
> > 
> > In profiles of `journalctl -b --no-pager` having multiple (50) log files,
> > the hashmap iteration shows up as ~10% of the CPU in callgrind
> > output.
> 
> I wonder if it wouldn't be easiest to simply bypass hashmap native
> iteration for this case. i.e. add a linked list for the open journal
> files which we can easily traverse in O(1) time?
> 

What part of hashmap_iterate_in_insertion_order() grows in complexity with
increasing N?

> The hashmap implementation has been optimized to require very little
> memory, because we have so many hashmap/sets around in PID 1. It's not
> optimized to make iteration fast though.
> 

I don't believe these are mutually exclusive.

Regards,
Vito Caputo