[systemd-devel] How to debug occasional hashmap corruption?

Wed Nov 14 10:43:28 UTC 2018

juice wrote on 2018-11-06 14:30:
> Lennart Poettering wrote on 2018-11-06 12:27:
>> On Tue, 06.11.18 11:57, juice (juice at swagman.org) wrote:
>> 
>>> 
>>> Hi,
>>> 
>>> During the past half year I have seen systemd dump core three times due
>>> to what I suspect is a hashmap corruption or race.
>>> Each time it looks a bit different and is triggered by different things,
>>> but it somehow centers on hashmap operations.
>>> 
>>> What would be the preferred way to debug this? I cannot add extensive
>>> logging, as this is something that happens once in a blue moon and always
>>> on different compute nodes.
>>> Is there some way I could easily test it by increasing the chance of such
>>> corruption/race happening?
>> 
>> This looks very much like a memory corruption of some sort, and
>> valgrind should be the tool of choice to track that down.
>> 
>> Lennart
> 
> Thanks for the prompt reply, Lennart.
> 
> I agree; using valgrind was indeed something already considered. However, I
> suspect it might add some overhead to systemd's operation?

I have been trying to start systemd under valgrind, but it seems this is not
a trivial task. Moreover, searching has not turned up a general recipe for
doing that, other than the advice in the systemd README to compile with the
-Dvalgrind=true option.

So, where could I find information on how to set up memory corruption
debugging on a live system for testing?

  - juice -