[systemd-devel] Support for large applications
Avi Kivity
avi at scylladb.com
Wed Feb 17 18:47:33 UTC 2016
On 02/17/2016 03:56 PM, Zbigniew Jędrzejewski-Szmek wrote:
> On Wed, Feb 17, 2016 at 02:35:55PM +0200, Avi Kivity wrote:
>> We are using systemd to supervise our NoSQL database and are
>> generally happy.
>>
>> A few things will help even more:
>>
>> 1. log core dumps immediately rather than after the dump completes
>>
>> A database will often consume all memory on the machine; dumping
>> 120GB can take a lot of time, especially if compression is enabled.
>> As the situation is now, there is a period of time where it is
>> impossible to know what is happening.
>>
>> (I saw that 229 improves core dumps, but did not see this specifically)
> The coredump is logged afterwards because that's the only way to
> include all information (including the compressed file name) in one
> log message.
Maybe we could log two messages when we detect that the core is very
large, or that storing it will take more than a couple of seconds.
> But there are two changes which might mitigate the problem:
> - semi-recently we switched to lz4, which compresses significantly faster,
> have you tried that?
I don't think I have yet. But consider that memory sizes are growing
rapidly (e.g. byte-addressable non-volatile memory) and core counts are
large; I don't think improvements in compression speed can keep up with
that.
>
> - recently the responsibility of writing core dumps was split out to
> a service. I'm not sure how that influences the time when the log
> message is written.
I'll try it out; may take some time because I don't want to upgrade my
large machines for F24 yet.
By the way, I hope that with this change the service is only restarted
after the dump is complete; otherwise an OOM is likely.
>
>> 2. parallel compression of core dumps
>>
>> As well as consuming all of memory, we also consume all cpus. Once
>> we dump core we may as well use those cores for compressing the huge
>> dump.
> This should be implemented in the compression library. The compressor
> does not seem to be threaded, but if it were we would try to make use of it.
> OTOH, single-threaded lz4 is able to produce ~500MB/s of compressed
> output, so you'd need a really fast disk to go above that.
I happen to have a really fast disk, reaching 4X that, and this is
common for database users.
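(For illustration only, this is not how systemd-coredump is implemented:
one way to parallelize compression outside the compression library is to
split the core image into fixed-size chunks and compress them on all CPUs.
The sketch below uses Python with zlib as a stand-in for lz4, since zlib
ships with the standard library; the chunking and fan-out pattern is the
point, not the codec.)

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per independently-compressed chunk


def compress_chunk(chunk: bytes) -> bytes:
    # Fast compression level, analogous in spirit to lz4's speed/ratio choice.
    return zlib.compress(chunk, 1)


def parallel_compress(data: bytes) -> list[bytes]:
    """Split data into chunks and compress them concurrently.

    zlib releases the GIL while compressing, so threads give real
    parallelism here without needing worker processes.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(compress_chunk, chunks))


def parallel_decompress(frames: list[bytes]) -> bytes:
    # Chunks are independent, so decompression is just per-frame inflate.
    return b"".join(zlib.decompress(f) for f in frames)
```

The cost of chunking is a slightly worse compression ratio (each chunk
starts with an empty dictionary), which is usually an acceptable trade
for using otherwise-idle cores.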
>
>> 3. watchdog during startup
>>
>> Sometimes we need to perform expensive operations during startup
>> (log replay, rebuild from network replica) before we can start
>> serving. Rather than configure a huge start timeout, I'd prefer to
>> have the service report progress to systemd so that it knows that
>> startup is still in progress.
> Zbyszek
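(For reference on item 3: the sd_notify protocol already provides the
channel for this kind of progress reporting. A service started with
Type=notify and WatchdogSec= can send datagrams to the socket named in
$NOTIFY_SOCKET, e.g. STATUS= strings during log replay and WATCHDOG=1
pings to keep the watchdog happy. A minimal sketch of the sender side,
without libsystemd; the status strings are illustrative.)

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string (e.g. 'READY=1', 'STATUS=...', 'WATCHDOG=1')
    to the systemd notification socket. Returns False when not running
    under systemd (NOTIFY_SOCKET unset), in which case it is a no-op."""
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        # Leading '@' denotes a Linux abstract-namespace socket.
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

During a long startup phase a service could then report, say,
sd_notify("STATUS=replaying commit log, 42% done") periodically, and
sd_notify("READY=1") once it is actually serving.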