[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork
vcaputo at pengaru.com
vcaputo at pengaru.com
Fri Jul 7 20:14:39 UTC 2017
In doing some casual journalctl profiling and stracing, it became apparent
that `journalctl -b --no-pager` runs across a significant quantity of logs,
~10% of the time was thrown away on getpid() calls due to commmit a65f06b.
As-is:
# time ./journalctl -b --no-pager > /dev/null
real 0m11.033s
user 0m10.084s
sys 0m0.943s
After changing journal_pid_changed() to simply return 1:
# time ./journalctl -b --no-pager > /dev/null
real 0m9.641s
user 0m9.449s
sys 0m0.191s
More system time is being expended in repeated getpid() calls than write(),
see the strace:
12:51:56.939287 write(1, "Jul 07 09:25:23 x61s unknown"..., 57) = 57 <0.001276>
12:51:56.940633 getpid() = 10713 <0.000032>
12:51:56.940732 getpid() = 10713 <0.000012>
12:51:56.940801 getpid() = 10713 <0.000032>
12:51:56.940867 getpid() = 10713 <0.000041>
12:51:56.940942 getpid() = 10713 <0.000041>
12:51:56.941047 getpid() = 10713 <0.000012>
12:51:56.941117 getpid() = 10713 <0.000012>
12:51:56.941185 getpid() = 10713 <0.000011>
12:51:56.941253 getpid() = 10713 <0.000011>
12:51:56.941320 getpid() = 10713 <0.000039>
12:51:56.941395 getpid() = 10713 <0.000041>
12:51:56.941494 getpid() = 10713 <0.000011>
12:51:56.941561 getpid() = 10713 <0.000012>
12:51:56.941629 getpid() = 10713 <0.000039>
12:51:56.942942 write(1, "Jul 07 09:25:23 x61s unknown"..., 57) = 57 <0.000058>
12:51:56.943052 getpid() = 10713 <0.000039>
12:51:56.943156 getpid() = 10713 <0.000017>
12:51:56.943230 getpid() = 10713 <0.000018>
12:51:56.943305 getpid() = 10713 <0.000012>
12:51:56.943374 getpid() = 10713 <0.000017>
12:51:56.943449 getpid() = 10713 <0.000011>
12:51:56.943517 getpid() = 10713 <0.000011>
12:51:56.943585 getpid() = 10713 <0.000011>
12:51:56.943652 getpid() = 10713 <0.000011>
12:51:56.943721 getpid() = 10713 <0.000030>
12:51:56.943796 getpid() = 10713 <0.000041>
12:51:56.943870 getpid() = 10713 <0.000041>
12:51:56.943944 getpid() = 10713 <0.000041>
12:51:56.944061 getpid() = 10713 <0.001334>
12:51:56.945459 write(1, "Jul 07 09:25:23 x61s unknown"..., 56) = 56 <0.000018>
12:51:56.945544 getpid() = 10713 <0.000017>
12:51:56.945620 getpid() = 10713 <0.000017>
12:51:56.945694 getpid() = 10713 <0.000012>
12:51:56.945763 getpid() = 10713 <0.000011>
12:51:56.945832 getpid() = 10713 <0.000012>
12:51:56.945901 getpid() = 10713 <0.000011>
12:51:56.945969 getpid() = 10713 <0.000011>
12:51:56.946048 getpid() = 10713 <0.000013>
12:51:56.946118 getpid() = 10713 <0.000024>
12:51:56.946188 getpid() = 10713 <0.000047>
12:51:56.946277 getpid() = 10713 <0.000041>
12:51:56.946353 getpid() = 10713 <0.000041>
12:51:56.946428 getpid() = 10713 <0.000040>
12:51:56.946539 getpid() = 10713 <0.001363>
As this is public sd-journal API, it's somewhat set in stone. However,
there's nothing preventing the systemd-internal tooling from linking with
a less defensive/conformant underlying implementation shared with the public
API implementation where these kinds of overheads can be preserved.
For the curious; the logs being processed for this boot are 48 * 8MiB on SSD,
1.8Ghz Core2 Duo, 4.12 kernel.
Regards,
Vito Caputo
More information about the systemd-devel
mailing list