[systemd-devel] Antw: [EXT] Re: Output from `tee' is not showing up in system journal consistently on some systems
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Thu Oct 28 07:35:44 UTC 2021
Another random idea: did you experiment with the tee option "--output-error=..."?
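For instance (an untested sketch; --output-error requires GNU coreutils tee 8.24 or newer, and the documented modes are warn, warn-nopipe, exit, exit-nopipe):

```shell
# Hypothetical variant of the tee call from the test script: ask tee to
# diagnose write errors on any output instead of silently continuing.
echo "i shall eat 1 slices of pizza" | tee --output-error=warn -a /tmp/testtee.log
```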
>>> Mitchel Humpherys <mitch.special at gmail.com> wrote on 27.10.2021 at 18:50
in message <CAOFALwKBop=CA9rvYkUVO-WRbKwkRsarjPPur+EMybGWh57C7w at mail.gmail.com>:
> On Wed, Oct 27, 2021 at 1:52 AM Colin Guthrie <gmane at colin.guthr.ie> wrote:
>>
>> Mitchel Humpherys wrote on 26/10/2021 21:16:
>> > On my Manjaro and Ubuntu systems I'm not seeing all of the "pizza" lines
>> > (logs pasted here [3]). On Debian I see the "pizza" and "pie" lines
>> > interleaved as expected. It also works as expected when I run the script
>> > without systemd on Manjaro.
>> >
>> > Any ideas what's going on here? Why am I not seeing all of the output
>> > from `tee' on some systems?
>> >
>> > [2] https://github.com/mgalgs/systemd-tee-output-issue
>>
>> Looking at the output there, the only thing I can see that's slightly
>> odd is that the "pie" output comes from the service's main pid (as
>> expected) but as tee is forked off the output there that *is* captured
>> is from a different pid each time.
>>
>> So I wonder if this is some kind of cgroup race thing? I.e. the forked
>> off process is not (always) detected as being part of the unit in
>> question and thus isn't logged against the unit? If this is the case, is
>> there a bash trick you can use to buffer the output back to the parent
>> process - e.g. just surrounding the command in parentheses or something?
>> (I'm not a bash expert so no idea if this would work as a workaround!)
>> I appreciate any code change to work around this isn't ideal anyway, but
>> I'm just trying to help debug the underlying problem.
>>
>> This is just a random guess and might not at all be how the logging or
>> cgroup handling works!
>
> Interesting thought! I think it's totally expected for `tee' to have a
> different PID. Your comment got me curious about the cgroup situation
> in the working vs. non-working scenarios, but it does appear that the
> `tee' processes are appropriately cgroup'd. I'm pasting some mid-tee
> `status' output below (captured with `while :; do systemctl --user
> status tee_test.service; done > /tmp/statuses.txt' and some grepping).
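(As a sketch of that kind of check: journald attributes a line to a unit based on the sending pid's cgroup, which can be read straight out of /proc. The throwaway tee below, kept alive by the sleep, is purely for illustration - there is no real unit behind it.)

```shell
# Start a short-lived tee so there is something to inspect:
sleep 5 | tee -a /tmp/testtee.log &
sleep 1   # let the pipeline children exec

# Find it and print the cgroup path journald would see for it
# (the pgrep pattern is specific to this test setup):
TEE_PID=$(pgrep -f 'tee -a /tmp/testtee.log' | head -n1)
cat "/proc/$TEE_PID/cgroup"
```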
>
> First, for reference, here's a mid-tee `status' from my Debian 11
> environment where logging is working as expected:
>
> ● tee_test.service - Test teeing under systemd
>      Loaded: loaded (/home/mgalgs/.config/systemd/user/tee_test.service; disabled; vendor preset: enabled)
>      Active: active (running) since Tue 2021-10-26 11:44:07 PDT; 21h ago
>    Main PID: 487 (tee_test.sh)
>       Tasks: 2 (limit: 4679)
>      Memory: 3.2M
>         CPU: 5min 49.049s
>      CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/tee_test.service
>              ├─   487 /bin/bash /usr/local/bin/tee_test.sh
>              └─229327 tee -a /tmp/testtee.log
>
> Oct 27 08:57:46 pi-hole-test tee_test.sh[229282]: i shall eat 75937 slices of pizza
> Oct 27 08:57:46 pi-hole-test tee_test.sh[487]: i shall eat 75937 slices of pie
> Oct 27 08:57:47 pi-hole-test tee_test.sh[229293]: i shall eat 75938 slices of pizza
> Oct 27 08:57:47 pi-hole-test tee_test.sh[487]: i shall eat 75938 slices of pie
> Oct 27 08:57:48 pi-hole-test tee_test.sh[229304]: i shall eat 75939 slices of pizza
> Oct 27 08:57:48 pi-hole-test tee_test.sh[487]: i shall eat 75939 slices of pie
> Oct 27 08:57:49 pi-hole-test tee_test.sh[229315]: i shall eat 75940 slices of pizza
> Oct 27 08:57:49 pi-hole-test tee_test.sh[487]: i shall eat 75940 slices of pie
> Oct 27 08:57:50 pi-hole-test tee_test.sh[229327]: i shall eat 75941 slices of pizza
> Oct 27 08:57:50 pi-hole-test tee_test.sh[487]: i shall eat 75941 slices of pie
>
> And here are two `status'es from my Manjaro environment, where logging
> isn't working consistently. In the first case the logs were missing in
> the system journal. The second case was one of the odd cases where the
> logs actually came through.
>
> Case 1 (logs missing):
> Status:
> ● tee_test.service - Test teeing under systemd
>      Loaded: loaded (/home/mgalgs/.config/systemd/user/tee_test.service; disabled; vendor preset: enabled)
>      Active: active (running) since Wed 2021-10-27 08:55:04 PDT; 15min ago
>    Main PID: 3791090 (tee_test.sh)
>       Tasks: 2 (limit: 38124)
>      Memory: 1.0M
>         CPU: 3.874s
>      CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/tee_test.service
>              ├─3791090 /bin/bash /home/mgalgs/src/systemd-tee-output-issue/tee_test.sh
>              └─3841065 tee -a /tmp/testtee.log
>
> Oct 27 09:10:11 grinchel tee_test.sh[3791090]: i shall eat 906 slices of pie
> Oct 27 09:10:12 grinchel tee_test.sh[3791090]: i shall eat 907 slices of pie
> Oct 27 09:10:13 grinchel tee_test.sh[3791090]: i shall eat 908 slices of pie
> Oct 27 09:10:14 grinchel tee_test.sh[3791090]: i shall eat 909 slices of pie
> Oct 27 09:10:15 grinchel tee_test.sh[3791090]: i shall eat 910 slices of pie
> Oct 27 09:10:16 grinchel tee_test.sh[3791090]: i shall eat 911 slices of pie
> Oct 27 09:10:17 grinchel tee_test.sh[3791090]: i shall eat 912 slices of pie
> Oct 27 09:10:18 grinchel tee_test.sh[3791090]: i shall eat 913 slices of pie
> Oct 27 09:10:19 grinchel tee_test.sh[3791090]: i shall eat 914 slices of pie
> Oct 27 09:10:20 grinchel tee_test.sh[3791090]: i shall eat 915 slices of pie
>
>
> Case 2 (logs present):
> Status:
> ● tee_test.service - Test teeing under systemd
>      Loaded: loaded (/home/mgalgs/.config/systemd/user/tee_test.service; disabled; vendor preset: enabled)
>      Active: active (running) since Wed 2021-10-27 08:55:04 PDT; 13min ago
>    Main PID: 3791090 (tee_test.sh)
>       Tasks: 2 (limit: 38124)
>      Memory: 768.0K
>         CPU: 3.541s
>      CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/tee_test.service
>              ├─3791090 /bin/bash /home/mgalgs/src/systemd-tee-output-issue/tee_test.sh
>              ├─3836703 /bin/bash /home/mgalgs/src/systemd-tee-output-issue/tee_test.sh
>              └─3836704 tee -a /tmp/testtee.log
>
> Oct 27 09:08:58 grinchel tee_test.sh[3791090]: i shall eat 833 slices of pie
> Oct 27 09:08:59 grinchel tee_test.sh[3791090]: i shall eat 834 slices of pie
> Oct 27 09:09:00 grinchel tee_test.sh[3836242]: i shall eat 835 slices of pizza
> Oct 27 09:09:00 grinchel tee_test.sh[3791090]: i shall eat 835 slices of pie
> Oct 27 09:09:01 grinchel tee_test.sh[3836366]: i shall eat 836 slices of pizza
> Oct 27 09:09:01 grinchel tee_test.sh[3791090]: i shall eat 836 slices of pie
> Oct 27 09:09:02 grinchel tee_test.sh[3791090]: i shall eat 837 slices of pie
> Oct 27 09:09:03 grinchel tee_test.sh[3791090]: i shall eat 838 slices of pie
> Oct 27 09:09:04 grinchel tee_test.sh[3836704]: i shall eat 839 slices of pizza
> Oct 27 09:09:04 grinchel tee_test.sh[3791090]: i shall eat 839 slices of pie
>
> The extra /bin/bash process in the working Manjaro case is kind of
> interesting... This must be the pipeline subshell which spawns the `tee'
> (note the PID adjacency), although I'm not quite sure why the process is
> still there after the `tee' is already live (I assumed it would use
> execv* and thus be gone by the time `tee' comes up).
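(A quick way to see those forks: bash runs each side of a pipeline in a forked child, so inside the left-hand side $BASHPID reports the child's pid while $$ still reports the parent shell's. Just an illustration, not the original script.)

```shell
# The two pids printed differ: $$ is the parent shell, $BASHPID is the
# forked child actually running the left side of the pipeline.
bash -c 'echo "shell=$$ left=$BASHPID" | cat'
```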
>
> In any case, I have a hunch that this may have been simply lucky
> timing where I caught the pipe (especially since the working Debian 11
> case doesn't have the additional process), but I haven't managed to
> catch the pipe in any of the non-working cases on Manjaro thus far...
> :thinking_face: