[systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

Colin Guthrie gmane at colin.guthr.ie
Mon Jan 12 13:43:08 PST 2015


Steve Dickson wrote on 12/01/15 20:54:
> Hello
> 
> On 01/12/2015 05:34 AM, Colin Guthrie wrote:
>> Hi,
>>
>> Looking into a thoroughly broken nfs-utils package here I noticed a
>> quirk in systemctl status and in umount behaviour.
>>
>>
>> In latest nfs-utils there is a helper binary shipped upstream called
>> /usr/sbin/start-statd (I'll send a separate mail talking about this
>> infrastructure with subject: "Running system services required for
>> certain filesystems")
>>
>> It sets the PATH to "/sbin:/usr/sbin" then tries to run systemctl
>> (something that is already broken here as systemctl is in bin, not sbin)
>> to start "statd.service" (again this seems to be broken as the unit
>> appears to be called nfs-statd.service upstream... go figure).
                        ^^^ <-- I typo'd... this should be rpc-statd...


> The PATH problem has been fixed in the latest nfs-utils.  

Cool :)

>> Either way we call the service nfs-lock.service here (for legacy reasons).
> With the latest nfs-utils rpc-statd.service is now called from start-statd
> But yes, I did symbolically link nfs-lock.service to rpc-statd.service
> when I moved to the upstream systemd scripts.

No worries. Looks fine now upstream.

But FWIW, your check for whether systemctl is installed via calling
"systemctl --help" is IMO not very neat.

If you're using bash here anyway, you might as well just do a simple
directory check:

if [ -d /sys/fs/cgroup/systemd ]; then

or, if you want to be super sure, check that it's actually mounted:

if mountpoint -q /sys/fs/cgroup/systemd; then

This is a simple trick to detect whether systemd is running. If systemd
is running but systemctl is somehow not found, I'd be pretty surprised
(and your code would continue if the call fails anyway, so this should
be safe).

This avoids one more fork.
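
Putting that together, a rough sketch of what I have in mind (my
paraphrase, not the actual upstream start-statd script):

#!/bin/bash
# If systemd is running, let it start and own rpc.statd via the unit;
# otherwise fall back to running the daemon directly.
if mountpoint -q /sys/fs/cgroup/systemd; then
    systemctl start rpc-statd.service && exit 0
fi
# Non-systemd system (or the systemctl call failed): run it ourselves.
exec rpc.statd --no-notify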

Technically you could avoid calling "systemctl start" by calling
"systemctl is-active" first, but to be honest this isn't really needed.


>> If this command fails (which it does for us for two reasons) it runs
>> "rpc.statd --no-notify" directly. This binary then runs in the context of
>> the .mount unit and thus in the .mount cgroup.
> What are the two reasons rpc.statd --no-notify fails? 

Sorry, perhaps I wasn't clear: it was the call to systemctl that failed
because

1) The PATH was wrong
and
2) The unit name would be wrong even if the systemctl binary was found.

These both seem fine in git now, so that's all good.


>> That seems to work OK (from a practical perspective things worked OK and
>> I got my mount) but is obviously suboptimal, especially when the mount
>> point is unmounted.
>>
>> In my case, I called umount but the rpc.statd process was still running:
> What is the expectation? That the umount should bring down rpc.statd?

If it started its own instance, it should IMO kill it again on umount,
but I was more talking about the systemd context here. If the
start-statd thing had done its job correctly we wouldn't have gotten
into this situation in the first place (as rpc-statd.service would have
been started and contained its own rpc.statd process happily!)

I don't really know how it should work on non-systemd systems. In that
case I presume start-statd is called for each mount (forgive me if I'm
wrong) and thus you'd likely end up with lots of rpc.statd processes
running, especially if you do lots of mount/umount cycles on a given
share. Perhaps all you need is some very, very minimal fallback support
here? e.g. checking the pid file, verifying that the process with that
pid is an rpc.statd process, and only actually starting it if it's not
already running? Something like the sketch below.
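
Roughly like this (the pidfile path is a guess on my part; it depends
on how rpc.statd was configured):

PIDFILE=/var/run/rpc.statd.pid
if [ -r "$PIDFILE" ]; then
    pid=$(cat "$PIDFILE")
    # Only trust the pidfile if that pid really is an rpc.statd process
    if [ -n "$pid" ] && [ "$(cat /proc/$pid/comm 2>/dev/null)" = "rpc.statd" ]; then
        exit 0    # already running, nothing to do
    fi
fi
exec rpc.statd --no-notify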

For systemd systems, all that would happen is you'd have a lot of
redundant calls to systemctl start, but those should generally be
harmless.


FWIW, I think there are a number of issues with the systemd units
upstream. If you're interested in some feedback here is a small snippet.
I'm happy to do some patches for you if you'd be willing to apply them.

Main issues are:

1. nfs-utils.service really is an abuse. It should be a nfs-utils.target
(the comment inside alludes to the fact this is known, and it's just
that you want "systemctl restart nfs-utils" to "just work" as a
command). I can see the desire, but it's still an abuse. Perhaps someone
would be willing to write a patch that expands a bare unit name to
.service or .target when the unit type is not specified? Dunno how hard
it would be tho'...

2. The way nfs-blkmap.service/target interact seems really non-standard.
The fact that nfs-blkmap.service has no [Install] section will make it
report oddly in systemctl status (it will not say "enabled" or
"disabled" but "static"). The use of Requisite= to tie it to its target
is, well, "creative". Personally, I'd have it as a normal [Install]
relationship on the target instead, something like the snippet below.
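
e.g. something along these lines in nfs-blkmap.service (a sketch of the
mechanism, not tested against the actual units):

[Install]
WantedBy=nfs-blkmap.target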

3. rpc-svcgssd.service and nfs-mountd.service have two PartOf=
directives. This could lead to a very strange state where e.g.
nfs-server.service is stopped, and thus the stop job is propagated to
both these units, but they are actually still needed by nfs clients (I
believe) as you also list them as part of nfs-utils.service (which, as I
mentioned already, is really an abuse of a service).

4. Numerous units make use of the /etc/sysconfig/* tree. This is very
much discouraged for upstream units; the official mechanism for tweaking
units is to put a drop-in file in
/etc/systemd/system/theservice.service.d/my-overrides.conf

In such a file you can tweak a single line, typically the ExecStart=
line, or add an Environment= line, without altering anything else.
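
For example (a made-up override for nfs-mountd.service, purely to
illustrate the mechanism; the variable name is invented):

# /etc/systemd/system/nfs-mountd.service.d/my-overrides.conf
[Service]
Environment=RPCMOUNTDOPTS=--manage-gids
ExecStart=
ExecStart=/usr/sbin/rpc.mountd $RPCMOUNTDOPTS

(The empty ExecStart= clears the original command before setting the
replacement; run "systemctl daemon-reload" afterwards for it to take
effect.)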

Since this is the blessed upstream way, I'd follow this approach as it's
intended to be distro-agnostic.


There are probably a couple other issues too, but it's getting late and
my eyes are suffering!



I appreciate the collection of units here is complex for NFS (I've been
fighting with it downstream for years - think we've maybe exchanged
emails on this topic in the past too!!), but it would be really good to
get a shining example of well-crafted systemd units into upstream
nfs-utils, as a model of how a complex collection of daemons should be
represented.

As I mentioned, I'd be happy to work with you to get these into really
good shape, if you like?





>> [root at jimmy nfs-utils]$ pscg | grep 3256
>>  3256 rpcuser
>> 4:devices:/system.slice/mnt-media-scratch.mount,1:name=systemd:/system.slice/mnt-media-scratch.mount
>> rpc.statd --no-notify
>>
>> [root at jimmy nfs-utils]$ systemctl status mnt-media-scratch.mount
>> ● mnt-media-scratch.mount - /mnt/media/scratch
>>    Loaded: loaded (/etc/fstab)
>>    Active: inactive (dead) since Mon 2015-01-12 09:58:52 GMT; 1min 12s ago
>>     Where: /mnt/media/scratch
>>      What: marley.rasta.guthr.ie:/mnt/media/scratch
>>      Docs: man:fstab(5)
>>            man:systemd-fstab-generator(8)
>>
>> Jan 07 14:55:13 jimmy mount[3216]: /usr/sbin/start-statd: line 8:
>> systemctl: command not found
>> Jan 07 14:55:14 jimmy rpc.statd[3256]: Version 1.3.0 starting
>> Jan 07 14:55:14 jimmy rpc.statd[3256]: Flags: TI-RPC
>> [root at jimmy nfs-utils]$
> Again this is fixed with the latest nfs-utils...

Yeah, this is just the logs for the stuff mentioned above, so I
appreciate it's now fixed, thanks :)

> Question? Why are you using v3 mounts? With V4 all this goes away.

Generally due to me fighting (and losing) with nfs over the years! Will
try and migrate to v4 :)

Cheers!

Col


-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/

