[systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

Colin Guthrie colin at mageia.org
Thu Jan 15 01:28:25 PST 2015


Steve Dickson wrote on 15/01/15 00:50:
> Hello,
> 
> On 01/12/2015 04:43 PM, Colin Guthrie wrote:
>>
>> But FWIW, your check for whether systemctl is installed via calling
>> "systemctl --help" is IMO not very neat.
>>
>> If you're using bash here anyway, you might as well just do a:
>>
>> if [ -d /sys/fs/cgroup/systemd ]; then
>>
>> type check or if you want to be super sure you could do:
>>
>> if mountpoint -q /sys/fs/cgroup/systemd; then
>>
>> This is a simple trick to detect if systemd is running. If systemctl is
>> then not found, then I'd be pretty surprised (but your code would
>> continue if the call fails anyway, so this should be safe).
>>
>> This avoids one more fork.
>>
>> Technically you could avoid calling "systemctl start" by calling
>> "systemctl is-active" first, but to be honest this isn't really needed.
> I took Michael's advice and used 'test -d /run/systemd/system'

Seems best indeed, yes. Thanks (to both!) :)

Although if the script is in bash I'd write "if [ -d ..." rather than
"if test -d ..." purely as a matter of style: bash experts (Harald?) can
correct me here if I'm wrong, but I believe both [ and test are bash
builtins (even though binaries for them also exist in /usr/bin/), so
neither form should actually cost a fork.
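
Either way, the whole check could end up as small as this (an untested
sketch; the non-systemd fallback branch is purely illustrative):

  #!/bin/bash
  # /run/systemd/system only exists while systemd is the running init,
  # so this detects systemd without forking out to "systemctl --help".
  if [ -d /run/systemd/system ]; then
      # Let systemd own the daemon; carry on even if the call fails.
      systemctl start rpc-statd.service || :
  else
      # Non-systemd fallback (illustrative only).
      exec rpc.statd --no-notify
  fi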


>>>> That seems to work OK (from a practical perspective things worked OK and
>>>> I got my mount) but it's obviously suboptimal, especially when the mount
>>>> point is unmounted.
>>>>
>>>> In my case, I called umount but the rpc.statd process was still running:
>>> What is the expectation? Should the umount bring down rpc.statd?
>>
>> If it started its own instance, it should IMO kill it again on umount,
>> but I was more talking about the systemd context here. If the
>> start-statd thing had done its job correctly we wouldn't have gotten
>> into this situation in the first place (as rpc-statd.service would have
>> been started and contained its own rpc.statd process happily!)
>>
>> I don't really know how it should work on non-systemd systems, as in that
>> case I presume start-statd is called for each mount (forgive me if
>> I'm wrong) and thus you'd likely end up with lots of rpc.statd processes
>> running, especially if you do lots of mount/umount cycles on a given
>> share. Perhaps all you need to do is some very, very minimal fallback
>> support here? e.g. checking the pid file and that the process with that
>> pid is an rpc.statd process, and only actually starting it if it's not
>> already running?
> Well, there is code in rpc.statd, sm-notify and mount.nfs that checks
> to see if an rpc.statd is already running... But the code appears
> to be a bit racy, since in a very few environments multiple rpc.statds
> are being started up...

Yeah, I actually doubted myself the other day when I suggested adding
some code to make sure only one was running... I later remembered that
rpc.statd has a pid file and thus must have this stuff sort of built in
(and I remember seeing messages along the lines of "rpc.statd is
already running").

I guess the reason I got two was the extreme parallelism that systemd
offers on boot. My two mounts (with the faulty start-statd) must have
come in at almost the same time, triggered the race in rpc.statd
startup, and left me with two processes.

I don't suppose there is much we can do about that other than teaching
rpc.statd to be less racy, but to be honest, this should be avoided with
systemd now (thanks to the fixed start-statd) and other inits probably
won't trigger the race condition, so, practically speaking, it's
probably one to quietly ignore... at least for this list :D
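
Though if anyone did fancy tackling it, here's a minimal sketch of a
less racy fallback (purely illustrative, not what nfs-utils actually
does; the lock and pid file paths are assumptions on my part):
serialise the check-and-start with flock(1):

  #!/bin/bash
  # Hold an exclusive lock so two concurrent mounts cannot both decide
  # rpc.statd is missing and spawn it (paths are examples only).
  exec 9> /run/rpc.statd.start.lock
  flock 9
  if [ -s /run/rpc.statd.pid ] && \
     kill -0 "$(cat /run/rpc.statd.pid)" 2>/dev/null; then
      : # already running, nothing to do
  else
      rpc.statd --no-notify
  fi
  # fd 9 is closed on exit, which releases the lock.

The lock makes the pid-file check and the start atomic with respect to
any other mount racing through the same script.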

>> For systemd systems, generally all that would happen is you'd have a lot
>> of redundant calls to systemctl start, but they should generally be
>> harmless.
> Well, the environment I just described, where multiple statds were
> getting started, was using systemd to do the start-ups.

On systemd systems, it should all be fine, yes. It's only really a
problem on non-systemd systems now that start-statd is working
properly!


>> FWIW, I think there are a number of issues with the systemd units
>> upstream. If you're interested in some feedback, here is a small snippet.
>> I'm happy to do some patches for you if you'd be willing to apply them.
> Always... But I would like to have this conversation with the
> NFS community at linux-nfs at vger.kernel.org. Maybe you could post
> your ideas there? In a new thread?

Sure I will do!

>> Main issues are:
>>
>> 1. nfs-utils.service really is an abuse. It should be an nfs-utils.target
>> (the comment inside alludes to the fact this is known; it's just that
>> you want "systemctl restart nfs-utils" to "just work" as a command). I
>> can see the desire, but it's still an abuse. Perhaps someone would be
>> willing to write a patch that does expansion to .service or .target
>> should the unit type not be specified? Dunno how hard it would be tho'...
> Well, we did make the nfs-client a target but the nfs-server was made
> a service... I remember bringing this point up at the time... but I forget
> what was said.

The mists of time hide all ;)


>> 2. The way nfs-blkmap.service/target interact seems really non-standard.
>> The fact that nfs-blkmap.service has no [Install] section will make it
>> report oddly in systemctl status (it will not say "enabled" or
>> "disabled" but "static"). The use of Requisite= to tie it to its target
>> is, well, "creative". Personally, I'd have it as a
> I see your point... 
> 
>>
>> 3. rpc-svcgssd.service and nfs-mountd.service have two PartOf=
>> directives. This could lead to a very strange state where e.g.
>> nfs-server.service is stopped, and thus the stop job is propagated to
>> both these units, but they are actually still needed by nfs clients (I
>> believe), as you also list them as part of nfs-utils.service (which as I
>> mentioned already is really an abuse of a service).
> At this point rpc-svcgssd.service is not even being used, at least
> in the distros I work with... The point being to use BindsTo= instead
> of PartOf=?

Possibly! To be honest, I'd probably have to think about it harder (and
probably understand the components better too!), but the main point was
to avoid two PartOf= directives (and also, never have both PartOf= and
BindsTo=, or multiple BindsTo=, for the same reason!).
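
To illustrate the shape I mean (a sketch only; the unit names and exec
line here are examples rather than the actual nfs-utils layout), each
helper would carry exactly one PartOf= pointing at the single unit
whose stop should propagate to it:

  # nfs-mountd.service (illustrative fragment)
  [Unit]
  Description=NFS Mount Daemon
  # Exactly one PartOf=: stop/restart of nfs-server.service propagates
  # here, and no other unit sends this one stop jobs.
  PartOf=nfs-server.service

  [Service]
  ExecStart=/usr/sbin/rpc.mountd --foreground

That way there's never any ambiguity about which parent's stop job
wins.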

>> 4. Numerous units make use of the /etc/sysconfig/* tree. This is very much
>> discouraged for upstream units; the official mechanism for tweaking
>> units is to put a dropin file in
>> /etc/systemd/system/theservice.service.d/my-overrides.conf
> Like it or not, people expect things to be in /etc/sysconfig/. From a distro
> standpoint that would be a very hard thing to change. But... if there
> was a seamless way to make that change... That would be interesting...

Yeah, from a distro perspective I'd say it's fine to hack this into the
units. I know myself that it's hard to change (both people's expectations
and the various tools which tweak those files).

My main point was really that this shouldn't be part of the upstream
units. Patching in sysconfig support downstream is fine tho' in the
short term, but people should really be gently prodded in the "right"
direction over time.


>> I appreciate the collection of units here is complex for NFS (I've been
>> fighting with it downstream for years - think we've maybe exchanged
>> emails on this topic in the past too!!), but it would be really good to
>> get a shining example of well-crafted systemd units into upstream
>> nfs-utils, showing how a complex collection of daemons should be
>> represented.
>>
>> As I mentioned, I'd be happy to work with you to get these really good
>> if you like?
> Yes... But again, I would like to work with the NFS community on this... 

Cool, will do!

>>> Question? Why are you using v3 mounts? With V4 all this goes away.
>>
>> Generally due to me fighting (and losing) with nfs over the years! Will
>> try and migrate to v4 :)
> It's getting better... v4.0 is better, v4.1 is much better! There are already
> patches posted changing mount.nfs to start the protocol negotiation at v4.1
> instead of v4.0 as it does today.

Nice to know :)
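
(For my own notes, and for anyone else migrating: pinning a v4.1 mount
is just an option away. The server and export names below are
placeholders.)

  # Ask for NFSv4.1 explicitly rather than relying on negotiation:
  mount -t nfs -o vers=4.1 server:/export /mnt
  # Older kernels/nfs-utils spell it as:
  mount -t nfs4 -o minorversion=1 server:/export /mnt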


Will try and engage upstream and help get the units sorted out with you :)

Cheers!

Col

-- 

Colin Guthrie
colin(at)mageia.org
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/

