[systemd-devel] Discrepancy in using dhclient b/w ubuntu 20.04 and ubuntu 16.04

Reindl Harald h.reindl at thelounge.net
Tue Jun 8 13:05:43 UTC 2021



On 08.06.21 at 14:50, Aravindhan Krishnan wrote:
> Hi Reindl,
> 
> I have attached a minimalistic repro along with the codes of all the 
> scripts, service files. I suppose Silvio was able to see the files. 

i don't get the bash-nonsense for a handful of lines (most of them doing 
nothing at all) to begin with and given that there is no "Type=" in the 
unit file you may read the docs and try the different types

i also don't get the trial-binary

why in the world don't you throw away all that crap including the docker 
container and start dhclient on your own from a trivial systemd-unit?

it's impressive how many layers and helpers one can wrap around simple 
tasks but to gain what except troubles?

keep it simple!
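
To see which type a unit effectively runs with, a minimal Python sketch 
could look like the one below. It is not systemd's own parser: the 
`effective_service_type` helper is hypothetical, and the sample unit text 
is only a stand-in reconstructed from the status output quoted further 
down. It assumes the documented default that a service with ExecStart= 
but neither Type= nor BusName= is treated as Type=simple.

```python
from typing import Optional

# Hypothetical stand-in for trial@.service; the real file was only attached
# to the original mail, so everything here is an assumption.
SAMPLE_UNIT = """\
[Unit]
Description=Trial

[Service]
ExecStart=/bin/bash /etc/systemd/system/trial_service_script.sh start %i
TimeoutStopSec=infinity
"""


def effective_service_type(unit_text: str) -> Optional[str]:
    """Return the explicit Type= of [Service], or the implied default."""
    section = None
    settings = {}
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Service" and "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()  # last assignment wins
    if settings.get("Type"):
        return settings["Type"]
    if "ExecStart" in settings and "BusName" not in settings:
        return "simple"  # implied default in this case
    return None  # other combinations are not modelled in this sketch


if __name__ == "__main__":
    print(effective_service_type(SAMPLE_UNIT))
```

On a live system, `systemctl show -p Type <unit>` reports the value 
systemd actually uses.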

> On Mon, 7 Jun 2021 at 21:53, Reindl Harald <h.reindl at thelounge.net> wrote:
> 
> 
> 
>     On 07.06.21 at 17:57, Aravindhan Krishnan wrote:
>      > Adding Raghav.
>      >
>      > And sorry the subject should have stated: Discrepancy in using
>      > dhclient b/w ubuntu 20.04 and ubuntu 16.04
> 
>     and why didn't you fix it in your own reply?
> 
>     to your problem:
>     you have a wild mix of docker, systemd-units and shellscripts but don't
>     provide the source of the scripts nor the systemd unit
> 
>     overly complex for something that can be as trivial as:
> 
>     [root@srv-rhsoft:~]$ cat /etc/systemd/system/network-wan-dhcp.service
>     [Unit]
>     Description=Internet DHCP-Client
> 
>     [Service]
>     Type=forking
>     ExecStart=/usr/sbin/dhclient -4 -q --no-pid --request-options subnet-mask,broadcast-address,routers br-wan
>     PermissionsStartOnly=yes
>     SuccessExitStatus=80
>     Restart=always
>     RestartSec=5
>     ProtectSystem=strict
>     ProtectHome=yes
>     ReadWritePaths=-/var/lib/dhclient
>     PrivateTmp=yes
>     NoNewPrivileges=yes
>     ProtectKernelTunables=yes
>     ProtectKernelModules=yes
>     ProtectControlGroups=yes
>     MemoryDenyWriteExecute=yes
>     CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
>     LockPersonality=yes
>     PrivateDevices=yes
>     ProtectHostname=yes
>     RestrictNamespaces=yes
>     RestrictRealtime=yes
>     RestrictSUIDSGID=yes
>     ProtectClock=true
>     ProtectKernelLogs=true
>     UMask=077
>     SystemCallArchitectures=native
>     SystemCallFilter=@system-service @network-io @privileged
>     SystemCallFilter=~@aio @chown @clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @resources @swap
>     InaccessiblePaths=-/boot
>     InaccessiblePaths=-/efi
>     InaccessiblePaths=-/root
> 
>      > On Mon, 7 Jun 2021 at 21:26, Aravindhan Krishnan
>      > <aravindhank11 at gmail.com> wrote:
>      >
>      >     Hi Folks,
>      >
>      >     I am seeing anomalous behavior when running a dhclient process
>      >     inside my docker container on a vanilla Ubuntu 16.04 host. The
>      >     service gets into the "deactivating" state and is stuck forever.
>      >     I have attached a minimal reproduction of the issue to this mail.
>      >
>      >     Working logic:
>      >
>      >       * There is a sample trial@.service file which invokes the
>      >         `trial` binary with the option passed to the systemd
>      >         service via the @ option
>      >       * The valid options are sleep and dhclient_<interface_name>
>      >       * The binary either invokes a long-lived sleep process or a
>      >         dhclient process on the said interface_name based on the input
>      >       * The binary then spawns the `kill_trial.sh` script. The script
>      >         sleeps for 20 seconds and kills the parent `trial` binary. The
>      >         kill signal is SIGKILL in the trial example. In the real world,
>      >         this can be a SIGSEGV indicating a crash in the parent process.
>      >       * If the trial binary was started for the sleep process, things
>      >         work fine and the service goes into "failed" state as expected
>      >       * However, in case of dhclient, the service is stuck in
>      >         "deactivating" state if the underlying host OS is Ubuntu 16.04.
>      >         This works well if the host is running Ubuntu 20.04.
>      >       * We have set TimeoutStopSec to infinity, because in real-world
>      >         deployments, the core collection after a crash takes varying
>      >         time depending on the memory config on the host.
>      >
>      >
>      >     Steps to reproduce
>      >     # tar -xf minimal_repro.tar -C minimal_repro/
>      >     # cd minimal_repro/
>      >     # docker build -t trial .
>      >     # docker rm -f trial
>      >     # docker run -it -d --net=host --privileged -v /sys/fs/cgroup:/sys/fs/cgroup:ro --name trial trial
>      >     # docker exec -it trial bash
>      >
>      >     # systemctl start trial@dhclient_eth1.service
>      >
>      >     # #Keep monitoring trial@dhclient_eth1.service -- issue should be
>      >     # #seen within 20-30 seconds on Ubuntu 16.04 host
>      >
>      >     # systemctl status trial@dhclient_eth1.service
>      >     ● trial@dhclient_eth1.service - Trial
>      >           Loaded: loaded (/etc/systemd/system/trial@.service; static; vendor preset: enabled)
>      >           Active: deactivating (stop-sigterm) (Result: signal) since Mon 2021-06-07 13:19:12 UTC; 1min 11s ago
>      >          Process: 55 ExecStartPre=/bin/bash /etc/systemd/system/trial_service_script.sh pre_start dhclient_eth1 (code=exited, status=0/SUCCESS)
>      >          Process: 56 ExecStart=/bin/bash /etc/systemd/system/trial_service_script.sh start dhclient_eth1 (code=killed, signal=KILL)
>      >         Main PID: 56 (code=killed, signal=KILL)
>      >            Tasks: 0 (limit: 38590)
>      >           Memory: 588.0K
>      >           CGroup: /docker/903fca0cee1387b7c2113a36ee5efdb3a25edd1e60584fe5da5d0c5b5ffd8241/system.slice/system-trial.slice/trial@dhclient_eth1.service
>      >
>      >     # #NOTE: `Active: deactivating` -- in stuck state
>      >     # #Running `systemctl daemon-reload` forces the service to go to
>      >     # #failed state
>      >
>      >     # systemctl start trial@sleep.service
>      >
>      >     # #Keep monitoring trial@sleep.service -- would be killed in 20-30
>      >     # #seconds and goes into failed state as expected
>      >
>      >     # systemctl status trial@sleep.service
>      >     ● trial@sleep.service - Trial
>      >           Loaded: loaded (/etc/systemd/system/trial@.service; static; vendor preset: enabled)
>      >           Active: failed (Result: signal) since Mon 2021-06-07 13:38:19 UTC; 21s ago
>      >          Process: 113 ExecStartPre=/bin/bash /etc/systemd/system/trial_service_script.sh pre_start sleep (code=exited, status=0/SUCCESS)
>      >          Process: 114 ExecStart=/bin/bash /etc/systemd/system/trial_service_script.sh start sleep (code=killed, signal=KILL)
>      >          Process: 129 ExecStopPost=/bin/bash /etc/systemd/system/trial_service_script.sh post_stop sleep (code=exited, status=0/SUCCESS)
>      >         Main PID: 114 (code=killed, signal=KILL)
>      >
>      >     Please advise on what can help us in alleviating the issue.
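
The kill step of the repro described above can be looked at in isolation. 
What follows is a minimal Python sketch, not the attached `trial` binary 
or `kill_trial.sh` (neither is shown in this thread); the helper name, the 
30-second sleeps and killing the parent from the outside are assumptions 
made for illustration. It only demonstrates the general fact that SIGKILL 
on a parent does not terminate a child it spawned, so a long-lived child 
can outlive the killed main process.

```python
import os
import signal
import subprocess
import sys

# Code run by the stand-in "parent": start a long-lived child, report its
# PID on stdout, then idle until it is killed.
PARENT_CODE = """
import subprocess, sys, time
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    stdout=subprocess.DEVNULL,
)
print(child.pid, flush=True)
time.sleep(30)
"""


def kill_parent_and_check_child():
    """SIGKILL the parent, then report (signal number, child still alive)."""
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    child_pid = int(parent.stdout.readline())
    parent.kill()  # sends SIGKILL on POSIX
    parent.wait()
    parent.stdout.close()
    killed_by = -parent.returncode  # death by signal N is reported as -N
    try:
        os.kill(child_pid, 0)  # signal 0 only checks that the PID exists
        child_alive = True
    except ProcessLookupError:
        child_alive = False
    if child_alive:
        os.kill(child_pid, signal.SIGKILL)  # clean up the orphaned child
    return killed_by, child_alive


if __name__ == "__main__":
    print(kill_parent_and_check_child())
```

Whether such a leftover process is what keeps the unit in "deactivating" 
on the 16.04 host is not established here; running `systemd-cgls` inside 
the container would show what is still in the unit's cgroup.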


