[systemd-devel] systemd-nspawn container not starting on RHEL9.0
Thomas Archambault
toma at TPArchambault.com
Thu Aug 4 17:30:26 UTC 2022
Following up on xfs and reflinks, it appears they are enabled on my
out-of-box RHEL9.0. Fwiw, this is a VBox VM; however, so is the FC34
system, which works correctly but is using btrfs.
As always, appreciate any help/references.
TIA
-Tom
[toma at localhost ~]$ xfs_info /
meta-data=/dev/mapper/rhel-root isize=512 agcount=4, agsize=4185600 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=1 inobtcount=1
data = bsize=4096 blocks=16742400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=8175, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[toma at localhost ~]$
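For anyone wanting to script the same check, here is a minimal sketch that scans xfs_info output for the reflink flag shown above. The helper name has_reflink is made up for illustration and is not part of xfsprogs; it only looks for the literal "reflink=1" token in whatever text it is given.

```shell
#!/bin/sh
# Sketch: report whether xfs_info output (read from stdin) shows reflink=1.
# has_reflink is a hypothetical helper name, not an xfsprogs command.
has_reflink() {
    if grep -q 'reflink=1'; then
        echo "reflink enabled"
    else
        echo "reflink disabled or not reported"
    fi
}

# Usage on a live system (assumes xfsprogs is installed):
#   xfs_info / | has_reflink
```

Note that this only reflects how the filesystem was formatted. It says nothing about whether a given tool actually asks for reflink copies.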
-------- Forwarded Message --------
Subject: Re: systemd-devel Digest, Vol 148, Issue 2
Date: Thu, 4 Aug 2022 11:22:32 -0400
From: Thomas Archambault <toma at TPArchambault.com>
Reply-To: toma at TPArchambault.com
To: systemd-devel-request at lists.freedesktop.org
Thank you Lennart. Very much appreciate the quick and clear response.
You're absolutely correct about the btrfs/xfs difference between the
working FC34 system and the problematic RHEL9.0 system:
> /dev/mapper/rhel-root on / type xfs
(rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
My strace output did indicate that there is copying going on, but I did
not know whether that was a problem or not. Obviously it can be in terms
of start-up time and UX w/xfs.
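
A quick way to see whether a given directory's filesystem accepts clone requests at all is to attempt a reflink-only copy. The sketch below assumes GNU cp, whose --reflink=always option fails instead of falling back to a full data copy; probe_reflink is a hypothetical helper name. It does not show what nspawn itself does, only whether the filesystem would allow a reflink.

```shell
#!/bin/sh
# Sketch: try a reflink-only copy inside a directory and report the outcome.
# probe_reflink is a hypothetical helper; it assumes GNU cp and mktemp.
probe_reflink() {
    dir=${1:-.}
    # Scratch directory on the filesystem under test.
    tmp=$(mktemp -d "$dir/reflink-probe.XXXXXX") || return 1
    echo probe > "$tmp/src"
    # --reflink=always errors out when the filesystem cannot clone.
    if cp --reflink=always "$tmp/src" "$tmp/dst" 2>/dev/null; then
        result="reflink"
    else
        result="full-copy"
    fi
    rm -rf "$tmp"
    echo "$result"
}

# Usage (needs write access to the directory):
#   probe_reflink /var/tmp
```

If this prints "full-copy" on the filesystem that holds the tree being copied, every file has to be duplicated byte for byte, which would fit the long start-up time.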
- Tom
On 8/4/22 08:00, systemd-devel-request at lists.freedesktop.org wrote:
> Send systemd-devel mailing list submissions to
> systemd-devel at lists.freedesktop.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
> or, via email, send a message with subject or body 'help' to
> systemd-devel-request at lists.freedesktop.org
>
> You can reach the person managing the list at
> systemd-devel-owner at lists.freedesktop.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of systemd-devel digest..."
>
>
> Today's Topics:
>
> 1. systemd-nspawn container not starting on RHEL9.0
> (Thomas Archambault)
> 2. Re: systemd-nspawn container not starting on RHEL9.0
> (Lennart Poettering)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 3 Aug 2022 15:40:21 -0400
> From: Thomas Archambault <toma at TPArchambault.com>
> To: systemd-devel at lists.freedesktop.org
> Subject: [systemd-devel] systemd-nspawn container not starting on
> RHEL9.0
> Message-ID: <2d4567ae-f0e5-9e6a-10fe-9592498c6c6e at TPArchambault.com>
> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>
> Good day everyone on the dev list,
> We are adding an analysis tool to our application that uses the host's
> rootfs as one of its inputs.
>
> As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
> isolated container environment using the host's rootfs as the
> container's rootfs and things worked correctly and as expected. The
> host's rootfs is analyzed with tmp and results files generated within
> the container without persistent modifications affecting the host's
> rootfs. Since RHEL is our ultimate target platform, I've been trying to
> duplicate our work over RHEL9.0 without success with the container not
> being instantiated.
>
> I've tried to boil down the duplication code to the simplest example,
> which is also an example in the man page $ sudo systemd-nspawn -xbD/. As
> with my prototyping, the container does not seem to be instantiated.
> Any help with troubleshooting, or specific known issues, or requests for
> more data would be appreciated.
>
> TIA
> tparchambault
> ps: Regarding security - selinux is in Permissive mode. I do not know if
> seccomp filters are getting in the way or not; this is an out-of-the-box
> RHEL9.0 base workstation install. In the FC34 prototype, I did need to
> allow certain syscalls via --system-call-filter in order to get a daemon
> within the container to run correctly but afaik that should have no
> bearing on the instantiation of the container.
>
>
> ==== On a RHEL9.0 host bash session ====
>
> [toma at localhost ~]$ systemctl --version
> systemd 250 (250-6.el9_0)
> +PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS
> +OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD
> +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4
> +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT
> default-hierarchy=unified
>
> [toma at localhost ~]$ uname -a
> Linux localhost.localdomain 5.14.0-70.17.1.el9_0.x86_64 #1 SMP PREEMPT
> Tue Jun 14 11:32:10 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
> [toma at localhost ~]$
>
> [toma at localhost ~]$ sudo time systemd-nspawn -D / -xb
> ^C^C^C^C^CCommand terminated by signal 15
> 40.81user 298.75system 6:29.72elapsed 87%CPU (0avgtext+0avgdata
> 8524maxresident)k
> 205032inputs+0outputs (0major+3287minor)pagefaults 0swaps
> [toma at localhost ~]$
>
> ==== In another bash session on the same host ====
> [toma at localhost ~]$ sudo machinectl list
> [sudo] password for toma:
> No machines.
> [toma at localhost ~]$ sudo pkill nspawn
> [toma at localhost ~]$
>
> == In the original host bash session, w/increased logging and strace
> capture ==
>
> [toma at localhost ~]$ sudo SYSTEMD_LOG_LEVEL=debug strace -o
> Development/nspawn.strace.rhel90.out systemd-nspawn -D / -xb
> [sudo] password for toma:
> Setting RLIMIT_CPU to infinity.
> Setting RLIMIT_FSIZE to infinity.
> Setting RLIMIT_DATA to infinity.
> Setting RLIMIT_STACK to 8388608:infinity.
> Setting RLIMIT_CORE to 0:infinity.
> Setting RLIMIT_RSS to infinity.
> Setting RLIMIT_NPROC to 14657.
> Setting RLIMIT_NOFILE to 1024:524288.
> Setting RLIMIT_MEMLOCK to 65536.
> Setting RLIMIT_AS to infinity.
> Setting RLIMIT_LOCKS to infinity.
> Setting RLIMIT_SIGPENDING to 14657.
> Setting RLIMIT_MSGQUEUE to 819200.
> Setting RLIMIT_NICE to 0.
> Setting RLIMIT_RTPRIO to 0.
> Setting RLIMIT_RTTIME to infinity.
> Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> Terminated
> [toma at localhost ~]$
>
> As with the first run, killed via pkill from the other terminal session.
>
> Fwiw, on Fedora 34, the log debug output shows the instantiation of the
> container after the "Found cgroup2..." line, with the container working
> as documented, eventually presenting the login prompt, i.e.
>
> ...
> Setting RLIMIT_RTTIME to infinity.
> Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> Spawning container fedora-1aabc34e0a52a82b on /.#machine.6e49b8aa974c6f37.
> Press ^] three times within 1s to kill container.
> Outer child is initializing.
> Mounting / (MS_REC|MS_SLAVE "")...
> ...
>
> [  OK  ] Finished Update UTMP about System Runlevel Changes.
>
> Fedora 34 (Workstation Edition)
> Kernel 5.11.12-300.fc34.x86_64 on an x86_64 (console)
>
> fedora-1aabc34e0a52a82b login:
>
> ------------------------------
>
> Message: 2
> Date: Thu, 4 Aug 2022 09:30:26 +0200
> From: Lennart Poettering <lennart at poettering.net>
> To: Thomas Archambault <toma at tparchambault.com>
> Cc: systemd-devel at lists.freedesktop.org
> Subject: Re: [systemd-devel] systemd-nspawn container not starting on
> RHEL9.0
> Message-ID: <Yut1kq+IsLkSYdeg at gardel-login>
> Content-Type: text/plain; charset=us-ascii
>
> On Mi, 03.08.22 15:40, Thomas Archambault (toma at TPArchambault.com) wrote:
>
>> Good day everyone on the dev list,
>> We are adding an analysis tool to our application that uses the host's
>> rootfs as one of its inputs.
>>
>> As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
>> isolated container environment using the host's rootfs as the container's
>> rootfs and things worked correctly and as expected. The host's rootfs is
>> analyzed with tmp and results files generated within the container without
>> persistent modifications affecting the host's rootfs. Since RHEL is our
>> ultimate target platform, I've been trying to duplicate our work over
>> RHEL9.0 without success with the container not being instantiated.
>>
>> I've tried to boil down the duplication code to the simplest example, which
>> is also an example in the man page $ sudo systemd-nspawn -xbD/. As with my
>> prototyping, the container does not seem to be instantiated.
>> Any help with troubleshooting, or specific known issues, or requests for
>> more data would be appreciated.
> "-x" is ephemeral mode. This means nspawn will make a copy of the OS
> tree before booting into it, and remove it afterwards.
>
> "-x" on btrfs is very fast and space efficient, because btrfs supports
> both snapshots and reflinks. nspawn will make a subvol snapshot if the
> root you specify is a subvol. It will make reflink-based file copies
> otherwise.
>
> Other file systems have a more 1990's feature set, i.e. no reflinks
> nor snapshots. (modern xfs on very new kernels can support reflinks if
> this is opt-in'ed to.) In that case we have to copy the data files
> with their contents, and that's slow.
>
> Hence, what backing fs do you use?
>
> If you use a non-btrfs file system, it might hence simply be that we are
> busy individually copying all files...
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>
>