[systemd-devel] [ANNOUNCE] systemd 214
Lennart Poettering
lennart at poettering.net
Wed Jun 11 10:00:41 PDT 2014
Hi!
http://www.freedesktop.org/software/systemd/systemd-214.tar.xz
Here it is, version 214. Stuffed with great new features, improvements
in all areas, in particular when it comes to security (file system
sandboxing services! minimizing privileges of our daemons!), networking
(three new interface types are now supported by networkd!) and socket
units (four new settings!). What I find the most exciting change: a
first step towards a state-less system: we will now rebuild /var if it
is empty on boot. My favourite new command line making use of this is:
systemd-nspawn -D /srv/mycontainer --read-only --tmpfs=/var -b
Which spawns an nspawn container, with the directory tree mounted
read-only, and an empty, volatile /var mounted on top, that is flushed
when you terminate the container. With that in place you can easily run
hundreds of ad-hoc throw-away container instances from the same tree,
while making sure they don't end up interfering with each other. As next
step (planned for the next release): add the infrastructure to support
boots with /etc empty, too (or to turn this around: with a tmpfs as root
and only /usr mounted in from a read-only vendor tree).
Anyway, I am rambling, so here's the dry NEWS file, enjoy:
CHANGES WITH 214:
* As an experimental feature, udev now tries to lock the
disk device node (flock(LOCK_SH|LOCK_NB)) while it
executes events for the disk or any of its partitions.
Applications like partitioning programs can lock the
disk device node (flock(LOCK_EX)) and claim temporary
device ownership that way; udev will entirely skip all event
handling for this disk and its partitions. If the disk
was opened for writing, the close will trigger a partition
table rescan in udev's "watch" facility, and if needed
synthesize "change" events for the disk and all its partitions.
This is now unconditionally enabled. If it turns out to
cause major problems, we might turn it on only for specific
devices, or might need to disable it entirely. Device-mapper
devices are excluded from this logic.
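The locking protocol described above can be tried out without touching a real disk. The following Python sketch is only an illustration under stated assumptions: a temporary file stands in for the disk device node, and try_shared_lock() and demo() are hypothetical helper names, not part of udev. A partitioning program would open the real device node instead.

```python
import fcntl
import os
import tempfile


def try_shared_lock(path):
    """Mimic the udev side: attempt a non-blocking shared lock.

    Returns True if the lock was obtained, False if somebody else
    currently holds an exclusive lock on the same file.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except BlockingIOError:
        return False
    else:
        return True
    finally:
        os.close(fd)  # closing the descriptor also drops the lock


def demo():
    # Assumption: a temporary file replaces the disk device node.
    with tempfile.NamedTemporaryFile() as stand_in:
        # The "partitioning program" claims temporary ownership.
        owner_fd = os.open(stand_in.name, os.O_RDWR)
        fcntl.flock(owner_fd, fcntl.LOCK_EX)
        while_held = try_shared_lock(stand_in.name)

        # Ownership is given up again.
        fcntl.flock(owner_fd, fcntl.LOCK_UN)
        os.close(owner_fd)
        after_release = try_shared_lock(stand_in.name)
        return while_held, after_release
```

While the exclusive lock is held the shared lock attempt fails, which corresponds to udev skipping event handling; once the lock is released the attempt succeeds again.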
* We temporarily dropped the "-l" switch for fsck invocations,
since it collides with the flock() logic above. util-linux
upstream has been changed already to avoid this conflict,
and we will re-add "-l" as soon as util-linux with this
change has been released.
* The dependency on libattr has been removed. The extended
attribute calls moved to glibc a long time ago, and
libattr is thus unnecessary.
* Virtualization detection works without privileges now. This
means the systemd-detect-virt binary no longer requires
CAP_SYS_PTRACE file capabilities, and our daemons can run
with fewer privileges.
* systemd-networkd now runs under its own "systemd-network"
user. It retains the CAP_NET_ADMIN, CAP_NET_BIND_SERVICE,
CAP_NET_BROADCAST, CAP_NET_RAW capabilities, but
loses the ability to write to files owned by root this way.
* Similarly, systemd-resolved now runs under its own
"systemd-resolve" user with no capabilities remaining.
* Similarly, systemd-bus-proxyd now runs under its own
"systemd-bus-proxy" user with only CAP_IPC_OWNER remaining.
* systemd-networkd gained support for setting up "veth"
virtual ethernet devices for container connectivity, as well
as GRE and VTI tunnels.
* systemd-networkd will no longer attempt to load the kernel
modules necessary for certain tunnel transports itself.
Instead it is assumed the kernel loads them
automatically when required. This only works correctly on
very new kernels. On older kernels, please consider adding
the kernel modules to /etc/modules-load.d/ as a work-around.
* The resolv.conf file systemd-resolved generates has been
moved to /run/systemd/resolve/. If you have a symlink from
/etc/resolv.conf, it might be necessary to correct it.
* Two new service settings ProtectHome= and ProtectSystem=
have been added. When enabled they will make the user data
(such as /home) inaccessible or read-only and the system
(such as /usr) read-only, for specific services. This allows
very light-weight per-service sandboxing to avoid
modifications of user data or system files from
services. These two new switches have been enabled for all
of systemd's long-running services, where appropriate.
* Socket units gained new SocketUser= and SocketGroup=
settings to set the owner user and group of AF_UNIX sockets
and FIFOs in the file system.
* Socket units gained a new RemoveOnStop= setting. If enabled
all FIFOs and sockets in the file system will be removed
when the specific socket unit is stopped.
* Socket units gained a new Symlinks= setting. It takes a list
of symlinks to create to file system sockets or FIFOs
created by the specific unix sockets. This is useful to
manage symlinks to socket nodes with the same life-cycle as
the socket itself.
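To show how the new socket settings might be combined, here is a small Python sketch that renders a [Socket] section as text. Everything concrete in it is an assumption made for illustration: the helper name render_socket_unit(), the /run/example.sock path, the "example" user and group, and the symlink location. Only the setting names come from the entries above.

```python
import configparser
import io


def render_socket_unit(listen_path, user, group, symlinks):
    """Render a [Socket] section using the new settings (illustrative only)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the CamelCase setting names as written
    parser["Socket"] = {
        "ListenStream": listen_path,
        "SocketUser": user,              # owner user of the socket node
        "SocketGroup": group,            # owner group of the socket node
        "RemoveOnStop": "yes",           # remove the node when the unit stops
        "Symlinks": " ".join(symlinks),  # symlinks sharing the node's life-cycle
    }
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical values, chosen only for the example.
unit_text = render_socket_unit(
    "/run/example.sock", "example", "example", ["/run/example-compat.sock"]
)
```

The resulting text could be saved as a .socket unit file; whether these particular values make sense depends entirely on the service in question.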
* The /dev/log socket and /dev/initctl FIFO have been moved to
/run, and have been replaced by symlinks. This allows
connecting to these facilities even if PrivateDevices=yes is
used for a service (which makes /dev/log itself unavailable,
but /run is left). This also has the benefit of ensuring
that /dev only contains device nodes, directories and
symlinks, and nothing else.
* sd-daemon gained two new calls sd_pid_notify() and
sd_pid_notifyf(). They are similar to sd_notify() and
sd_notifyf(), but allow overriding of the source PID of
notification messages if permissions permit this. This is
useful to send notify messages on behalf of a different
process (for example, the parent process). The
systemd-notify tool has been updated to make use of this
when sending messages (so that notification messages now
originate from the shell script invoking systemd-notify and
not the systemd-notify process itself). This should minimize
a race where systemd fails to associate notification
messages to services when the originating process already
vanished.
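sd_pid_notify() is a C call, so the following Python sketch only illustrates the underlying idea and is not the library's implementation. It assumes the source PID travels as SCM_CREDENTIALS ancillary data on an AF_UNIX datagram. A socket in a temporary directory stands in for the real notification socket, and send_notify() and demo() are hypothetical names. The sketch sends its own PID, since claiming a different PID requires additional privileges.

```python
import os
import socket
import struct
import tempfile

UCRED = "iII"  # struct ucred layout on Linux: pid, uid, gid


def send_notify(sock_path, state, pid):
    """Send a notification datagram with explicit sender credentials."""
    creds = struct.pack(UCRED, pid, os.getuid(), os.getgid())
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as tx:
        tx.sendmsg(
            [state.encode()],
            [(socket.SOL_SOCKET, socket.SCM_CREDENTIALS, creds)],
            0,
            sock_path,
        )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notify")  # stand-in for the notify socket
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as rx:
            rx.bind(path)
            # Ask the kernel to attach sender credentials to each datagram.
            rx.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
            send_notify(path, "READY=1", os.getpid())
            size = struct.calcsize(UCRED)
            data, ancdata, _flags, _addr = rx.recvmsg(64, socket.CMSG_SPACE(size))
            _level, _type, cdata = ancdata[0]
            pid, _uid, _gid = struct.unpack(UCRED, cdata[:size])
            return data.decode(), pid
```

The receiver sees the PID carried in the credentials rather than having to guess it from a process that may already have exited.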
* A new "on-abnormal" setting for Restart= has been added. If
set it will result in automatic restarts on all "abnormal"
reasons for a process to exit, which includes unclean
signals, core dumps, timeouts and watchdog timeouts, but
does not include clean and unclean exit codes or clean
signals. Restart=on-abnormal is an alternative for
Restart=on-failure for services that shall be able to
terminate and avoid restarts on certain errors, by
indicating so with an unclean exit code. Restart=on-failure
or Restart=on-abnormal is now the recommended setting for
all long-running services.
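The difference between the two recommended settings can be written down as a small decision table. This Python sketch is a simplified model of the description above, not systemd's code; the reason labels and the should_restart() helper are made up for illustration.

```python
# Exit reasons that count as "abnormal" according to the description above.
ABNORMAL = {"unclean-signal", "core-dump", "timeout", "watchdog"}

# on-failure additionally restarts on unclean exit codes.
FAILURE = ABNORMAL | {"unclean-exit"}


def should_restart(policy, reason):
    """Return True if a service with this Restart= policy would be restarted."""
    if policy == "on-abnormal":
        return reason in ABNORMAL
    if policy == "on-failure":
        return reason in FAILURE
    raise ValueError("policy not covered by this sketch: " + policy)
```

A service that wants to opt out of restarts on certain errors would thus exit with an unclean exit code and use Restart=on-abnormal.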
* If the InaccessibleDirectories= service setting points to a
mount point (or if there are any submounts contained within
it), systemd now attempts to unmount it completely, to make
the file systems truly unavailable for the respective
service.
* The ReadOnlyDirectories= service setting and
systemd-nspawn's --read-only parameter are now recursively
applied to all submounts, too.
* Mount units may now be created transiently via the bus APIs.
* The support for SysV and LSB init scripts has been removed
from the systemd daemon itself. Instead, it is now
implemented as a generator that creates native systemd units
from these scripts when needed. This enables us to remove a
substantial amount of legacy code from PID 1, following the
fact that many distributions only ship a very small number
of LSB/SysV init scripts nowadays.
* Privileged Xen (dom0) domains are not considered
virtualization anymore by the virtualization detection
logic. After all, they generally have unrestricted access to
the hardware and usually are used to manage the unprivileged
(domU) domains.
* systemd-tmpfiles gained a new "C" line type, for copying
files or entire directories.
* systemd-tmpfiles "m" lines are now fully equivalent to "z"
lines. So far they have been non-globbing versions of the
latter, and have thus been redundant. In future it is
recommended to only use "z"; and "m" has hence been removed
from the documentation, even though it stays supported.
* A tmpfiles snippet to recreate the most basic structure in
/var has been added. This is enough to create the /var/run →
/run symlink and create a couple of structural
directories. This allows systems to boot up with an empty or
volatile /var. Of course, while with this change the core OS
is now capable of dealing with a volatile /var, not all
user services are ready for it. However, we hope that sooner
or later many service daemons will be changed upstream so
that they are able to automatically create their necessary
directories in /var at boot, should they be missing. This is
the first step to allow state-less systems that only require
the vendor image for /usr to boot.
* systemd-nspawn has gained a new --tmpfs= switch to mount an
empty tmpfs instance to a specific directory. This is
particularly useful for making use of the automatic
reconstruction of /var (see above), by passing --tmpfs=/var.
* Access modes specified in tmpfiles snippets may now be
prefixed with "~", which indicates that they shall be masked
by whether the existing file or directory is currently
writable, readable or executable at all. Also, if specified
the sgid/suid/sticky bits will be masked for all
non-directories.
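The "~" masking rule is easier to follow with numbers. The Python sketch below is one reading of the description above and not the systemd-tmpfiles source; mask_mode() is a hypothetical helper name.

```python
def mask_mode(requested, current, is_dir):
    """Apply the "~" masking rule to a requested access mode.

    requested: mode from the tmpfiles line, e.g. 0o2775
    current:   permission bits the file or directory has right now
    is_dir:    True if the existing node is a directory
    """
    mode = requested
    if not current & 0o111:  # nobody may execute it at present
        mode &= ~0o111
    if not current & 0o222:  # nobody may write to it at present
        mode &= ~0o222
    if not current & 0o444:  # nobody may read it at present
        mode &= ~0o444
    if not is_dir:           # sgid/suid/sticky only survive on directories
        mode &= ~0o7000
    return mode
```

For example, requesting 2775 for a regular file that currently has mode 0644 yields 0664: the execute bits are dropped because the file is not executable at all, and the sgid bit is dropped because it is not a directory.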
* A new passive target unit "network-pre.target" has been
added which is useful for services that shall run before any
network is configured, for example firewall scripts.
* The "floppy" group that previously owned the /dev/fd*
devices is no longer used. The "disk" group is now used
instead. Distributions should probably deprecate usage of
the "floppy" group.
Contributions from: Camilo Aguilar, Christian Hesse, Colin Ian
King, Cristian Rodríguez, Daniel Buch, Dave Reisner, David
Strauss, Denis Tikhomirov, John, Jonathan Liu, Kay Sievers,
Lennart Poettering, Mantas Mikulėnas, Mark Eichin, Ronny
Chevalier, Susant Sahani, Thomas Blume, Thomas Hindoe Paaboel
Andersen, Tom Gundersen, Umut Tezduyar Lindskog, Zbigniew
Jędrzejewski-Szmek
-- Berlin, 2014-06-11
Lennart
--
Lennart Poettering, Red Hat