[systemd-devel] [ANNOUNCE] systemd 214

Lennart Poettering lennart at poettering.net
Wed Jun 11 10:00:41 PDT 2014


Hi!

http://www.freedesktop.org/software/systemd/systemd-214.tar.xz

Here it is, version 214. Stuffed with great new features, improvements
in all areas, in particular when it comes to security (file system
sandboxing services! minimizing privileges of our daemons!), networking
(three new interface types are now supported by networkd!) and socket
units (four new settings!). What I find the most exciting change: a
first step towards a state-less system: we will now rebuild /var if it
is empty on boot. My favourite new command line making use of this is: 

        systemd-nspawn -D /srv/mycontainer --read-only --tmpfs=/var -b

Which spawns an nspawn container, with the directory tree mounted
read-only, and an empty, volatile /var mounted on top, that is flushed
when you terminate the container. With that in place you can easily run
hundreds of ad-hoc throw-away container instances from the same tree,
while making sure they don't end up interfering with each other. As next
step (planned for the next release): add the infrastructure to support
boots with /etc empty, too (or to turn this around: with a tmpfs as root
and only /usr mounted in from a read-only vendor tree).
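
If you want to script this, here is a minimal Python sketch that only
builds and prints the command lines for a few throw-away instances; it
does not run anything. The --machine= names are an assumption of the
sketch for keeping the instances apart (nspawn otherwise derives the
machine name from the directory), and the helper name nspawn_argv is made
up for illustration:

```python
import shlex


def nspawn_argv(tree, machine=None):
    """Build the systemd-nspawn command line shown above.

    `machine` is an assumption of this sketch: it maps to nspawn's
    --machine= option so that each instance gets a distinct name.
    """
    argv = ["systemd-nspawn", "-D", tree, "--read-only", "--tmpfs=/var", "-b"]
    if machine is not None:
        # Keep -b last; only insert the extra option before it.
        argv.insert(1, f"--machine={machine}")
    return argv


if __name__ == "__main__":
    # Print one command per throw-away instance; nothing is executed here.
    for i in range(3):
        print(shlex.join(nspawn_argv("/srv/mycontainer", f"throwaway-{i}")))
```

Each printed line can be pasted into a root shell; since /var is a tmpfs,
whatever an instance writes there is gone once it terminates.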

Anyway, I am rambling, so here's the dry NEWS file, enjoy:

CHANGES WITH 214:

        * As an experimental feature, udev now tries to lock the
          disk device node (flock(LOCK_SH|LOCK_NB)) while it
          executes events for the disk or any of its partitions.
          Applications like partitioning programs can lock the
          disk device node (flock(LOCK_EX)) and claim temporary
          device ownership that way; udev will entirely skip all event
          handling for this disk and its partitions. If the disk
          was opened for writing, the close will trigger a partition
          table rescan in udev's "watch" facility, and if needed
          synthesize "change" events for the disk and all its partitions.
          This is now unconditionally enabled. If it turns out to
          cause major problems, we might turn it on only for specific
          devices, or might need to disable it entirely. Device-mapper
          devices are excluded from this logic.

        * We temporarily dropped the "-l" switch for fsck invocations,
          since it collides with the flock() logic above. util-linux
          upstream has been changed already to avoid this conflict,
          and we will readd "-l" as soon as util-linux with this
          change has been released.

        * The dependency on libattr has been removed. The extended
          attribute calls moved to glibc a long time ago, and
          libattr is thus unnecessary.

        * Virtualization detection works without privileges now. This
          means the systemd-detect-virt binary no longer requires
          CAP_SYS_PTRACE file capabilities, and our daemons can run
          with fewer privileges.

        * systemd-networkd now runs under its own "systemd-network"
          user. It retains the CAP_NET_ADMIN, CAP_NET_BIND_SERVICE,
          CAP_NET_BROADCAST, CAP_NET_RAW capabilities though, but
          loses the ability to write to files owned by root this way.

        * Similarly, systemd-resolved now runs under its own
          "systemd-resolve" user with no capabilities remaining.

        * Similarly, systemd-bus-proxyd now runs under its own
          "systemd-bus-proxy" user with only CAP_IPC_OWNER remaining.

        * systemd-networkd gained support for setting up "veth"
          virtual ethernet devices for container connectivity, as well
          as GRE and VTI tunnels.

        * systemd-networkd will no longer automatically attempt to
          manually load kernel modules necessary for certain tunnel
          transports. Instead it is assumed the kernel loads them
          automatically when required. This only works correctly on
          very new kernels. On older kernels, please consider adding
          the kernel modules to /etc/modules-load.d/ as a work-around.

        * The resolv.conf file systemd-resolved generates has been
          moved to /run/systemd/resolve/. If you have a symlink from
          /etc/resolv.conf it might be necessary to correct it.

        * Two new service settings ProtectHome= and ProtectSystem=
          have been added. When enabled they will make the user data
          (such as /home) inaccessible or read-only and the system
          (such as /usr) read-only, for specific services. This allows
          very light-weight per-service sandboxing to avoid
          modifications of user data or system files from
          services. These two new switches have been enabled for all
          of systemd's long-running services, where appropriate.

        * Socket units gained new SocketUser= and SocketGroup=
          settings to set the owner user and group of AF_UNIX sockets
          and FIFOs in the file system.

        * Socket units gained a new RemoveOnStop= setting. If enabled,
          all FIFOs and sockets in the file system will be removed
          when the specific socket unit is stopped.

        * Socket units gained a new Symlinks= setting. It takes a list
          of symlinks to create to file system sockets or FIFOs
          created by the specific unix sockets. This is useful to
          manage symlinks to socket nodes with the same life-cycle as
          the socket itself.

        * The /dev/log socket and /dev/initctl FIFO have been moved to
          /run, and have been replaced by symlinks. This allows
          connecting to these facilities even if PrivateDevices=yes is
          used for a service (which makes /dev/log itself unavailable,
          but /run is left). This also has the benefit of ensuring
          that /dev only contains device nodes, directories and
          symlinks, and nothing else.

        * sd-daemon gained two new calls sd_pid_notify() and
          sd_pid_notifyf(). They are similar to sd_notify() and
          sd_notifyf(), but allow overriding of the source PID of
          notification messages if permissions permit this. This is
          useful to send notify messages on behalf of a different
          process (for example, the parent process). The
          systemd-notify tool has been updated to make use of this
          when sending messages (so that notification messages now
          originate from the shell script invoking systemd-notify and
          not the systemd-notify process itself). This should minimize
          a race where systemd fails to associate notification
          messages to services when the originating process already
          vanished.

        * A new "on-abnormal" setting for Restart= has been added. If
          set it will result in automatic restarts on all "abnormal"
          reasons for a process to exit, which includes unclean
          signals, core dumps, timeouts and watchdog timeouts, but
          does not include clean and unclean exit codes or clean
          signals. Restart=on-abnormal is an alternative for
          Restart=on-failure for services that shall be able to
          terminate and avoid restarts on certain errors, by
          indicating so with an unclean exit code. Restart=on-failure
          or Restart=on-abnormal is now the recommended setting for
          all long-running services.

        * If the InaccessibleDirectories= service setting points to a
          mount point (or if there are any submounts contained within
          it), it is now attempted to completely unmount it, to make
          the file systems truly unavailable for the respective
          service.

        * The ReadOnlyDirectories= service setting and
          systemd-nspawn's --read-only parameter are now recursively
          applied to all submounts, too.

        * Mount units may now be created transiently via the bus APIs.

        * The support for SysV and LSB init scripts has been removed
          from the systemd daemon itself. Instead, it is now
          implemented as a generator that creates native systemd units
          from these scripts when needed. This enables us to remove a
          substantial amount of legacy code from PID 1, following the
          fact that many distributions only ship a very small number
          of LSB/SysV init scripts nowadays.

        * Privileged Xen (dom0) domains are not considered
          virtualization anymore by the virtualization detection
          logic. After all, they generally have unrestricted access to
          the hardware and usually are used to manage the unprivileged
          (domU) domains.

        * systemd-tmpfiles gained a new "C" line type, for copying
          files or entire directories.

        * systemd-tmpfiles "m" lines are now fully equivalent to "z"
          lines. So far they have been non-globbing versions of the
          latter, and have thus been redundant. In future it is
          recommended to only use "z"; and "m" has hence been removed
          from the documentation, even though it stays supported.

        * A tmpfiles snippet to recreate the most basic structure in
          /var has been added. This is enough to create the /var/run →
          /run symlink and create a couple of structural
          directories. This allows systems to boot up with an empty or
          volatile /var. Of course, while with this change the core OS
          now is capable of dealing with a volatile /var, not all
          user services are ready for it. However, we hope that sooner
          or later many service daemons will be changed upstream so
          that they are able to automatically create their necessary
          directories in /var at boot, should they be missing. This is
          the first step to allow state-less systems that only require
          the vendor image for /usr to boot.

        * systemd-nspawn has gained a new --tmpfs= switch to mount an
          empty tmpfs instance to a specific directory. This is
          particularly useful for making use of the automatic
          reconstruction of /var (see above), by passing --tmpfs=/var.

        * Access modes specified in tmpfiles snippets may now be
          prefixed with "~", which indicates that they shall be masked
          by whether the existing file or directory is currently
          writable, readable or executable at all. Also, if specified
          the sgid/suid/sticky bits will be masked for all
          non-directories.

        * A new passive target unit "network-pre.target" has been
          added which is useful for services that shall run before any
          network is configured, for example firewall scripts.

        * The "floppy" group that previously owned the /dev/fd*
          devices is no longer used. The "disk" group is now used
          instead. Distributions should probably deprecate usage of
          the "floppy" group.
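
To illustrate the disk-locking entry at the top of this list, here is a
minimal Python sketch of the two sides of the protocol. It is not udev's
code: a temporary regular file stands in for the disk device node, and the
function names claim_device and udev_may_process are hypothetical.

```python
import fcntl
import os
import tempfile


def claim_device(path):
    """Open `path` and take an exclusive lock, as a partitioning tool might."""
    fd = os.open(path, os.O_RDONLY)
    fcntl.flock(fd, fcntl.LOCK_EX)
    return fd


def udev_may_process(path):
    """Mimic the check described above: try a shared, non-blocking lock."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except BlockingIOError:
        return False  # someone holds LOCK_EX, so event handling is skipped
    finally:
        os.close(fd)  # closing the descriptor also drops any lock taken
    return True


if __name__ == "__main__":
    # A temporary regular file stands in for a node such as /dev/sda.
    with tempfile.NamedTemporaryFile() as stand_in:
        owner = claim_device(stand_in.name)
        print(udev_may_process(stand_in.name))  # while the lock is held
        os.close(owner)
        print(udev_may_process(stand_in.name))  # after it is released
```

A real tool would open the device node for writing while it holds the
lock, so that closing it triggers the partition table rescan mentioned in
that entry.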

        Contributions from: Camilo Aguilar, Christian Hesse, Colin Ian
        King, Cristian Rodríguez, Daniel Buch, Dave Reisner, David
        Strauss, Denis Tikhomirov, John, Jonathan Liu, Kay Sievers,
        Lennart Poettering, Mantas Mikulėnas, Mark Eichin, Ronny
        Chevalier, Susant Sahani, Thomas Blume, Thomas Hindoe Paaboel
        Andersen, Tom Gundersen, Umut Tezduyar Lindskog, Zbigniew
        Jędrzejewski-Szmek

        -- Berlin, 2014-06-11
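
To make the "~" prefix for tmpfiles access modes (described in the NEWS
above) a bit more concrete, here is a small Python sketch of the masking
as I read that entry. It is not taken from systemd-tmpfiles, the function
name masked_mode is hypothetical, and the exact rules in the tool may
differ in detail.

```python
import stat


def masked_mode(requested, existing):
    """Apply a "~"-prefixed mode to a file whose st_mode is `existing`."""
    mode = requested
    # Drop each permission class that nobody currently has on the file.
    for cls in (0o111, 0o222, 0o444):  # execute, write, read
        if not existing & cls:
            mode &= ~cls
    # sgid/suid/sticky bits are masked for everything but directories.
    if not stat.S_ISDIR(existing):
        mode &= ~0o7000
    return mode


if __name__ == "__main__":
    regular = stat.S_IFREG | 0o644      # a plain, non-executable file
    directory = stat.S_IFDIR | 0o755    # a directory
    print(oct(masked_mode(0o2755, regular)))
    print(oct(masked_mode(0o2755, directory)))
```

The point of the masking is that one recursive line can cover a mixed
tree without making plain data files executable or setgid.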

Lennart

-- 
Lennart Poettering, Red Hat

