[systemd-devel] new user/group population on bootup

Fri Jun 13 09:26:23 PDT 2014

On Fri, 13.06.14 05:36, Colin Walters (walters at verbum.org) wrote:

> Hi,
> 
> I had a quick look at the new:
> http://cgit.freedesktop.org/systemd/systemd/commit/?id=1b99214789101976d6bbf75c351279584b071998
> and followon commits.
> 
> My high level takeaway right now is that this looks OK for nspawn
> containers, but it's not clear to me it's viable or right for the host
> OS, at least for general purpose systems.

Well, we ultimately do try to cover general purpose systems with this,
even though initially container and embedded systems are the
easiest-to-reach goal.

> * Is the vision of an empty /etc right for the host?  It's not clear to
> me...if you look at a typical general-purpose system, there's...lots of
> stuff in there =)  We'd be talking about lots of work across many
> projects.  Particularly stuff like PAM.

Well, PAM I'd like to see fixed in the long run, so that it falls back
to files in /usr/lib/pam.d/ if it finds nothing in /etc/pam.d/ for a
specific service.
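
To illustrate the fallback order described above, here is a minimal
sketch in Python. It is not how PAM is implemented; the function name
find_pam_config and its parameters are hypothetical, and only the two
directory names come from the paragraph above.

```python
from pathlib import Path
from typing import Optional


def find_pam_config(service: str,
                    etc_dir: Path = Path("/etc/pam.d"),
                    vendor_dir: Path = Path("/usr/lib/pam.d")) -> Optional[Path]:
    """Return the admin's file if present, else the vendor default, else None.

    Hypothetical helper: the admin directory is consulted first, so a
    file in /etc/pam.d/ always overrides the vendor copy in /usr.
    """
    for directory in (etc_dir, vendor_dir):
        candidate = directory / service
        if candidate.is_file():
            return candidate
    return None
```

With an empty /etc/pam.d/ the vendor file is used; as soon as the admin
drops a file of the same name into /etc/pam.d/, that one wins.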

But yes, obviously there are currently quite a few things in /etc that
we cannot just drop. Emptying /etc of nonessential bits is a long-term
goal, not a short-term one. For example, all the databases like
/etc/protocols or /etc/services would have to be moved to /usr first.

I am not necessarily suggesting that general purpose distros like Fedora
or Debian come with an empty /etc on installation. However, what I care
about is that /etc *can* be empty. And I believe that this is actually
useful all across the board: from embedded, through mobile, desktop and
servers, to containers.

For desktops this is useful if we want to cater for ChromeOS-like setups
where the OS image needs to be verified, and hence a stateful /etc is
not possible, since it cannot be validated. In that case you either have
to make /etc read-only, or mostly empty and rebuildable on boot. I find
the latter a lot more interesting, since that way we can centralize
vendor data in /usr, boot off /usr, and have everything else
reconstructed automatically on every boot.

For the container case this is useful because we can easily run a
hundred instances of the same "golden" /usr tree, update /usr once, and
get all changes to /etc applied as each container is rebooted. Also, we
want stateless containers that are flushed out on shutdown.

Then, I want this for more minimalist installers that basically consist
of formatting a hard disk, deserializing /usr onto it as one unit, and
then rebooting.

Then, I want this for supporting a "factory reset" switch, as many
mobile and embedded devices (such as Android) have, where the
persistent state is flushed out but the OS itself is kept. This would
mean /etc and /var are formatted, but /usr stays.

I figure what might make sense for general purpose distros like Fedora
is to keep the unmodified vendor /etc in a read-only way in
/usr/share/etc, but keep installing it also onto /etc. That way, things
will look initially like they always did (but admins get the ability to
diff the vendor configuration with their own one). And then, the people
that are interested in the "factory reset" logic can reset /etc and /var
as they want. When they do that, they will not end up with the exact 1:1
original /etc, but a very minimal one. However, they can also return to
the full version again by doing "cp -an /usr/share/etc/ /etc/".
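
The intent of that no-clobber copy (fill in whatever is missing from
/etc, never overwrite what the admin already has) can be sketched in
Python as follows. This is an illustration only: restore_missing is a
hypothetical helper, and unlike "cp -a" it makes no attempt to preserve
file ownership.

```python
import shutil
from pathlib import Path
from typing import List


def restore_missing(vendor_dir: Path, etc_dir: Path) -> List[Path]:
    """Copy entries that exist under vendor_dir but not under etc_dir.

    Existing files in etc_dir are left untouched, so local changes
    survive. Returns the list of paths that were created.
    """
    restored = []
    # Sorted order visits a directory before the entries inside it.
    for src in sorted(vendor_dir.rglob("*")):
        dst = etc_dir / src.relative_to(vendor_dir)
        if dst.exists() or dst.is_symlink():
            continue  # never clobber what is already there
        if src.is_dir() and not src.is_symlink():
            dst.mkdir(parents=True)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            # Copies contents and timestamps; symlinks stay symlinks.
            shutil.copy2(src, dst, follow_symlinks=False)
        restored.append(dst)
    return restored
```

The returned list also gives a quick overview of which vendor files had
been missing from /etc.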

But even that is quite a bit off, as this would require changes in RPM
to a) implicitly keep the original config files around in
/usr/share/etc, and b) to be able to reconstruct the RPM database from
data in /usr only -- at least if rpm shall remain usable after such a
factory reset.

But anyway, there's no doubt that all of this will take a long time to
be finished across the board, and I have no doubt that many packages
will be unlikely to work with this scheme any time soon. But I also
believe we need to start with this one day, and that for the important
core packages we are pretty close to making this a reality. And I am
also sure that for the interesting packages, people will pressure
packagers to adapt to this scheme quickly, as soon as they notice how
awesome things like the "golden /usr" multi-instantiation container
setup actually are.

> What OSTree does is a "rebase" of /etc across updates.  It takes the
> *new default* /etc from /usr/etc (could be /usr/share/etc), and then
> applies your *changes* on top (on a whole-file basis, it doesn't attempt
> semantic merging). The essential property here is that you get new
> default config files.  Even if we're headed towards cleaning /etc,
> there's still going to be software that adds new files there for a while
> - a solution for a general-purpose OS is going to have to handle this.

Yes, I can agree that there are benefits if a replication results in the
exact same /etc as an RPM install. However, I also think that it would
be a lot cleaner if /etc was minimal in the end.
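
For readers unfamiliar with the whole-file "rebase" described in the
quoted paragraph, a rough sketch in Python follows. It is not OSTree's
code; rebase_etc is a hypothetical function that models each tree as a
dict mapping path to file content. Treating a locally deleted file as a
change that is carried forward is an assumption of this sketch.

```python
from typing import Dict


def rebase_etc(old_default: Dict[str, str],
               new_default: Dict[str, str],
               current: Dict[str, str]) -> Dict[str, str]:
    """Start from the new defaults, then re-apply local whole-file changes."""
    merged = dict(new_default)
    for path in set(old_default) | set(current):
        if current.get(path) == old_default.get(path):
            continue  # untouched locally: the new default decides
        if path in current:
            merged[path] = current[path]  # locally modified or added
        else:
            merged.pop(path, None)  # locally removed (assumption, see above)
    return merged
```

Files the admin never touched pick up the new defaults, new vendor
files appear, and locally changed files win over the vendor copy
without any attempt at merging their contents.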

> Are you thinking of removing the config files in /etc/systemd/ like
> logind.conf?

Depends on what you mean.

They will be there in the RPM and be installed by default if you do an
RPM install, as always.

However, if you do a "golden /usr" install, or do a "factory reset", then
they will not be recreated, as they are actually unnecessary, and logind
will just use the defaults if they are missing.
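
The "use the defaults if the file is missing" behaviour can be sketched
with Python's configparser. This is only an illustration, not logind's
parser: load_settings is a hypothetical helper, and the caller supplies
the table of built-in defaults.

```python
import configparser
from typing import Dict


def load_settings(path: str,
                  defaults: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return built-in defaults, overridden by the config file if it exists."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. "NAutoVTs"
    parser.read_dict(defaults)
    # ConfigParser.read() silently skips files that do not exist, so an
    # absent config file simply leaves the defaults in place.
    parser.read(path)
    return {section: dict(parser[section]) for section in parser.sections()}
```

Called with a path that does not exist, this returns exactly the
defaults that were passed in; a file that sets only one key overrides
just that key.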

I think having this distinction is actually OK, as a "toolbox"-style OS
like Fedora should probably come with configuration files in place,
because it traditionally is an OS admins love to configure the hell out
of. However, for the "golden /usr", "factory reset" or "verified /usr"
cases, these files are really unnecessary, and hence there is no need to
replicate them.

> * Static versus dynamic UID allocation: OSTree currently replicates
> numeric uid/gids, and so do other "image-like" update systems like
> Omaha or "btrfs send".  The bits committed to systemd use dynamic
> allocation.  Now, that's OK as long as the service doesn't have files
> owned by that uid/gid in /usr or /etc.

Actually, the bits we committed know three kinds of allocation:

1) fully dynamic, by automatically finding the first free UID/GID and
   using that. With the default sysusers snippet we install this is done
   for most users/groups.

2) fully static, a fixed UID/GID is specified for a system user/group to
   be created. Our default sysusers snippet does this for "nobody" and
   "tty". The latter is necessary because we need to know the GID very
   early on when mounting /dev/pts, and we can't do NSS that early.

3) based on stat() data of a specified file. The UID/GID is read from a
   specified file, and if otherwise unused is used for the new user.

The last one should solve the problem you describe, as long as there's a
file in /usr owned by the user to create.

Also note that #3 and #2 gracefully degrade to #1 should the UID/GID
already be in use.
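
The three modes and the fallback from #2 and #3 to #1 can be
illustrated with a small Python sketch. This is not the sysusers code:
allocate_id, its parameters, and the 1..999 search range are
hypothetical, and the set of IDs already in use is passed in by the
caller rather than read from /etc/passwd.

```python
import os
from typing import Optional, Set


def allocate_id(used: Set[int],
                static_id: Optional[int] = None,
                owner_of: Optional[str] = None,
                lo: int = 1, hi: int = 999) -> int:
    """Pick a UID following the three modes; records it in `used`.

    static_id: mode 2, a fixed ID requested by the snippet.
    owner_of:  mode 3, path whose owner (from stat()) should be reused.
    neither:   mode 1, first free ID in the range lo..hi (assumed range).
    """
    preferred = static_id
    if owner_of is not None:
        preferred = os.stat(owner_of).st_uid
    if preferred is not None and preferred not in used:
        chosen = preferred
    else:
        # Mode 1, also the fallback when the preferred ID is taken.
        for candidate in range(lo, hi + 1):
            if candidate not in used:
                chosen = candidate
                break
        else:
            raise RuntimeError("no free ID left in range")
    used.add(chosen)
    return chosen
```

A request for a static ID that is already taken does not fail here; it
quietly receives the first free ID instead, which is the graceful
degradation mentioned above.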

> polkit is an example of software that is currently dynamically
> allocated, and has /etc/polkit-1/rules.d owned by that user.  We should
> think about whether anyone doing FS-level updates needs to somehow
> integrate with systemd's sysusers to chown() files, or whether we need
> to ban files in /etc and /usr owned by dynamic uids.  The discussion on
> fedora-devel-list was here:
>
> https://lists.fedoraproject.org/pipermail/devel/2014-April/197819.html

Well, I think files owned by users in /etc are not really a problem,
since /etc also contains /etc/passwd. Hence, when somebody can create a
file or directory in /etc and wants it owned by some user, they can
also add that user first with the right uid.

I also don't think files owned by users in /var are a problem. Flushing
/etc without also flushing /var seems a useless exercise, while the
opposite (i.e. flushing /var without /etc) makes a lot more sense. But
in that case you have to recreate /var anyway, hence you can pick any
uid/gid of your choice for that, so we have no problem.

Which leaves files/directories in /usr owned by some uid/gid. And for
that case we have the syntax mentioned in #3 above, where we can read the
uid/gid to use from the owner of an existing file in the fs.

I hope that makes sense?

Lennart

-- 
Lennart Poettering, Red Hat