[systemd-devel] PrivateTmp and hugepages

Albert Strasheim fullung at gmail.com
Wed Apr 25 08:02:59 PDT 2012


Hello all

We'd like to launch some processes in a private mount namespace so
that they can each use a limited amount of private hugepages without
running as root.

The idea was to use PrivateTmp=true to get systemd to call unshare for
us and then configure the service with:

PermissionsStartOnly=true
ExecStartPre=/bin/mount -t hugetlbfs none /dev/hugepages -o 'size=2G,pagesize=2M'

The nice thing about this is that you could configure the amount of
hugepage memory a service gets using an EnvironmentFile.
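
A minimal, untested sketch of how that might look. The environment
file path, the variable names HUGEPAGES_SIZE and HUGEPAGES_PAGESIZE,
the user and the ExecStart binary are all made up for illustration,
and it assumes systemd expands ${VAR} in ExecStartPre the same way it
does in ExecStart:

```ini
# /etc/sysconfig/myservice (hypothetical) would contain:
#   HUGEPAGES_SIZE=2G
#   HUGEPAGES_PAGESIZE=2M

[Service]
# Hypothetical unprivileged user the service itself runs as
User=myservice
PrivateTmp=true
# Run ExecStartPre with full privileges, only ExecStart as User=
PermissionsStartOnly=true
EnvironmentFile=/etc/sysconfig/myservice
ExecStartPre=/bin/mount -t hugetlbfs none /dev/hugepages -o size=${HUGEPAGES_SIZE},pagesize=${HUGEPAGES_PAGESIZE}
# Hypothetical service binary
ExecStart=/usr/bin/myservice
```

The options are left unquoted here because they contain no spaces.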

At this point we would also have to set permissions on the hugepages
mount point so that the service's user can read and write files in the
hugepages directory. I don't know whether permission changes to the
mount point directory would be visible outside the mount namespace.
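
One possible way to sidestep a separate chmod/chown step, sketched
under assumptions and untested: hugetlbfs accepts uid=, gid= and mode=
mount options, so ownership could be set at mount time. The user name
and the numeric IDs below are made up. These options should apply only
to the root of the new hugetlbfs instance, not to the underlying
/dev/hugepages directory, but that has not been verified here.

```ini
[Service]
# Hypothetical user; 1000 below stands in for its numeric uid/gid
User=myservice
PrivateTmp=true
PermissionsStartOnly=true
# uid= and gid= are given as numeric IDs, mode= is octal
ExecStartPre=/bin/mount -t hugetlbfs none /dev/hugepages -o size=2G,pagesize=2M,uid=1000,gid=1000,mode=0700
```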

Anyway, we ran into some other issues before we got here:

1. systemd doesn't seem to clean up the /tmp/systemd-namespace-*
directories when a service exits.

2. The operations for setting up PrivateTmp don't seem to work if
systemd is running directly inside an initramfs. We see:

unshare(CLONE_NEWNS) = 0
mount(NULL, "/", NULL, MS_REC|MS_SLAVE, NULL) = 0
mount("/", "/tmp/systemd-namespace-yqotDP/root/", NULL, MS_BIND|MS_REC, NULL) = -1 EINVAL (Invalid argument)

Something seems to go wrong here. Any idea why the bind mount doesn't
like an initramfs root?

This experience has also made me think that systemd could benefit from
a general Unshare= setting so that IPC, network and mount namespaces
can all be controlled for a service.

Any feedback appreciated.

Regards

Albert

