[systemd-devel] Unable to mask /proc using currently available options (InaccessiblePaths...)
Lennart Poettering
lennart at poettering.net
Mon Apr 17 10:50:38 UTC 2017
On Wed, 12.04.17 18:27, Timothée Ravier (siosm99 at gmail.com) wrote:
> Hi,
>
> I would like to make the /proc directory inaccessible for some services.
> Unfortunately, adding the InaccessiblePaths=/proc option to a service unit will
> not work.
Hmm, what precisely do you intend to make unavailable here? Note that
/proc/self/ is kinda normal process API on Linux, as are some other
files, and a variety of calls (including ones defined in glibc) assume
that /proc is available, at least for read access.
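To illustrate that dependency, here is a minimal sketch (not glibc or
systemd code; self_exe_path() is a hypothetical name chosen for this
example). A program that looks up its own binary through /proc/self/exe
gets an error back as soon as /proc is masked:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: resolve the running executable via /proc/self/exe.
 * Returns 0 on success, or a negative errno value if the lookup fails,
 * for example because /proc is not available in this namespace. */
int self_exe_path(char *buf, size_t size) {
    ssize_t n;

    if (size == 0)
        return -EINVAL;

    /* readlink() does not NUL-terminate, so leave room for the terminator. */
    n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -errno;

    buf[n] = '\0';
    return 0;
}
```

Code that follows this pattern without checking the return value is the
kind of thing that tends to break once /proc is made inaccessible.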
It definitely makes sense to restrict /proc
somewhat. ProtectKernelTunables= will make /proc/sys read-only, for
example, and there's work in progress to permit the kernel's hidepid
procfs mount option to be settable per mount point so that we can
expose it per-service in systemd, but I am not sure it is really
desirable to completely disable it, at least at a service level. It
might make sense to restrict it in even more restricted sandboxes
(for example, a web browser might restrict this if it uses per-page
renderer process sandboxes).
That all said, even if I don't see the great benefit of blocking the
entirety of /proc for a service, I'm still willing to merge changes to
make this work, if this helps you.
> With systemd v233, during the filesystem layout setup for the new service, an
> empty directory will be mounted on top of /proc first (in core:namespace.c:
> setup_namespace(): apply_mount()) and then mount points will be turned readonly
> (in core:namespace.c: setup_namespace(): make_read_only()), using
> /proc/self/mountinfo, which is now unavailable. Thus this step will fail.
Maybe we can find a somewhat clean fall-back for this, when /proc is
not around?
Or maybe we slightly alter the logic here, and open
/proc/self/mountinfo before we rearrange the directories, and then
always only read from the already opened fd, and do not refer to the
actual file system anymore? I figure that would mean adding a version
of bind_remount_recursive() that takes a FILE* or so of
/proc/self/mountinfo as additional parameter, and then seeks to the
beginning before reading off it, if you follow what I mean? I think
this approach would be the nicest one.
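A minimal sketch of that idea, under stated assumptions: this is not
the actual systemd code, and count_mounts_below() and path_below() are
hypothetical names. The function never opens /proc itself. It only
reads from a stream the caller opened earlier, and rewinds it first so
that each call sees the whole table:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if `path` equals `prefix` or lies below it. */
static int path_below(const char *path, const char *prefix) {
    size_t n = strlen(prefix);

    if (strcmp(prefix, "/") == 0)
        return path[0] == '/';

    return strncmp(path, prefix, n) == 0 &&
           (path[n] == '\0' || path[n] == '/');
}

/* Hypothetical helper: count the mount points at or below `prefix`,
 * reading only from an already opened mountinfo stream. */
int count_mounts_below(FILE *mountinfo, const char *prefix) {
    char line[4096];
    char path[4096];
    int count = 0;

    /* Seek back to the beginning before reading, as described above. */
    rewind(mountinfo);

    while (fgets(line, sizeof line, mountinfo)) {
        /* mountinfo fields: mount ID, parent ID, major:minor, root,
         * mount point, ... We only need the fifth one. */
        if (sscanf(line, "%*d %*d %*s %*s %4095s", path) != 1)
            continue;

        if (path_below(path, prefix))
            count++;
    }

    return count;
}
```

The caller would fopen() /proc/self/mountinfo before any directories
are rearranged and pass the resulting FILE* down, so that nothing
refers to the /proc path by name afterwards.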
> With systemd v233, it is possible to work around this issue leaving only a single
> /proc/self/mountinfo file available using this hack:
>
> $ umask 0277
> $ mkdir -p /.proc/self
> $ touch /.proc/self/mountinfo
>
> And in the unit:
>
> BindReadOnlyPaths=/.proc:/proc /proc/self/mountinfo:/.proc/self/mountinfo
>
> But this is not really pretty.
>
> I would like your opinion on the following suggestions before writing code:
> * Should I extend the MountVFSAPI option to support the case where the
> RootImage and RootDirectory options are not set?
How precisely would you alter the effect of MountVFSAPI= here?
> * Should I add a special HideProc option to support hiding /proc for
> conventional services?
As above, I'd prefer not to add this. I am not against making what you
want to do work, but I am not convinced that adding first-class
config options for it would be a good idea, since systemd after all is
a service manager and hence we should focus on making things easy that
match the service use case, but not more.
Or in other words: making InaccessiblePaths=/proc work sounds
preferable to me.
> As a side note, debug logs in core/namespace.c are non functional. A call to
> log_open() appears to be missing.
Yupp, this is known. But opening fds comes with other issues (in
particular because seccomp and other security systems would need
preparation to permit that), hence currently we just keep the code in
there, and it is normally a NOP, unless you hack around and turn it on
manually by adding a log_open() call to your local build.
Lennart
--
Lennart Poettering, Red Hat