[systemd-devel] [PATCH 2/3] nspawn: use Barrier API instead of eventfd-util

David Herrmann dh.herrmann at gmail.com
Thu Jul 17 02:30:26 PDT 2014


Hi

On Mon, Jul 14, 2014 at 3:28 AM, Djalal Harouni <tixxdz at opendz.org> wrote:
> ppoll is atomic and it is handled by the kernel, so perhaps
> setting/restoring sigmask can be done easily! and for nspawn: IMO we need
> to receive SIGCHLD which implies EINTR.
>
> I say EINTR since not only for blocking read or infinite poll, but
> perhaps for all the other functions that the parent may do to setup the
> environment of the container, currently nspawn will set network
> interfaces before moving them into the container, it will also register
> the machine, and perhaps other operations...
>
> So having EINTR errors is useful here not only for direct reads, but for
> all the other calls that might block! IOW I think that nspawn should
> have an empty sig handler for SIGCHLD.
>
> Barrier reads already use poll and pipe to handle remote abortion since
> it can *not* be done by eventfd, yes this is perfect but for nspawn we
> can also achieve the same by combining eventfd and SIGCHLD!
>
> What do you think if we make Barrier use:
> eventfd+pipe and/or eventfd+SIGCHLD ?
>
> Most complex fork/clone code should receive SIGCHLD, and think about
> nspawn! we do want it to be as lightweight as possible, having 4 fds by
> default (2 eventfd + heavy pipe) may hit some resource limits quickly!
>
> compared to: 2 eventfd + empty sig handler!

My first attempt was to use an edge-triggered signalfd on SIGCHLD. If
I don't read from the signalfd and only use it to wake up and call
waitid(WNOWAIT), I won't interfere with other signalfds. However, this
wasn't really more lightweight than the pipe method, so I ditched it.
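
For reference, a minimal sketch of that signalfd idea, not code from the
patch. The helper name peek_child_exit() and the 5 second timeout are
made up for illustration. It blocks SIGCHLD, watches a signalfd through
edge-triggered epoll, never reads it, and peeks at the child with
waitid(WNOWAIT):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, wake up via an
 * edge-triggered signalfd without reading it, and peek at the child with
 * waitid(WNOWAIT). Returns the exit code seen by the peek, or -1 on error. */
int peek_child_exit(int code) {
        struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
        sigset_t mask, old;
        siginfo_t si = { 0 };
        int sfd = -1, efd = -1, r = -1, n;
        pid_t pid;

        assert(code >= 0 && code <= 255);

        /* SIGCHLD must be blocked, otherwise it never reaches the signalfd. */
        sigemptyset(&mask);
        sigaddset(&mask, SIGCHLD);
        if (sigprocmask(SIG_BLOCK, &mask, &old) < 0)
                return -1;

        sfd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
        efd = epoll_create1(EPOLL_CLOEXEC);
        if (sfd < 0 || efd < 0 || epoll_ctl(efd, EPOLL_CTL_ADD, sfd, &ev) < 0)
                goto finish;

        pid = fork();
        if (pid < 0)
                goto finish;
        if (pid == 0)
                _exit(code);

        /* Wake up on SIGCHLD. The signalfd is deliberately never read. */
        do {
                n = epoll_wait(efd, &ev, 1, 5000);
        } while (n < 0 && errno == EINTR);

        /* Peek: WNOWAIT leaves the child in a waitable state. */
        if (n == 1 &&
            waitid(P_PID, pid, &si, WEXITED | WNOHANG | WNOWAIT) == 0 &&
            si.si_pid == pid && si.si_code == CLD_EXITED)
                r = si.si_status;

        /* Reap for real so the sketch leaves no zombie behind. */
        while (waitid(P_PID, pid, &si, WEXITED) < 0 && errno == EINTR)
                ;

finish:
        if (efd >= 0)
                close(efd);
        if (sfd >= 0)
                close(sfd);
        sigprocmask(SIG_SETMASK, &old, NULL);
        return r;
}
```

That is three fd-related syscalls plus the sigmask juggling, which is
the reason it is no lighter than a single pipe.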

Regarding dropping the pipe: pipe2() is _really_ fast. I mean, we're
fork()ing and running thousands of syscalls just during container
setup. I cannot see how dropping one cheap pipe2() call is beneficial
here. We also destroy the pipe before running the real container, so
it only exists during setup.
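
A rough sketch of the pipe method, only to illustrate the principle; it
is not the Barrier implementation. The names wait_for_peer() and
sync_with_child() and the 5 second timeout are assumptions. The parent
polls an eventfd for the "ready" signal and the read end of a pipe for
hangup, which the kernel raises once the child's write end is closed,
for example because the child died:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { PEER_ABORTED = 0, PEER_READY = 1 };

/* Wait until the peer signals the eventfd or hangs up the pipe. */
static int wait_for_peer(int efd, int pipe_rd) {
        struct pollfd pfd[2] = {
                { .fd = efd, .events = POLLIN },
                { .fd = pipe_rd, .events = 0 }, /* POLLHUP is reported regardless */
        };
        uint64_t v;
        int n;

        assert(efd >= 0 && pipe_rd >= 0);

        do {
                n = poll(pfd, 2, 5000);
        } while (n < 0 && errno == EINTR);
        if (n <= 0)
                return -1; /* error or timeout */

        /* Check the eventfd first: a peer that signalled and then exited is fine. */
        if (pfd[0].revents & POLLIN)
                return read(efd, &v, sizeof(v)) == (ssize_t) sizeof(v) ? PEER_READY : -1;
        if (pfd[1].revents & (POLLHUP | POLLERR))
                return PEER_ABORTED;
        return -1;
}

/* Hypothetical demo: the child either signals the eventfd or just exits. */
int sync_with_child(int child_signals) {
        int p[2], efd, r;
        pid_t pid;

        efd = eventfd(0, EFD_CLOEXEC);
        if (efd < 0)
                return -1;
        if (pipe2(p, O_CLOEXEC) < 0) {
                close(efd);
                return -1;
        }

        pid = fork();
        if (pid < 0) {
                close(efd);
                close(p[0]);
                close(p[1]);
                return -1;
        }
        if (pid == 0) {
                uint64_t one = 1;

                close(p[0]);
                if (child_signals && write(efd, &one, sizeof(one)) != (ssize_t) sizeof(one))
                        _exit(1);
                _exit(0); /* exiting closes p[1], so the parent sees POLLHUP */
        }

        close(p[1]); /* the parent must drop its copy of the write end, or no hangup arrives */
        r = wait_for_peer(efd, p[0]);
        while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
                ;
        close(p[0]);
        close(efd);
        return r;
}
```

With this, sync_with_child(1) should report PEER_READY and
sync_with_child(0) should report PEER_ABORTED, without any signal
handler being involved.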

> And it seems from the patch you are not checking barrier_place() return
> code, if the remote aborted ?

That's fine. Abortions are remembered and the later barrier_sync()
call will return immediately.
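
To make the "remembered" part concrete, here is a toy model of the
sticky-abort behaviour. ToyBarrier and the toy_barrier_*() functions are
hypothetical stand-ins, not the real API, and the peer_alive parameter
replaces the actual I/O:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state only; the real Barrier tracks more than this. */
typedef struct ToyBarrier {
        bool aborted;    /* sticky: set once, never cleared */
        unsigned placed; /* barriers placed locally */
} ToyBarrier;

/* Hypothetical stand-in for barrier_place(); `peer_alive` replaces real I/O. */
bool toy_barrier_place(ToyBarrier *b, bool peer_alive) {
        assert(b);

        if (!peer_alive)
                b->aborted = true; /* remember the abort for later calls */
        if (b->aborted)
                return false;

        b->placed++;
        return true;
}

/* Hypothetical stand-in for barrier_sync(): fails at once after an abort. */
bool toy_barrier_sync(ToyBarrier *b) {
        assert(b);

        if (b->aborted)
                return false; /* no waiting, the abort was already recorded */
        return b->placed > 0;
}
```

A caller can therefore ignore the result of the place call and check
only the sync call, since the recorded abort makes the sync fail
without blocking.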

> Thanks for the patches, sure the API is really nice, I'll try to comment
> on #1

Thanks!
David

