[systemd-devel] BUG: several bugs in core/main.c (v218)

Thu Jan 29 12:54:57 PST 2015

On 01/29/2015 03:16 AM, Zbigniew Jędrzejewski-Szmek wrote:
>>> 2. SIGSEGV should be catched by a handler running on a separate
>>> stack (SA_ONSTACK) - otherwise it can cause segfault itself, when
>>> the first SIGSEGV which triggered the handler is caused by stack
>>> overflow.
>> I'd take a patch for this, but in general: the crash handler is
>> nothing that is supposed to recover your system. All it does its close
>> all fds, fork, and coredump in the child while freezing in the
>> parent. That makes sure that all communication with PID 1 is severed
>> (so that clients talking to it don't hang), we get a clean coredump,
>> but the kernel doesn't oops immediately. That's really all there is to
>> it.
>>
>> So far we never ran into stack overflow issues, hence nobody was ever
>> tempted to cover that case too, and set up a separate stack. In
>> particular since doing this without testing it is pointless...
>>
>> --->
>> Actually, when the program is a critical part of the system, than ALL sig handlers should use separate stack (what in fact makes the life easier) - that's because it's completely unpredictable, whether there's enough stack space to execute the handler - i.e. each sig handler can cause segfault when it's executed at the bottom of the stack (where top/bootom is only a matter of naming convention)
> IIUC, you are saying that we could trigger for example a FPE, and
> then exhaust all the memory in the signal handler. Nah, we can ignore
> this possibility.
No, I'm saying, that the handler can be invoked when there's not enough 
space left on the main stack, so it'll do nothing besides causing sagfault.
>
>> The only effect of SA_NODEFER is that sig mask is not automatically set for a signal which triggered the handler.
>> This means nesting of signals, as the handler can be interrupted and another instance is started for a new signal context.  (and sig handlers are not signal-safe with SA_NODEFER, what means that they have to be fully re-entrant)
>> This also means, that there's completely no warranty that the handler won't be interrupted with external signal -> mess.
> Not really. systemd is single-threaded, so a signal could come only from two places:
> the signal handler could cause a fault, or it could be delivered from another program.
> The first is unlikely, stack handlers do a tiny amount of work, and the second is OK,
> because killing PID 1 is only allowed for privileged processes, and if someone can
> send arbitrary signals to PID 1, they can cause a failure in other ways.
>
> Zbyszek
Yeah, it's unlikely, but it would be safer to at least check the source 
of the signal.
But that's not really a problem, since the handler is not even 
re-entrant (f.e. due to dynamic memory allocation performed trough 
log_xxxx macros -> log_full_errno -> log_internal -> 
journal-send.c::_printf_() -> vasprintf() )

Whatever.

To summarise: there are no bugs in core/main.c:

1. Ignoring return value from sigaction is not a problem, because it's 
almost *unlikely* to fail.
   In any case, kernel will do the job, so it doesn't even matter 
whether the handlers are installed or not -> everything that can be done 
in the handler is to just quit anyway.

2. SIGSEGV handler (and others): it's *unlikely* to happen that the 
handler will be executed at the bottom of the main stack (with 
insufficient stack space or when the stack is already overflowed) - 
because stacks are huge, and in case of segfault see point 1.

3. SA_NODEFER is OK, cause it's *unlikely* that external signal gets 
delivered.  Even if  the crash() handler will crash (f.e. because it's 
not re-entrant), what (again) is *unlikely* to happen, see point 1.

I've learned a lot, sorry to waste Your time.

Regards.