[PATCH] setproctitle: Do a deep environment copy instead of a shallow one
guillem at hadrons.org
Mon Jun 17 21:56:56 PDT 2013
On Tue, 2013-06-18 at 03:14:47 +0200, Jan Alexander Steffens (heftig) wrote:
> I've been seeing crashes at setproctitle.c:111 when Firefox
> dlopen()s libbsd (through getaddrinfo() and libnss_wins):
> 108 if (eq == NULL)
> 109 continue;
> 111 *eq = '\0'; /* <== SIGSEGV */
> 112 if (setenv(envcopy[i], eq + 1, 1) < 0)
> 113 error = errno;
> 114 *eq = '=';
> At first I believed that the clearenv() freeing all the environment
> strings was the cause, however even with the built-in spt_clearenv(),
> the crash happens. The reason why writing to the environment strings
> is dangerous is still unknown to me.
> Yet, doing a deep copy of the environment block instead of a shallow
> copy seems to paper over the crash, so this patch does that.
While discussing this off-list before the posting, I reached the
conlusion that dlopen()ing from a threaded program or library makes
this completely unreliable and prone to race conditions, because we
cannot guarantee nothing else has changed environ while we are handling
it (for example reallocated it, or any of its pointers to envvars).
Having thought about this, even the proposed patch might be prone to
data loss, as setenv() and family could be modifying and old environ
which has been replaced by our own.
I've been checking for a way to detect if the library has been
dlopen()ed so that the constructor would automatically disable itself,
but I cannot see how. dlinfo(3) looked promising but the lmid is the
same regardless of how we got loaded, and the link_map always lists
the handler as DT_NEEDED and present. Having checked the glibc dlopen(3)
implementation, I only see the __RTLD_DLOPEN flag as something
discriminating the load method, but does not seem to be tracked
afterwards. Anything else would need to rely on internal implementation
details which starts to get extremely ugly.
Something else that popped in my mind was to do an atomic exchange of
environ using a gcc builtin, but setenv() and family use working
pointers to environ while handling it, so it's of no use.
Yet another “possibility” could be to link libbsd.so with
“-Wl,-z,nodlopen” to declare it as incompatible with dlopen()ing, but
that would break the precise scenario this patch tried to fix.
In conclusion, if no one has any bright idea, I think I've to declare
defeat, and accept the fact that setproctitle() cannot be implemented
transparently as a shared library function. The next options could be
to expose a public setproctitle_init() function that needs to be
patched into the code just after entering main() (precisely the reason
I delayed adding any setproctitle() implementation as having to patch
the code kind of defeats the point of libbsd, but oh well), or maybe a
macro that when defined inserts the init function into the .init_array
from a header file. Hmm, or maybe split it out into its own
libbsd-spt.so.0 and mark that one as nodlopen.
I'll try to do a release today/tomorrow with some sort of hotfix, as
I started to get very uncomfortable during the 0.5.2 release given the
implications of a constructor that always messes with the environment,
even if the code ends up not using setproctitle(). And sorry for the
mess this has already caused on the multiple downstream distros. :/
More information about the libbsd