[PATCH wayland v4 08/11] client: Replace the singleton zombie with bespoke zombies

Simon McVittie smcv at collabora.com
Wed Jan 10 17:47:17 UTC 2018


On Wed, 10 Jan 2018 at 11:03:03 -0600, Derek Foreman wrote:
> I suspect 100% of the software I work on on a daily basis will explode in
> completely unpredictable and undiagnosable ways in response to a malloc()
> failure anyway

Does anyone test these code paths in Wayland? If so, how? (Genuine
questions, I'm interested in the answers.)

I ask because the original authors of libdbus wrote it thinking that
they had handled OOM conditions, at significant complexity cost,
then later added infrastructure to simulate malloc() failures during
automated testing and discovered that a significant fraction of them
were mishandled (Havoc estimates "at least 5%" in [1]). Next month that
test infrastructure will be 15 years old, and I'm *still* semi-regularly
finding bugs in pre-existing code where malloc() failures are mishandled.
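
As a rough illustration of what simulating malloc() failures in a test
can look like, here is a minimal sketch. It is not libdbus's actual
infrastructure: test_malloc(), fail_after, struct pair, pair_new() and
exercise_oom_paths() are all hypothetical names invented for this example.
The idea is to re-run the code under test once per allocation site, forcing
a different allocation to fail each time.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical test hook: number of allocations that succeed before one
 * simulated failure. Negative means "never fail". */
static int fail_after = -1;

static void *
test_malloc(size_t size)
{
	if (fail_after == 0) {
		fail_after = -1;	/* fail exactly once, then behave normally */
		return NULL;
	}
	if (fail_after > 0)
		fail_after--;
	return malloc(size);
}

/* Hypothetical code under test: three allocations, each of which can fail. */
struct pair {
	char *a;
	char *b;
};

static void
pair_free(struct pair *p)
{
	if (p == NULL)
		return;
	free(p->a);
	free(p->b);
	free(p);
}

static struct pair *
pair_new(const char *a, const char *b)
{
	struct pair *p = test_malloc(sizeof *p);

	if (p == NULL)
		return NULL;
	p->a = test_malloc(strlen(a) + 1);
	p->b = test_malloc(strlen(b) + 1);
	if (p->a == NULL || p->b == NULL) {
		pair_free(p);	/* the error path that normally never runs */
		return NULL;
	}
	strcpy(p->a, a);
	strcpy(p->b, b);
	return p;
}

/* Fail allocation 0, then 1, then 2, ... until the call completes without
 * the injected failure firing. Returns how many failure points were
 * exercised, or -1 if a failure was silently ignored. */
static int
exercise_oom_paths(void)
{
	int n;

	for (n = 0; ; n++) {
		struct pair *p;
		int fired;

		fail_after = n;
		p = pair_new("x", "y");
		fired = (fail_after < 0);
		fail_after = -1;

		if (!fired) {		/* ran out of allocations to fail */
			pair_free(p);
			return n;
		}
		if (p != NULL) {	/* failure fired but was not reported */
			pair_free(p);
			return -1;
		}
	}
}
```

A loop like this only checks that each failure is reported; catching leaks
or crashes on the error paths additionally needs something like valgrind or
a sanitizer running over the same test.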

If those code paths are never tested (and therefore probably don't work),
you might find that you get more reliable software by declining to handle
malloc() failures (usually done with an "xmalloc()" wrapper that either
returns non-NULL or aborts), and as a consequence not exposing users to
bugs in code that only exists to be able to cope with malloc() failures.
Handling OOM is great if you can consistently write completely bug-free
code, but there is considerable anecdotal evidence that we can't.
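
For reference, a minimal sketch of the "xmalloc()" pattern, assuming the
abort-on-failure variant. The names xmalloc() and xstrdup() are
conventional but are defined here purely for illustration; this is not
taken from Wayland or libdbus.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a non-NULL pointer or aborts; never returns NULL. */
static void *
xmalloc(size_t size)
{
	/* malloc(0) is allowed to return NULL, so ask for at least 1 byte */
	void *p = malloc(size ? size : 1);

	if (p == NULL) {
		fprintf(stderr, "out of memory allocating %zu bytes\n", size);
		abort();
	}
	return p;
}

/* Callers have no OOM branch to write, and therefore none to get wrong. */
static char *
xstrdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = xmalloc(len);

	memcpy(copy, s, len);
	return copy;
}
```

The trade-off is that every caller gives up the ability to recover, in
exchange for having no untested recovery code at all.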

libdbus still claims to cope with malloc() failures, because it has
become policy (and because system components that must not crash, like
Upstart, systemd and dbus-daemon, were meant to be able to rely on it);
but even for libdbus, and even with automated testing to catch some of
the failures, I'm far from convinced that it was a net positive to do so.

[1] https://blog.ometer.com/2008/02/04/out-of-memory-handling-d-bus-experience/

Regards,
    smcv

