[pulseaudio-discuss] Design constraints for per-client mempools

David Henningsson david.henningsson at canonical.com
Tue Feb 23 10:06:16 UTC 2016



On 2016-02-23 06:55, Ahmed S. Danish wrote:
> Hello everyone,
>
> As you probably know by now, the second patch in the memfd series
> was created to transform the global, shared by all clients,
> srbchannel mempool to a "per-client" one. [1]
>
> Reason of this transition is to avoid data leaks between clients.
> In a follow-up patch series, it's even hoped that the global core
> mempool will also be transformed to the per-client model. [2]
>
>
> ==> Current problem:
> --------------------
>
> By injecting faults at the system-call level, it was discovered
> that current design of per-client mempools leads to a reproducible
> memory fault on certain error-recovery paths.
>
> The fault is caused by a design issue, rather than a small coding
> error.
>
>
> ==> Why are you dedicating a full mail to this?
> -----------------------------------------------
>
> The problematic issue discovered implies that a bigger design
> change is required; your input is very much needed :-)
>
>
> ==> What was the current design?
> --------------------------------
>
> To do the per-client mempool transformation, a simple decision
> was taken: let 'pa_native_connection' own the per-client mempool
> in question.
>
> Why? Because when a client connects, a 'native connection' object
> is created for it by default. Inside this object is the pstream
> pipe and other per client resources. So this seemed like the best
> place for a per-client mempool:
>
>    /* protocol-native.c - Called for each client connection */
>    void pa_native_protocol_connect(...) {
>        pa_native_connection *c;
>
>        ...
>        c = pa_msgobject_new(pa_native_connection);
>        c->pstream = pa_pstream_new(...);
>        c->mempool = pa_mempool_new(...);    // per-client mempool
>        ...
>    }
>
> And 'naturally', the pool should be deallocated when the
> connection closes:
>
>    /* protocol-native.c */
>    native_connection_free(pa_object *o) {
>        ...
>        pa_pdispatch_unref(c->pdispatch);
>        pa_pstream_unref(c->pstream);
>
>        // Srbchannels allocates from this per-client pool,
>        // so free it only after the pstream is freed
>        pa_mempool_free(c->mempool);
>        ...
>    }
>
> All looked fine and dandy ..
>
>
> ==> And where is the problem exactly?
> -------------------------------------
>
> By injecting failures in sendmsg(2), thus reaching up to the
> iochannel and pstream layers, the following leak is shown:
>
>    E: Memory pool destroyed but not all mem blocks freed! 1 remain
>
> and a segfault is then produced due to a use-after-free operation
> on the leaked memblock.
>
> The sequence of failure, bottom-up, is as follows:
>
>    -> sendmsg(2) failes -EACCES
>      -> pa_iochannel_write_with_creds() return failure
>        -> pstream do_write() fails
>          -> do_pstream_read_write() fails, jumps to error recovery
>
> And the error recovery goes as this:
>
>    -> pstream die callback (p->callback) is called
>    -> That's protocol-native.c pstream_die_callback()
>      -> native_connection_unlink    // Stops srbchannel, etc.
>        -> pa_pstream_unlink         // Reset all pstream callbacks
>        -> native_connection_free    // Per-client mempool _freed!_
>    -> pa_pstream_unlink             // Second time, does nothing..
>    -> pa_pstram_free
>      -> pa_queue_free(p->send_queue, item_free)
>        -> 1. PACKET item found .. safely removed
>        -> 2. MEMBLOCK item found .. from freed srbchannel mempool
>              BOOM!!! segfault
>
> SUMMARY : As seen above, a per-client mempool's lifetime must be
> a superset of the pstream's lifetime. Putting the per-client pool
> in the native_connection object provided only an _illusion_ of
> such a superset. [*]
>
> During error recovery, stale memblocks remain in pstrem's send
> queue long after __all the per-client objects__ has been removed.
>
> [*] Check native_connection_free() under "What was the current
>      design?" for why this illusion was maintained.
>
>
> ==> Suggested solutions?
> ------------------------
>
> The situation a is bit tricky.
>
> Where can we put the per-client pool while insuring a correct
> lifecycle – especially during error recovery?
>
> The safest option, it seems, is to let the per-client pool be
> owned by the pstream object, thus the pool can be freed at the
> very end of pstream_free() itself. Is such a solution acceptable?

Having read your detailed problem description and Tanu's answers, I 
think making mempools reference counted seems like the safest option.

 From a quick glance it does not look like mempools own anything 
reference counted (right?), so we shouldn't have a problem with cycles 
if we did that refcount.

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic


More information about the pulseaudio-discuss mailing list