[pulseaudio-discuss] [PATCH] protocol-native: Fix srbchannel double-freeing

Tanu Kaskinen tanu.kaskinen at linux.intel.com
Thu Sep 18 02:40:36 PDT 2014


On Thu, 2014-09-18 at 09:21 +0200, David Henningsson wrote:
> 
> On 2014-09-18 09:08, Tanu Kaskinen wrote:
> > On Tue, 2014-09-16 at 16:41 +0200, David Henningsson wrote:
> >>
> >> On 2014-09-16 13:22, Tanu Kaskinen wrote:
> >>> On Tue, 2014-09-16 at 08:34 +0200, David Henningsson wrote:
> >>>>
> >>>> On 2014-09-14 21:06, Tanu Kaskinen wrote:
> >>>>> I was able to trigger a crash with some unfinished code that caused
> >>>>> a protocol error. The error was handled while
> >>>>> pa_pstream_set_srbchannel() had not yet finished, so
> >>>>> pa_native_connection.srbpending and pa_pstream.srbpending were both
> >>>>> non-NULL. During the error handling, native_connection_unlink() freed
> >>>>> the srbchannel, and then pa_pstream_unlink() tried to free it for the
> >>>>> second time, causing an assertion error in mainloop_io_free().
> >>>>
> >>>> Ok, thanks for finding!
> >>>>
> >>>>> ---
> >>>>>     src/pulsecore/protocol-native.c | 14 +++++++++++++-
> >>>>>     1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/src/pulsecore/protocol-native.c b/src/pulsecore/protocol-native.c
> >>>>> index 6ec65d6..012c41a 100644
> >>>>> --- a/src/pulsecore/protocol-native.c
> >>>>> +++ b/src/pulsecore/protocol-native.c
> >>>>> @@ -2639,13 +2639,25 @@ static void setup_srbchannel(pa_native_connection *c) {
> >>>>>
> >>>>>     static void command_enable_srbchannel(pa_pdispatch *pd, uint32_t command, uint32_t tag, pa_tagstruct *t, void *userdata) {
> >>>>>         pa_native_connection *c = PA_NATIVE_CONNECTION(userdata);
> >>>>> +    pa_srbchannel *tmp;
> >>>>>
> >>>>>         if (tag != (uint32_t) (size_t) c->srbpending)
> >>>>>             protocol_error(c);
> >>>>>
> >>>>>         pa_log_debug("Client enabled srbchannel.");
> >>>>> -    pa_pstream_set_srbchannel(c->pstream, c->srbpending);
> >>>>> +
> >>>>> +    /* We must set c->sbpending to NULL before calling
> >>>>> +     * pa_pstream_set_srbchannel(), because otherwise there is a time window
> >>>>> +     * when both pa_native_connection and pa_pstream think they own the
> >>>>> +     * srbchannel. pa_pstream_set_srbchannel() may read and dispatch arbitrary
> >>>>> +     * commands from the client, and if a protocol error occurs, the connection
> >>>>> +     * is torn down, and if c->sbpending has not yet been set to NULL here, the
> >>>>> +     * srbchannel will be double-freed. XXX: Can we somehow avoid reading and
> >>>>> +     * dispatching commands in pa_pstream_set_srbchannel()? Processing commands
> >>>>> +     * recursively seems prone to bugs. */
> >>>>> +    tmp = c->srbpending;
> >>>>>         c->srbpending = NULL;
> >>>>> +    pa_pstream_set_srbchannel(c->pstream, c->srbpending);
> >>>>
> >>>> You probably mean "tmp" here instead of "c->srbpending"...
> >>>
> >>> *facepalm*
> >>>
> >>>> Anyhow, I think it would be better to try to resolve your concern by
> >>>> implementing a deferred event. Because, as you say, processing commands
> >>>> recursive seems prone to bugs. I e, pa_srbchannel_set_callback would not
> >>>> call srbchannel_rwloop, but instead set up a deferred event which would
> >>>> just call srbchannel_rwloop.
> >>>>
> >>>> Does that make sense?
> >>>
> >>> Hard to say, because it's not clear to me why srbchannel_rwloop() is
> >>> called in the first place. If it's moved to a defer event, there will be
> >>> "stuff" happening between setting the callback and dispatching the defer
> >>> event. It's not clear to me what that "stuff" can include. Can it
> >>> include something that should not be done before srbchannel_rwloop() has
> >>> been called?
> >>
> >> srbchannel_rwloop is called to handle this potential scenario:
> >>
> >>    1) peer A sends srbchannel switch command to peer B
> >>    2) peer A writes something to the srbchannel
> >>    3) peer B receives srbchannel command and sets up the srbchannel
> >>    4) If srbchannel_rwloop is not called by peer B at this point, the
> >> message written in 2) will not be picked up by peer B in a timely fashion.
> >
> > Why is it not picked up in a timely fashion? You have an fd that will
> > normally signal POLLIN when there is data to be read. Is there some
> > reason that doesn't work if there is data to be read already at
> > initialization time?
> 
> As an optimisation, the pa_fdsem layer makes sure that nothing is 
> written to the fd, if nothing is waiting on the other side. And the io 
> event waits for things to be written to the fd.

Ok. I think the patch is good. It would be nice to have a comment
explaining what you explained above, though.

-- 
Tanu



More information about the pulseaudio-discuss mailing list