[pulseaudio-discuss] [PATCH] protocol-native: Fix srbchannel double-freeing
David Henningsson
david.henningsson at canonical.com
Thu Sep 18 03:09:47 PDT 2014
On 2014-09-18 11:40, Tanu Kaskinen wrote:
> On Thu, 2014-09-18 at 09:21 +0200, David Henningsson wrote:
>>
>> On 2014-09-18 09:08, Tanu Kaskinen wrote:
>>> On Tue, 2014-09-16 at 16:41 +0200, David Henningsson wrote:
>>>>
>>>> On 2014-09-16 13:22, Tanu Kaskinen wrote:
>>>>> On Tue, 2014-09-16 at 08:34 +0200, David Henningsson wrote:
>>>>>>
>>>>>> On 2014-09-14 21:06, Tanu Kaskinen wrote:
>>>>>>> I was able to trigger a crash with some unfinished code that caused
>>>>>>> a protocol error. The error was handled while
>>>>>>> pa_pstream_set_srbchannel() had not yet finished, so
>>>>>>> pa_native_connection.srbpending and pa_pstream.srbpending were both
>>>>>>> non-NULL. During the error handling, native_connection_unlink() freed
>>>>>>> the srbchannel, and then pa_pstream_unlink() tried to free it for the
>>>>>>> second time, causing an assertion error in mainloop_io_free().
>>>>>>
>>>>>> Ok, thanks for finding!
>>>>>>
>>>>>>> ---
>>>>>>> src/pulsecore/protocol-native.c | 14 +++++++++++++-
>>>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/src/pulsecore/protocol-native.c b/src/pulsecore/protocol-native.c
>>>>>>> index 6ec65d6..012c41a 100644
>>>>>>> --- a/src/pulsecore/protocol-native.c
>>>>>>> +++ b/src/pulsecore/protocol-native.c
>>>>>>> @@ -2639,13 +2639,25 @@ static void setup_srbchannel(pa_native_connection *c) {
>>>>>>>
>>>>>>> static void command_enable_srbchannel(pa_pdispatch *pd, uint32_t command, uint32_t tag, pa_tagstruct *t, void *userdata) {
>>>>>>> pa_native_connection *c = PA_NATIVE_CONNECTION(userdata);
>>>>>>> + pa_srbchannel *tmp;
>>>>>>>
>>>>>>> if (tag != (uint32_t) (size_t) c->srbpending)
>>>>>>> protocol_error(c);
>>>>>>>
>>>>>>> pa_log_debug("Client enabled srbchannel.");
>>>>>>> - pa_pstream_set_srbchannel(c->pstream, c->srbpending);
>>>>>>> +
>>>>>>> + /* We must set c->sbpending to NULL before calling
>>>>>>> + * pa_pstream_set_srbchannel(), because otherwise there is a time window
>>>>>>> + * when both pa_native_connection and pa_pstream think they own the
>>>>>>> + * srbchannel. pa_pstream_set_srbchannel() may read and dispatch arbitrary
>>>>>>> + * commands from the client, and if a protocol error occurs, the connection
>>>>>>> + * is torn down, and if c->sbpending has not yet been set to NULL here, the
>>>>>>> + * srbchannel will be double-freed. XXX: Can we somehow avoid reading and
>>>>>>> + * dispatching commands in pa_pstream_set_srbchannel()? Processing commands
>>>>>>> + * recursively seems prone to bugs. */
>>>>>>> + tmp = c->srbpending;
>>>>>>> c->srbpending = NULL;
>>>>>>> + pa_pstream_set_srbchannel(c->pstream, c->srbpending);
>>>>>>
>>>>>> You probably mean "tmp" here instead of "c->srbpending"...
>>>>>
>>>>> *facepalm*
>>>>>
>>>>>> Anyhow, I think it would be better to try to resolve your concern by
>>>>>> implementing a deferred event. Because, as you say, processing commands
>>>>>> recursive seems prone to bugs. I e, pa_srbchannel_set_callback would not
>>>>>> call srbchannel_rwloop, but instead set up a deferred event which would
>>>>>> just call srbchannel_rwloop.
>>>>>>
>>>>>> Does that make sense?
>>>>>
>>>>> Hard to say, because it's not clear to me why srbchannel_rwloop() is
>>>>> called in the first place. If it's moved to a defer event, there will be
>>>>> "stuff" happening between setting the callback and dispatching the defer
>>>>> event. It's not clear to me what that "stuff" can include. Can it
>>>>> include something that should not be done before srbchannel_rwloop() has
>>>>> been called?
>>>>
>>>> srbchannel_rwloop is called to handle this potential scenario:
>>>>
>>>> 1) peer A sends srbchannel switch command to peer B
>>>> 2) peer A writes something to the srbchannel
>>>> 3) peer B receives srbchannel command and sets up the srbchannel
>>>> 4) If srbchannel_rwloop is not called by peer B at this point, the
>>>> message written in 2) will not be picked up by peer B in a timely fashion.
>>>
>>> Why is it not picked up in a timely fashion? You have an fd that will
>>> normally signal POLLIN when there is data to be read. Is there some
>>> reason that doesn't work if there is data to be read already at
>>> initialization time?
>>
>> As an optimisation, the pa_fdsem layer makes sure that nothing is
>> written to the fd, if nothing is waiting on the other side. And the io
>> event waits for things to be written to the fd.
>
> Ok. I think the patch is good. It would be nice to have a comment
> explaining what you explained above, though.
Thanks, pushed now (after adding a comment).
--
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic
More information about the pulseaudio-discuss
mailing list