[Spice-devel] [PATCH 12/15] channel: do not free rcc->stream in red_channel_client_disconnect

Jonathon Jongsma jjongsma at redhat.com
Fri Dec 4 12:19:40 PST 2015


On Thu, 2015-12-03 at 16:27 +0000, Frediano Ziglio wrote:
> From: Marc-André Lureau <marcandre.lureau at gmail.com>
> 
> This fixes the following scary racy corruption after glib loop switch:
> 

Note that this valgrind trace is not valid for the current tree. Probably
commits got re-ordered and code got shifted around since this patch was made.
It's still probably useful to have the trace in the log, but we should at least
note that it's out-of-date. Perhaps we should also test whether this is still
reproducible with the current code.

> ==28173==
> ==28173== Debugger has detached.  Valgrind regains control.  We
> continue.
> ==28173== Invalid read of size 8
> ==28173==    at 0x4C7871E: reds_stream_read (reds.c:4521)
> ==28173==    by 0x4C2F9D7: red_peer_receive (red_channel.c:209)
> ==28173==    by 0x4C2FB59: red_peer_handle_incoming (red_channel.c:255)
> ==28173==    by 0x4C2FF36:
> red_channel_client_receive (red_channel.c:329)
> ==28173==    by 0x4C33D6D: red_channel_client_event (red_channel.c:1577)
> ==28173==    by 0x4C65098: watch_func (red_worker.c:10292)
> ==28173==    by 0x504DDAB: g_io_unix_dispatch (giounix.c:167)
> ==28173==    by 0x4FFACB6: g_main_dispatch (gmain.c:3065)
> ==28173==    by 0x4FFBA0D: g_main_context_dispatch (gmain.c:3641)
> ==28173==    by 0x4FFBBFF: g_main_context_iterate (gmain.c:3712)
> ==28173==    by 0x4FFC028: g_main_loop_run (gmain.c:3906)
> ==28173==    by 0x4C6ABF2: red_worker_main (red_worker.c:12180)
> ==28173==  Address 0x7d688e0 is 32 bytes inside a block of size 160
> free'd
> ==28173==    at 0x4A074C4: free (in
> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==28173==    by 0x4C78A84: reds_stream_free (reds.c:4594)
> ==28173==    by 0x4C34E2E:
> red_channel_client_disconnect (red_channel.c:1865)
> ==28173==    by 0x4C30363:
> red_channel_client_default_peer_on_error (red_channel.c:417)
> ==28173==    by 0x4C3011B: red_peer_handle_outgoing (red_channel.c:372)
> ==28173==    by 0x4C3305F: red_channel_client_send (red_channel.c:1298)
> ==28173==    by 0x4C33FB6:
> red_channel_client_begin_send_message (red_channel.c:1616)
> ==28173==    by 0x4C5F4B4:
> display_begin_send_message (red_worker.c:8360)
> ==28173==    by 0x4C61DE8: display_channel_send_item (red_worker.c:9164)
> ==28173==    by 0x4C30C69:
> red_channel_client_send_item (red_channel.c:599)
> ==28173==    by 0x4C332BB: red_channel_client_push (red_channel.c:1351)
> ==28173==    by 0x4C33C3A:
> red_channel_client_handle_message (red_channel.c:1545)
> ---
>  server/red-channel.c | 39 +++++++++++++++++++++------------------
>  server/red-worker.c  |  2 ++
>  2 files changed, 23 insertions(+), 18 deletions(-)
> 
> diff --git a/server/red-channel.c b/server/red-channel.c
> index 2a64bc8..ae16b9c 100644
> --- a/server/red-channel.c
> +++ b/server/red-channel.c
> @@ -1238,22 +1238,25 @@ static void red_channel_client_ref(RedChannelClient
> *rcc)
>  
>  static void red_channel_client_unref(RedChannelClient *rcc)
>  {
> -    if (!--rcc->refs) {
> -        spice_debug("destroy rcc=%p", rcc);
> -        if (rcc->send_data.main.marshaller) {
> -            spice_marshaller_destroy(rcc->send_data.main.marshaller);
> -        }
> +    spice_debug("rcc=%p", rcc);
>  
> -        if (rcc->send_data.urgent.marshaller) {
> -            spice_marshaller_destroy(rcc->send_data.urgent.marshaller);
> -        }
> +    if (--rcc->refs != 0)
> +        return;
>  
> -        red_channel_client_destroy_remote_caps(rcc);
> -        if (rcc->channel) {
> -            red_channel_unref(rcc->channel);
> -        }
> -        free(rcc);
> -    }
> +    reds_stream_free(rcc->stream);
> +    rcc->stream = NULL;
> +
> +    if (rcc->send_data.main.marshaller)
> +        spice_marshaller_destroy(rcc->send_data.main.marshaller);
> +
> +    if (rcc->send_data.urgent.marshaller)
> +        spice_marshaller_destroy(rcc->send_data.urgent.marshaller);
> +
> +    red_channel_client_destroy_remote_caps(rcc);
> +    if (rcc->channel)
> +        red_channel_unref(rcc->channel);
> +
> +    free(rcc);
>  }
>  
>  void red_channel_client_destroy(RedChannelClient *rcc)
> @@ -1758,7 +1761,7 @@ void red_channel_pipes_add_empty_msg(RedChannel
> *channel, int msg_type)
>  int red_channel_client_is_connected(RedChannelClient *rcc)
>  {
>      if (!rcc->dummy) {
> -        return rcc->stream != NULL;
> +        return (rcc->stream != NULL) && ring_item_is_linked(&rcc
> ->channel_link);

This change seems harmless, but it doesn't seem strictly related to the bug
being fixed.

>      } else {
>          return rcc->dummy_connected;
>      }
> @@ -1813,8 +1816,10 @@ static void red_channel_remove_client(RedChannelClient
> *rcc)
>                        rcc->channel->type, rcc->channel->id,
>                        rcc->channel->thread_id, pthread_self());
>      }
> +    spice_return_if_fail(ring_item_is_linked(&rcc->channel_link));
> +
>      ring_remove(&rcc->channel_link);
> -    spice_assert(rcc->channel->clients_num > 0);
> +    spice_return_if_fail(rcc->channel->clients_num > 0);

Same with these.


>      rcc->channel->clients_num--;
>      // TODO: should we set rcc->channel to NULL???
>  }
> @@ -1854,8 +1859,6 @@ void red_channel_client_disconnect(RedChannelClient
> *rcc)
>          rcc->channel->core->watch_remove(rcc->stream->watch);
>          rcc->stream->watch = NULL;
>      }
> -    reds_stream_free(rcc->stream);
> -    rcc->stream = NULL;
>      if (rcc->latency_monitor.timer) {
>          rcc->channel->core->timer_remove(rcc->latency_monitor.timer);
>          rcc->latency_monitor.timer = NULL;

So, it seems that we were freeing the stream in _disconnect() before removing
the watch, and we might still receive one more 'watch' event after freeing the
stream. 

To solve the problem, the stream is not freed until later (in _unref()). This
means that we're likely still getting an 'extra' client event after
disconnecting the client. We won't crash anymore since the stream is still valid
at this point, but it's not clear whether it's a good idea to process that last
event after disconnection or not.

By the way, it seems that this comment (red-channel.c:244) may be related to the
same issue that this patch attempts to solve:

    /* XXX: This needs further investigation as to the underlying cause, it
happened
     * after spicec disconnect (but not with spice-gtk) repeatedly. */
    if (!stream) {
        return;
    }



> diff --git a/server/red-worker.c b/server/red-worker.c
> index 9ec90ce..ba3454e 100644
> --- a/server/red-worker.c
> +++ b/server/red-worker.c
> @@ -606,6 +606,8 @@ static gboolean watch_func(GIOChannel *source,
> GIOCondition condition,
>      SpiceWatch *watch = data;
>      int fd = g_io_channel_unix_get_fd(source);
>  
> +    spice_return_val_if_fail(!g_source_is_destroyed(watch->source), FALSE);
> +

I'm curious about whether this was added out of excessive caution, or whether it
was ever triggered. I guess it probably wasn't triggered since
spice_return_val_if_fail() actually aborts at the moment...


>      watch->func(fd, giocondition_to_spice_event(condition), watch->rcc);
>  
>      return TRUE;


More information about the Spice-devel mailing list