[Spice-devel] [PATCH spice-server 5/8] stream-device: Workaround Qemu bug closing device

Christophe Fergeau cfergeau at redhat.com
Fri Feb 23 16:06:11 UTC 2018


On Sun, Feb 18, 2018 at 06:58:11PM +0000, Frediano Ziglio wrote:
> Previous patch causes a bug into Qemu if the patch

"in Qemu" rather than "into"

> 46764fe09ca2e0f15c0981a672c166ed8cf57e72 ("virtio-serial: fix segfault
> on disconnect") is not include in that version of Qemu.

"included". I'd add a mention that this patch is present in Qemu 2.10.0
and newer.

> This crash happens when device is closed inside a write operation.

"when the device is closed during a write operation"

> For SPICE character device a spice_server_char_device_wakeup is called

"For SPICE character devices, spice_server_char_device_wakeup() is
called ..."

> to write data which handles both read and write pending operations.
> As we want to close the device but we can't do it inside the handler

"we can't do it inside the handler without causing a crash, this commit
schedules a timer ..."

> operation schedule a timer that will close the guest device outside
> this callback.
> The proper solution would be to patch Qemu but making sure this
> is not so easy so this workaround patch.

"making sure of this is not so easy, hence this workaround in
spice-server"

> Code is marked with some comments to remember to remove this
> hack in a safe future.
> 
> Signed-off-by: Frediano Ziglio <fziglio at redhat.com>
> ---
>  server/stream-device.c | 34 ++++++++++++++++++++++++++++++----
>  1 file changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/server/stream-device.c b/server/stream-device.c
> index 688a4e07..170c8637 100644
> --- a/server/stream-device.c
> +++ b/server/stream-device.c
> @@ -57,6 +57,7 @@ struct StreamDevice {
>      bool flow_stopped;
>      StreamChannel *stream_channel;
>      CursorChannel *cursor_channel;
> +    SpiceTimer *close_timer;
>  };
>  
>  struct StreamDeviceClass {
> @@ -68,6 +69,8 @@ static StreamDevice *stream_device_new(SpiceCharDeviceInstance *sin, RedsState *
>  
>  G_DEFINE_TYPE(StreamDevice, stream_device, RED_TYPE_CHAR_DEVICE)
>  
> +static void char_device_set_state(RedCharDevice *char_dev, int state);
> +
>  typedef bool StreamMsgHandler(StreamDevice *dev, SpiceCharDeviceInstance *sin)
>      SPICE_GNUC_WARN_UNUSED_RESULT;
>  
> @@ -77,6 +80,16 @@ static StreamMsgHandler handle_msg_format, handle_msg_data, handle_msg_cursor_se
>  static bool handle_msg_invalid(StreamDevice *dev, SpiceCharDeviceInstance *sin,
>                                 const char *error_msg) SPICE_GNUC_WARN_UNUSED_RESULT;
>  
> +static void
> +close_timer_func(void *opaque)
> +{
> +    StreamDevice *dev = opaque;
> +
> +    if (dev->opened && dev->has_error) {
> +        char_device_set_state(RED_CHAR_DEVICE(dev), 0);
> +    }
> +}
> +
>  static bool
>  stream_device_partial_read(StreamDevice *dev, SpiceCharDeviceInstance *sin)
>  {
> @@ -94,7 +107,17 @@ stream_device_partial_read(StreamDevice *dev, SpiceCharDeviceInstance *sin)
>          while (sif->read(sin, buf, sizeof(buf)) > 0) {
>              continue;
>          }
> -        sif->state(sin, 0);
> +
> +        // This code is a workaround for a Qemu bug, see patch
> +        // "stream-device: Workaround Qemu bug closing device"

Add a dot at the end of this line

> +        // as calling sif->state here can cause a crash schedule

"As calling sif->state here can cause a crash, schedule a timer and do
the call in it"

> +        // a timer and do the call in it. Remove this code when
> +        // will be sure all Qemu versions has been patched.

"when we are sure all Qemu versions have been patched"
(this will never happen, though when the 2 latest releases of main
distros don't have the bug in their QEMU, then I'd say we can remove
this).

> +        RedsState *reds = red_char_device_get_server(RED_CHAR_DEVICE(dev));
> +        if (!dev->close_timer) {
> +            dev->close_timer = reds_core_timer_add(reds, close_timer_func, dev);
> +        }
> +        reds_core_timer_start(reds, dev->close_timer, 0);

NB: I have some bad memories of g_timeout_add() not doing the right
thing at all when called with a timeout of 0ms. We'd need g_idle_add()
instead here, but that's not something which our interface allows
(this is just a comment in passing, not something we can act on right
now).
This makes me wonder if this could be racy though, if we get some writes
to the device before this delayed callback runs?

Christophe
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/spice-devel/attachments/20180223/1b49db62/attachment.sig>


More information about the Spice-devel mailing list