Forcing initial Frame Transmission for dmabuf encoding on SPICE display channel connection
Michael Scherle
michael.scherle at rz.uni-freiburg.de
Wed May 14 12:11:03 UTC 2025
Hello,
OK, I have implemented this.
Due to the existing code structure, it was difficult to issue a draw that
does not also trigger the callback. I would have liked to attach
metadata to the draw so that it could be identified at the end, but that
would have required major code changes. I have ended up with the
following:
https://gitlab.freedesktop.org/spice/spice/-/merge_requests/238
Description:
Send initial draw on client connect for DMA-BUF encoder
Ensure an initial draw is sent immediately upon client connection
if the DMA-BUF video encoder is in use. Without this, a frame would
only be pushed when the screen content changes and the GPU renders
a new frame—which could take a while, causing delays for the client.
GStreamer now receives a duplicated file descriptor (fd) and takes
responsibility for closing it. This guarantees the buffer remains
available long enough. The previous fd mechanism is still retained
to ensure a valid scanout buffer is always available for a new
client.
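
As a rough illustration of the fd handling described above (this is not
the code from the merge request), here is a minimal C sketch. All names
in it are hypothetical, and a temporary file stands in for the dma-buf,
since a real dma-buf needs a GPU driver. It only shows that a descriptor
obtained with dup() stays usable after the original is closed, which is
what lets the encoder finish even if a new scanout replaces the stored fd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the encoder: it owns `fd`, uses it, and
 * closes it when done. */
ssize_t consumer_use_and_close(int fd, char *buf, size_t len)
{
    assert(fd >= 0);
    ssize_t n = pread(fd, buf, len, 0); /* read from offset 0 */
    close(fd);                          /* the consumer owns this fd */
    return n;
}

/* Returns 0 if the duplicate is still readable after the original fd
 * has been closed, -1 otherwise. */
int demo_dup_handoff(void)
{
    char path[] = "/tmp/dupdemoXXXXXX";
    int stored_fd = mkstemp(path);      /* stand-in for the stored fd */
    if (stored_fd < 0)
        return -1;
    unlink(path);                       /* file lives as long as an fd is open */

    if (write(stored_fd, "frame", 5) != 5) {
        close(stored_fd);
        return -1;
    }

    int enc_fd = dup(stored_fd);        /* the copy handed to the encoder */
    if (enc_fd < 0) {
        close(stored_fd);
        return -1;
    }

    /* Simulates a new scanout arriving and the old fd being closed
     * while the encoder has not finished yet. */
    close(stored_fd);

    char buf[8] = {0};
    ssize_t n = consumer_use_and_close(enc_fd, buf, sizeof(buf) - 1);
    return (n == 5 && memcmp(buf, "frame", 5) == 0) ? 0 : -1;
}
```

Calling demo_dup_handoff() should return 0 on a POSIX system: closing the
original descriptor does not invalidate the duplicate, because both refer
to the same open file description until the last one is closed.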
Greetings,
Michael
On 05.05.25 14:33, Michael Scherle wrote:
> Hello,
>
> Sorry for the late reply — I was on vacation. Thank you very much for
> the detailed explanation; it gave me a clear direction on how to
> approach the problem.
>
> The race condition I was facing involved qemu calling
> spice_qxl_gl_scanout during the initial frame transmission. This caused
> the dma-buf to be closed while GStreamer was still using it. I resolved
> this by giving GStreamer a duplicate of the dma-buf file descriptor and
> letting it close it once it's done. I'm still keeping the original dma-
> buf fd in qxl_state so that one is always available for the initial
> frame transmission. The question now is whether this is an acceptable
> solution.
>
> With that, I have a working prototype. However, there are still a few
> things I need to improve before I can open an MR:
>
> 1. Callback (async_complete) handling: I obviously don’t want to call
> this for the initial frame transmission. Implementing a special case for
> this in the current code structure is a bit tricky: either by passing a
> variable through, or perhaps by storing it in the qxl_state. For the
> latter, however, I first need to better understand the thread, worker
> and pipe system and see if that is possible.
>
> 2. I'm not sure whether it's necessary to ensure that, when multiple
> clients are connected, only the newly connected one receives the new
> frame. It's also an interesting design choice to encode the frame
> separately for each connection.
>
> Best regards,
> Michael
>
>
> On 16.04.25 00:00, Frediano Ziglio wrote:
>> On Thu, Apr 10, 2025 at 3:18 PM Michael Scherle <
>> michael.scherle at rz.uni-freiburg.de> wrote:
>>
>>> Hello,
>>>
>>> I’ve encountered an issue with the new DMA-BUF -> video encoding feature
>>> in SPICE. When connecting, the first frame is only sent once the GPU
>>> renders a new frame. However, this can take quite some time if the VM is
>>> idle (e.g., sitting on the desktop), since the GPU only renders a new
>>> frame when something on the screen changes. To address this, I wanted to
>>> force a frame to be sent when the display channel is connected.
>>>
>>>
>> Which makes sense.
>>
>>
>>> My initial, naive attempt was to grab the latest DMA-BUF on the display
>>> channel's connection in the SPICE server, encode it, and send it.
>>> However, this led to race conditions and crashes—particularly when QEMU
>>> happened to perform a scanout at the same time, closing the DMA-BUF in
>>> the process.
>>>
>> By "closing" do you mean calling close() function? No, we should have
>> ownership.
>> What exact race did you encounter?
>>
>>
>>> As a second approach, I modified the QXLInterface to pass the display
>>> channel on_connect event back to QEMU. I couldn’t find any existing
>>> mechanism in QEMU to detect the connection of a display channel. Within
>>> QEMU, I then used qemu_spice_gl_monitor_config, and spice_gl_refresh to
>>> trigger a spice_gl_draw. This solution works, but the downside is that
>>> it requires changes to SPICE, QEMU, and especially the
>>> QXLInterface—which is obviously not ideal.
>>>
>> Not ideal is a compliment. I would say complicated, hard to maintain,
>> adding too much coupling.
>>
>>> So now I’m wondering: does anyone have a better idea for how to tackle
>>> this problem?
>>>
>> I would define "the problem" first; currently you mentioned a race
>> condition without describing the details of the race.
>>
>>
>>> Best regards,
>>> Michael
>>>
>>
>> I could suspect the race is more in the current implementation of the
>> interface. Indeed that interface does not fit entirely in the Spice
>> server model.
>>
>> Externally there are 2 functions, spice_qxl_gl_scanout and
>> spice_qxl_gl_draw_async; the callback async_complete is used to tell Qemu
>> when we finish with the scanout. So, spice_qxl_gl_scanout should set the
>> scanout (or frame if you prefer), while spice_qxl_gl_draw_async tells
>> Spice to use the scanout, till async_complete is called (which should be
>> done in a timely fashion, I think Qemu timeout is 1 second). In theory the
>> scanout can be reused for multiple draws (which was never the case, but
>> that's another story). In theory a partial draw of the scanout can be
>> requested. In theory the scanout should not be used after async_complete
>> is called as Qemu could reuse the scanout for next drawings. That last
>> point is a bit of a problem here and to be honest something I think is an
>> issue of the external interface definition. In hardware you set the
>> framebuffer and the video card will continue to use it, no matter what;
>> the computer can freeze or panic and the video card will continue to use
>> the same frame over and over. Also, considering that the maximum that can
>> happen is to get a partial draw that will be fixed, I think it's correct
>> to use the last scanout to solve your initial problem.
>>
>> Internally Spice server stores the scanout in the RedQxl thread (Qemu I/O
>> one) but uses it in the RedWorker thread. This is pretty uncommon;
>> usually data is passed from one thread to the other, ownership included.
>> This probably leads to the race you are facing. If that's the issue I
>> really think the best option is to fix that race.
>>
>> Regards,
>> Frediano
>>
>