[EXTERN] Re: High delay of video-streams

Michael Scherle michael.scherle at rz.uni-freiburg.de
Tue Apr 2 14:27:40 UTC 2024


Hi Frediano,

thank you very much for your detailed answer.


On 02.04.24 14:13, Frediano Ziglio wrote:

> Really short explanation: lip sync.
> 
> Less cryptic explanation: video streaming was added a long time ago,
> when desktops used 2D drawing operations such as lines, fills, strings
> and so on. At that time networks were less reliable, latency was
> higher, and a continuous bitblt onto the same big area was, with high
> probability, a video playing. So the idea of detecting video playback
> and optimizing it to keep audio and video in sync was a good one.

ok this explains a lot.

> Now start my opinionated ideas. The assumption that a continuous
> bitblt can only be a video stream is wrong: nowadays desktops use
> large bitblts for everything, or rather they use 3D cards a lot and
> compose the various windows on the screen, which appears to us as
> just bitblts, often contiguous. So the delay should simply be
> removed, optimizing for real-time video streaming. As you realized,
> the algorithm also keeps increasing the delay for every glitch it
> finds, which does not improve the user experience. I have various
> changesets that remove all these delays entirely (this can be done by
> changing only the server part); the result is much less delay, and
> with today's networks the audio/video sync (when watching a movie) is
> acceptable.


Would it be possible to get your changesets so that I could try them
out? I would be interested to know how this can be implemented with
server-side changes only. A dirty idea I had (and tried) was to set the
mm_time to the past so that the client displays the image immediately,
but in my opinion that would not be a good fix.
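
For context, the hack looked roughly like the sketch below (the names
are hypothetical stand-ins, not the real spice-server symbols): the
server simply stamps each frame with a multimedia time that already
lies in the past.

#include <stdint.h>
#include <time.h>

#define MM_TIME_BIAS_MS 500u  /* larger than any margin the client adds */

/* hypothetical stand-in for the server's multimedia clock (ms) */
static uint32_t server_get_mm_time(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint32_t)(ts.tv_sec * 1000 + ts.tv_nsec / 1000000);
}

static uint32_t frame_mm_time_for_immediate_display(void)
{
    /* A timestamp in the past makes the frame look "late" to the
     * client, so it renders it immediately. This also defeats A/V
     * sync and jitter buffering, which is why it is not a good fix. */
    return server_get_mm_time() - MM_TIME_BIAS_MS;
}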

I would rather consider it reasonable that the server timestamps the
frames (and perhaps the sound) with the encoding time and that the
client itself calculates when it wants to display them (from the
diffs). The client could then decide whether to display the images
immediately or to add some delay to compensate for network jitter (or
for lip sync), or maybe even implement something like v-sync. These
would of course be breaking changes that require modifications to both
client and server and make them incompatible with older versions. If
this cannot be done directly, for compatibility reasons, maybe it could
be implemented as a separate low-latency mode or something like that
(which both server and client would need to support).
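
As a rough illustration of what I mean (a minimal sketch of the
client-side scheduling, all names hypothetical, not spice code):

#include <stdint.h>

typedef struct {
    uint64_t base_encode_ts;   /* encode time of first frame (server clock, ms) */
    uint64_t base_local_ts;    /* local arrival time of first frame (ms) */
    uint64_t jitter_buffer_ms; /* chosen by the client; 0 = display ASAP */
    int      have_base;
} FrameScheduler;

/* Returns the local time (ms) at which a frame should be displayed. */
static uint64_t schedule_frame(FrameScheduler *s,
                               uint64_t encode_ts, uint64_t local_now)
{
    if (!s->have_base) {
        s->base_encode_ts = encode_ts;
        s->base_local_ts = local_now;
        s->have_base = 1;
    }
    /* Pace frames by the encode-time diffs; only the client decides
     * how much extra delay to add for jitter (or lip sync). */
    uint64_t target = s->base_local_ts
                      + (encode_ts - s->base_encode_ts)
                      + s->jitter_buffer_ms;
    return target > local_now ? target : local_now; /* never in the past */
}

A real implementation would also have to deal with drift between the
server and client clocks, for example by slowly slewing base_local_ts.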

Even with the above ideas applied, I have noticed a high decode delay
in spice-gtk. The GStreamer pipeline always seems to keep at least two
frames in flight (regardless of the frame rate), which increases the
delay further. Have you noticed this too? I'm currently looking into
the reason for it.
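
Two decoder knobs that might be involved (an assumption on my part, not
verified against spice-gtk's pipeline): gst-libav's frame-threaded
decoding queues roughly one frame per worker thread, and the Media SDK
elements keep several frames in flight depending on async-depth. A
minimal sketch:

#include <gst/gst.h>

int main(int argc, char *argv[])
{
    gst_init(&argc, &argv);

    /* gst-libav software decoder: frame-threaded decoding delays
     * output by roughly one frame per worker thread. */
    GstElement *swdec = gst_element_factory_make("avdec_h264", "swdec");
    if (swdec != NULL)
        g_object_set(swdec, "max-threads", 1, NULL);

    /* Intel Media SDK decoder: lowering async-depth limits how many
     * frames the decoder may keep in flight at once. */
    GstElement *hwdec = gst_element_factory_make("msdkh264dec", "hwdec");
    if (hwdec != NULL)
        g_object_set(hwdec, "async-depth", 1, NULL);

    if (swdec != NULL)
        gst_object_unref(swdec);
    if (hwdec != NULL)
        gst_object_unref(hwdec);
    return 0;
}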

While testing things out, we saw that Sunshine/Moonlight performed very
well, with low delay and a high QoE. That is the benchmark for remote
access we are striving for :)

Greetings
Michael

>>
>> On 15.03.24 14:08, Michael Scherle wrote:
>>> Hello spice developers,
>>>
>>> we are trying to develop an open-source virtual desktop infrastructure
>>> to be deployed at multiple German universities, as described by my
>>> colleagues in the attached paper. The solution is based on OpenStack,
>>> QEMU, spice... Our plan is also to have VM instances with virtual GPUs
>>> (SR-IOV). Due to the resulting requirements, it is necessary to
>>> transmit the image data as a video stream.
>>> We have seen Vivek Kasireddy's recent work on spice, which solves
>>> exactly this problem. However, when we tested it, we noticed a very
>>> high input-to-display delay (400 ms+, but only if the image data is
>>> transferred as a video stream). The problem seems to be a more general
>>> spice problem. Or is there something wrong with our setup, or are
>>> there special parameters that we are missing?
>>>
>>> Our setup:
>>>
>>> QEMU: https://gitlab.freedesktop.org/Vivek/qemu/-/commits/spice_gl_on_v2
>>> Spice:
>>> https://gitlab.freedesktop.org/Vivek/spice/-/commits/encode_dmabuf_v6
>>> virt-viewer
>>> Intel HW decoder/encoder (but same with sw)
>>>
>>> I have looked into what is causing the delay and have noticed that
>>> encoding only takes about 3-4ms. In general, the image seems to reach
>>> the client in less than 15ms.
>>> The main problem seems to be that gstreamer gets a very high margin
>>> (https://gitlab.freedesktop.org/spice/spice-gtk/-/blob/master/src/channel-display.c?ref_type=heads#L1773)
>>> and therefore waits a long time before starting to decode. The reason
>>> for the high margin seems to be the bad mm_time_offset
>>> (https://gitlab.freedesktop.org/spice/spice-gtk/-/blob/master/src/spice-session.c?ref_type=heads#L2418),
>>> which is used to offset the server time to the client time (with some
>>> margin). This variable is initially set to 400 ms by the spice server
>>> (https://gitlab.freedesktop.org/spice/spice/-/blob/master/server/reds.cpp?ref_type=heads#L3062)
>>> and gets updated with the measured latency
>>> (https://gitlab.freedesktop.org/spice/spice/-/blob/master/server/reds.cpp?ref_type=heads#L2614),
>>> but only ever increased. I still need to see how this latency is
>>> calculated.
>>>
>>> Am I missing something, or is this design not intended for
>>> transmitting interactive content via a video stream?
>>> Temporarily overriding the margin and tweaking parameters of the
>>> msdkh264dec brought the delay down to about 80-100 ms, which is not
>>> yet optimal but usable. To see what is technically possible in my
>>> setup, I made a comparison using moonlight/sunshine, which resulted
>>> in a delay of 20-40 ms.
>>>
>>> Our goal is a round-trip time similar to the moonlight/sunshine
>>> scenario, in order to get a properly usable desktop experience.
>>>
>>> Greetings
>>> Michael
>>
>> Greetings
>> Michael

