[Spice-devel] [spice] Enable mm_time adjustments on startup

Frediano Ziglio fziglio at redhat.com
Fri May 3 10:12:49 UTC 2019



----- Original Message -----
> From: "Frediano Ziglio" <fziglio at redhat.com>
> To: "Francois Gouget" <fgouget at codeweavers.com>
> Cc: "Spice devel" <spice-devel at lists.freedesktop.org>
> Sent: Friday, 3 May, 2019 10:43:20 AM
> Subject: Re: [Spice-devel] [spice] Enable mm_time adjustments on startup
> 
> > On Wed, 17 Apr 2019, Snir Sheriber wrote:
> > [...]
> > > > The reason for that is that while a minimum 400 ms latency is fine when
> > > > playing a YouTube video [1], it's very annoying when the whole desktop
> > > > is being streamed, either through the streaming agent or through the
> > > > future Virgl remote access, because then it translates into a 400 ms
> > > 
> > > Are you working on something like that (remote virgl)?
> > 
> > Yes. See (though I need to refresh these GitHub branches):
> > 
> > https://lists.freedesktop.org/archives/spice-devel/2019-January/047329.html
> > 
> > 
> > > Notice that currently there's a small hacky patch on the client to
> > > ignore latency when it's full-screen streaming and there is no audio
> > > playback.
> > > 
> > > (d047b2fb7f5d492d6c49f589ba5ff862c6b115da)
> > 
> > Right. It never actually had any effect in my tests.
> > * When testing the fullscreen streaming in QEmu (i.e. when
> >   streaming_mode is true) I had to use the mjpeg encoding because there
> >   seems to be some conflict between QEmu and GStreamer on Debian. But
> >   the patch has no effect on the builtin mjpeg decoder because the only
> >   use of latency in mjpeg_decoder_queue_frame() is to drop frames if it
> >   is negative. To determine when to display the frame it uses the
> >   frame's timestamp.
> > * The rest of the time I test by running an OpenGL application in a
> >   regular session, so streaming_mode is false, the GstVideoOverlay is
> >   not used, and thus the patch has no effect.
> > 
> >   That's because spice_gst_decoder_queue_frame() uses latency to drop
> >   frames if it is negative, and to determine the deadline for decoding
> >   the frame. When not using the video overlay the decoded frames then
> >   get queued until it is time to display them, according to their
> >   mm_time timestamps, which are not impacted by latency.
> > 
> > That said I'm not sure ignoring the frames' mm_time timestamps is a
> > good idea, as it means the network jitter will translate directly into
> > framerate jitter (e.g. in OpenGL applications or in silent movies).
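
For reference, the drop logic described above boils down to something
like this untested sketch (not the actual spice-gtk code; the type and
helper names are made up):

    #include <stdint.h>
    #include <stdlib.h>

    /* Illustrative stand-in for the client's frame structure. */
    typedef struct {
        uint32_t mm_time;   /* when the frame should be shown (server clock) */
        void *data;
    } VideoFrame;

    static void drop_frame(VideoFrame *frame) { free(frame); }

    static void schedule_display(VideoFrame *frame, int32_t latency)
    {
        /* real code would queue the frame and arm a timer */
        (void)frame; (void)latency;
    }

    /* Rough pattern of the drop logic described above. */
    static void queue_frame_sketch(VideoFrame *frame, uint32_t now_mm_time)
    {
        /* "latency" is the margin left before the frame is due on screen:
         * smaller is worse, negative means it is already too late. */
        int32_t latency = (int32_t)(frame->mm_time - now_mm_time);

        if (latency < 0) {
            drop_frame(frame);        /* too late: drop it */
            return;
        }
        /* otherwise keep it queued and show it when the client mm_time
         * clock reaches frame->mm_time */
        schedule_display(frame, latency);
    }
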
> > 
> > 
> > > > Other steps are:
> > > > * Reducing the default latency.
> > > 
> > > What will be the default? What will happen to late video frames?
> > 
> > Late frames will be dropped, as has always been the case (at least in
> > the mjpeg case; otherwise it's a bit more complex).
> > 
> 
> I think the terminology should really be changed.
> "Late frame": what does it mean? If you ignore what we know about the
> protocol and the implementation it does not make sense: this is live
> streaming, so all frames are late. There can be a "too late", but not a
> "late". We have an expectation, and we define a frame as late or not
> based on that expectation.
> 
> > As before the latency must be adjusted, and in particular increased,
> > as required to avoid dropping frames. Of course when starting with a
> 
> Honestly, we receive a frame, and this is the best and most up-to-date
> screen we have. What do you want to do? Drop it, that's obvious. Are we
> really sure about that? Is that the reason why we "must" increase this
> "latency"?
> 
> "latency": this term is really confusing. Taking into account that we
> also need to deal with network latency, and this is not that, I always
> find it really confusing. Can't we start using "delay" or something else?
> 
> What's also confusing is the computation. Usually when you have
> latency issues, the more latency you have the worse it is.
> In the client you compute a value and the _less_ it is the worse it is.
> Usually a negative latency is an indication that either you have built
> a time machine or something went wrong.
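
To make the sign issue concrete with made-up numbers: the value is
roughly "frame mm_time minus the client's current mm_time", so if a
frame carries mm_time 10000 and the clock already reads 10050 when we
handle it, the value is -50 ms: the frame is 50 ms past its slot. More
network delay makes this number smaller, not bigger, which is the
opposite of what "latency" usually suggests.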
> 
> > huge latency like 400 ms, handling the latency correctly matters much
> > less, as it only becomes important when you have a really bad network
> > connection or very long encoding times.
> > 
> > It's important to distinguish increasing and decreasing the latency.
> > 
> > Increasing the latency means a frame queued on the client will be
> > displayed even later. So the latency can be increased freely.
> > 
> 
> If I'm drawing something on the screen with the mouse and the
> latency is 10 minutes, I don't really think that increasing it
> "freely" is really good.
> Also, this can potentially lead to a queue of 10 minutes in the
> client.
> 
> > Decreasing the latency means a bunch of frames queued on the client may
> > suddenly become late, causing them to be dropped. So more care must be
> > taken.
> > 
> 
> Is the problem of "care to take" something unavoidable, or is it
> something we created with our own hands? I mean, I know that the
> current implementation will drop frames, which we don't want, but
> isn't that a workaround for a bad implementation?
> 

I think the drop in mjpeg was done to reduce the time spent decoding:
if the decoding was taking too long, dropping frames helped. This was
surely easy to implement in a single-threaded application and is not
expected to cause issues on a video (more frames are expected), but
I would instead drop only if there's a pending frame.
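
Something along these lines (untested sketch; the structures are made
up, not the real decoder API):

    /* Illustrative types only. */
    typedef struct VideoFrame VideoFrame;   /* opaque frame handle */
    typedef struct {
        VideoFrame *pending;   /* most recent frame not yet handled */
    } Decoder;

    static void free_frame(VideoFrame *frame)
    {
        (void)frame;   /* real code would release the frame here */
    }

    /* Drop a frame only when a newer one is already waiting, instead of
     * dropping on a negative latency value. */
    static void queue_frame_keep_latest(Decoder *decoder, VideoFrame *frame)
    {
        if (decoder->pending != NULL) {
            /* we are falling behind: keep only the most recent frame */
            free_frame(decoder->pending);
        }
        decoder->pending = frame;   /* decode/display this one when we can */
    }
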

> I personally would propose to avoid all those complicated computations
> between server and client.
> If every frame and audio piece is timed with a consistent monotonic time
> (and currently they ARE) they can be consumed after ENCODING +
> TRANSMISSION + DECODING time. Now, all the parts have skews, so to avoid
> jitter you usually try to enforce a minimum (that is the 400 ms).
> It would be pretty simple for the client to compute the ENCODING +
> TRANSMISSION + DECODING value for each piece: it is just
> server_mm_time - piece_mm_time!
> Now, currently the server_mm_time we have is skewed and fluctuating, but
> it won't be hard to get a more consistent one (and easier to update!).
> The easy formula to compute the "minimum" is obviously the maximum of
> the previous TIMINGs (where TIMING == ENCODING + TRANSMISSION + DECODING);
> however, this does not take into account that you can have peaks that
> will slow down everything forever. You usually want to ignore peaks and
> look at a more local average. Also, take into account that we are human,
> so you don't need millisecond precision (well, this is more true for
> video than for audio... but simply consider that if you want a precision
> of 1 ms in video you need 1000 FPS!).
> 
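
To make this concrete, here is an untested sketch of what the client
could do with those TIMING observations; the names, the window size and
the percentile choice are all arbitrary:

    #include <stdint.h>
    #include <string.h>

    #define WINDOW 32   /* arbitrary: a couple of seconds of frames */

    typedef struct {
        uint32_t samples[WINDOW];   /* recent TIMING observations, in ms */
        unsigned count, next;
    } DelayEstimator;

    /* Record one observation: timing = server_mm_time - piece_mm_time,
     * i.e. ENCODING + TRANSMISSION + DECODING for that piece. */
    static void delay_add(DelayEstimator *e, uint32_t timing)
    {
        e->samples[e->next] = timing;
        e->next = (e->next + 1) % WINDOW;
        if (e->count < WINDOW)
            e->count++;
    }

    /* The "minimum" delay to apply: a high-but-not-maximum sample from
     * the recent window, so a single peak does not slow everything down
     * forever and we do not pretend millisecond precision. */
    static uint32_t delay_estimate(const DelayEstimator *e)
    {
        uint32_t sorted[WINDOW];
        unsigned i, j;

        if (e->count == 0)
            return 0;
        memcpy(sorted, e->samples, e->count * sizeof(sorted[0]));
        /* tiny insertion sort, the window is small */
        for (i = 1; i < e->count; i++)
            for (j = i; j > 0 && sorted[j - 1] > sorted[j]; j--) {
                uint32_t t = sorted[j];
                sorted[j] = sorted[j - 1];
                sorted[j - 1] = t;
            }
        return sorted[(e->count * 9) / 10];   /* roughly the 90th percentile */
    }

The estimator starts zeroed (DelayEstimator e = {0};) and the result of
delay_estimate() would play the role of the fixed 400 ms minimum.
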
> That said, back to the patch. Instead of just setting mm_time_enabled
> directly I would call reds_enable_mm_time: it's not in a hot path so
> performance is not important, and the main channel has just been
> created so reds_send_mm_time will do nothing, but you'll have a more
> consistent setting.
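
Concretely, I mean something like this in the patch (not tested, and
the exact variable name is an assumption):

    -    reds->mm_time_enabled = TRUE;
    +    reds_enable_mm_time(reds);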

Oh, it would also be good to have this comment:

"Starting Xspice (so no Windows jingle or other being played on startup),
playing a WebGL demo, and then trying to figure out why the server's
latency adjustments had absolutely no effect on the client."

in the commit message, like this:

"We send mm_time adjustments to the client whenever there is no audio
playback. There is no audio playback on startup. Therefore
mm_time_enabled must be true on startup.

To reproduce, start Xspice (so no Windows jingle or other sound is
played on startup) and play a WebGL demo. You'll notice that the
server's latency adjustments have absolutely no effect on the client."

Frediano

