[Spice-devel] Remote 3d support
Frediano Ziglio
fziglio at redhat.com
Wed Jul 13 10:20:09 UTC 2016
Some updates.
Yesterday I found a major cause of part of the lag: the client and
its multimedia synchronization. After playing a video or running a game for a
while, if you press Ctrl-Z to suspend Qemu you can see the client keep playing.
I checked my bandwidth-limiting software and it was working correctly, not sending
any more data after the set latency. But the client continued to play for a couple
of seconds! This could be fine if we are just watching a movie, but as soon as we
get more interactive and want some feedback, 2 seconds of delay makes working impossible.
So I changed the client code to remove any delay added to try to sync, and
I got this: https://www.youtube.com/watch?v=D_DCs2sriu0. Quite good (unfortunately
there is no audio, as it was quite out of sync).
It seems that the latency/bandwidth computation does not handle the currently
queued data well: the detected bandwidth is reduced a lot (so video
quality decreases a lot) while the computed latency is so high that the client
uses this big delay (in some experiments the lag was much more than 2
seconds!).
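
To give a rough idea of why queued data matters here, this is a small sketch of
the arithmetic: data already sitting in a send queue has to drain before a new
frame can arrive, so it adds queued_bytes * 8 / bandwidth seconds of delay. The
function name and the 10 Mbit/s link rate are assumptions for illustration only,
and the queue sizes are taken from the netstat figures quoted below, read here
as bytes (the original units are ambiguous). It is not how spice-server computes
latency.

```python
def queue_delay_seconds(queued_bytes: int, bandwidth_bits_per_s: float) -> float:
    """Time needed to drain data already queued ahead of a new frame."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return queued_bytes * 8 / bandwidth_bits_per_s


# Hypothetical link rate, chosen only to make the numbers concrete.
LINK_BITS_PER_S = 10_000_000

# Queue sizes in the ranges reported for x264 (small) and vaapi (large),
# assuming the figures are bytes.
for queued in (20_000, 80_000, 1_000_000, 3_000_000):
    delay = queue_delay_seconds(queued, LINK_BITS_PER_S)
    print(f"{queued:>9} bytes queued -> {delay:.3f} s of extra delay")
```

Under these assumptions a queue of a few megabytes is already enough to explain
a lag of a second or more, while a queue of tens of kilobytes adds only a few
tens of milliseconds.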
To make the video this good I had to force the bitrate in our gstreamer code.
Also, the compressed frame sizes for this game are quite small.
About VAAPI, gstreamer and our code: it looks like our code is not able to reduce
the bitrate used by the encoder (I'm currently using H264 and the Intel implementation
of vaapi). The result is that in some cases the frame rate is reduced to 3-4 fps.
I tried lots of parameters (like cabac and dct8x8) but had no luck. Sometimes
our code seems to deadlock (I had a chat with Francois some days ago and it could
be due to the way buffers are produced by the encoder). Setting a different
rate-control for vaapih264enc seems to cause our code to fail (other rate-control
settings should behave much better at limiting the bitrate).
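
For anyone who wants to try the rate-control settings outside our code, here is
a minimal sketch that only builds a gst-launch style pipeline description. The
helper name, the videotestsrc source, the resolution and the fakesink are all
assumptions for illustration; only the vaapih264enc element and its rate-control
and bitrate (kbit/s) properties come from the plugin itself. I have not verified
it against specific hardware, and it is not the pipeline used in
server/gstreamer-encoder.c.

```python
# Rate-control modes assumed to be of interest here; the encoder may expose
# more, and what actually works depends on the driver and the card.
RATE_CONTROL_MODES = {"cqp", "cbr", "vbr"}


def vaapi_h264_pipeline(bitrate_kbps: int, rate_control: str = "cbr") -> str:
    """Build a test pipeline description for vaapih264enc (hypothetical helper)."""
    if rate_control not in RATE_CONTROL_MODES:
        raise ValueError(f"unsupported rate-control: {rate_control}")
    if bitrate_kbps <= 0:
        raise ValueError("bitrate must be positive (kbit/s)")
    return (
        # Synthetic live source standing in for the real frames.
        "videotestsrc is-live=true ! "
        "video/x-raw,width=1280,height=720,framerate=30/1 ! "
        # The encoder settings under test.
        f"vaapih264enc rate-control={rate_control} bitrate={bitrate_kbps} ! "
        # Discard the encoded stream; only the encoder behaviour matters here.
        "h264parse ! fakesink"
    )


if __name__ == "__main__":
    print(vaapi_h264_pipeline(2000))
```

The printed string can be passed to gst-launch-1.0 to check how the encoder
behaves with a given mode before touching our code.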
Frediano
>
> Hi,
> some news on the patch and tests.
>
> The patch is still more or less as I sent it last time
> (https://lists.freedesktop.org/archives/spice-devel/2016-July/030662.html).
>
> So, a bit of history.
> Some time ago I started a branch with the idea of feeding frames from Virgl to
> the old drawing path to see what would happen. There were many reasons to do this:
> one is to exercise the streaming path, another is to see whether the
> refactoring work makes this easier to do.
> The intention wasn't a final patch (extracting the texture is
> surely not a good idea if it can be avoided, and it is not clear whether
> this long trip is the right way or whether there are shorter paths, for instance
> injecting directly into the streaming code).
> The branch got stuck for a while (a month or two) as just
> extracting the raw frame was not that easy (and I got lost in other
> stuff). Anyway, when I got back to it later I found a way using DRM
> directly, and inserting the frames was easy. Besides some memory issues
> (fixed) and some frame flipping (worked around) it was working!
> Locally it works very well; surprisingly everything is smooth and fast
> (I run everything on a laptop with an Intel card).
> Obviously, once it is more or less working you try a harder
> and more real-world setup, so... playing games with some network
> restrictions (after some thinking, I believe this is one of the worst
> cases you can imagine, so if this works fine you are not far from
> a release!).
>
> Here of course the problems started.
>
> Simulation
> To simulate a more realistic network I used a program which
> "slows down sockets" while forwarding data (I tried Linux traffic shaping but
> this caused some problems). I knew this was not optimal (for instance
> queues and rtt detection from the program are nearly impossible) so I
> decided to use tun/tap (I had tried to avoid needing root for such
> tests), and the final version (https://cgit.freedesktop.org/~fziglio/latency)
> is working really well (I just did some more tuning on CPU scheduling
> and the program uses just 2-3% of CPU, so it should not affect the tests
> that much).
>
> Latency
> One of the first issues after introducing a real network in the path was
> latency. Especially when playing you can feel a very long lag (on the order of
> seconds, even if the stream is quite fast). In the end I'm using xterm
> and wireshark to measure the delay. The reason is that the xterm cursor does
> not blink and xterm does very few screen operations, so in wireshark you
> can see a single DRAW_COPY operation, and as this change is quite small
> you can also feel the delay without using wireshark. This test is quite
> reliable and the simulator behaves very well (as does a real network).
> I usually use h264 for encoding. Using the normal stream configuration
> the lag is much lower (and so is the video quality), but even if the video
> is fluid the delay is higher than with xterm. I put some debugging on the
> frames to measure the delays introduced by encoding and extraction, and
> usually a frame is processed in 5 ms (since the Qemu call) so I don't
> understand where the lag comes from. It could be some encoder options,
> an encoding buffer that is too large (the network one isn't) or some problem
> with the gstreamer interaction (server/gstreamer-encoder.c file).
> With vaapi the lag gets much worse, even with very
> large bandwidth; however, the behaviour of gstreamer vaapi is quite different
> and the options are also very different. Maybe there are options to
> improve compression/delay, maybe some detail in the plugin introduces
> other delays. For sure the vaapi h264 encoder has a bitrate which cannot be
> changed dynamically, so this could be an issue. The result is that quality is
> much better but frame rate and delay are terrible. Also, with the x264
> encoder (the software one) the network queue (which you can see using netstat)
> is quite low (about 20-80kb) with low bandwidth, while with vaapi
> it is always too high (about 1-3mb), which obviously does not help with
> latency.
>
> Bandwidth
> Obviously a high bandwidth helps. But I can say that the x264 encoder
> does quite a good job when the bandwidth is not enough. On the other hand
> it takes quite some time (about 10-20 minutes) to notice that the
> bandwidth got better. vaapi was mostly not working.
> Sometimes, using a real wifi connection (with a cheap and old router),
> you can see the bandwidth go down for a while (probably some packet
> loss and retransmissions kicking in).
>
> CPU usage
> Running everything on a single machine without any help for encoding/decoding
> makes this problem quite difficult: you end up using all the CPU power and
> more, bringing the kernel scheduler into the equation. Sometimes I use
> another machine as the client so I can see more clearly where the CPU
> is used to support a virtual machine.
>
> Qemu
> There is still a hack to support listening on tcp instead of unix sockets;
> it will be replaced with spice-server changes.
> It turns out that a monitor_config is sent for every frame. Due to the
> implementation of spice-server this does not help the latency.
> I merged my cork branch and did some changes in spice-server, and you can
> get some good improvement.
> I got a patch from Marc-andre to remove a timer which is causing a lot
> of cpu usage in RedWorker, still to try.
> The VM with Virgl is not powering off; I didn't investigate.
>
>
> In the end there are lots of small issues and stuff to investigate; I don't have
> a clear idea of how to progress. My latest thought is to avoid vaapi for
> a while and fix some small issues (like monitor_config, and trying to
> understand the additional lag when streaming is in use). The vaapi state and
> gstreamer, to fully implement offloading of the encoding, have too many
> variables (our gstreamer code, options, pipeline to use, code
> stability, card support).
> gstreamer and texture data extraction (a fallback we should have)
> seem to work better with GL stuff, so possibly having Qemu communicate
> some EGL setup will be required (that is, an ABI change between Qemu and
> spice-server).
> Maybe EGL extraction with lazy data extraction (to avoid an expensive
> data copy if frames are dropped) could be a possible step
> stable enough to have some code merged.
>