[Spice-devel] [PATCH 8/9] worker: make sure we dispatch after releasing items
Jonathon Jongsma
jjongsma at redhat.com
Mon Dec 21 11:48:10 PST 2015
On Mon, 2015-12-21 at 09:54 -0600, Jonathon Jongsma wrote:
> On Mon, 2015-12-21 at 10:17 -0500, Frediano Ziglio wrote:
> > >
> > > On Fri, 2015-12-18 at 06:10 -0500, Frediano Ziglio wrote:
> > > > > Hi
> > > > >
> > > > > ----- Original Message -----
> > > > > > On Wed, 2015-12-16 at 18:16 -0500, Marc-André Lureau wrote:
> > > > > > > Hi
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > I actually think that was it. The worker thread may currently
> > > > > > > block because the client is too slow. But then, when we release
> > > > > > > an item, the pipe is hopefully going to be smaller, and thus we
> > > > > > > can try to dispatch again without waiting for the timeout.
> > > > > > > (Waiting for timers can really degrade performance in some
> > > > > > > cases, like you see today with the qemu virgl timer, but that's
> > > > > > > another story, where I can show you figures when we merge gl
> > > > > > > scanouts in spice ;)
> > > > > >
> > > > > > Then I'm still confused. In the old loop, we had a very explicit
> > > > > > sequence of behavior (a code sketch follows the list). For each
> > > > > > iteration:
> > > > > > 1 calculate timeout for poll() function
> > > > > > 2 poll() fds with timeout from step 1
> > > > > > 3 service expired timers
> > > > > > 4 handle fd watch events
> > > > > > 5 process cursor commands from guest
> > > > > > 6 process display commands from guest
> > > > > > 7 push queued messages to clients
> > > > > > 8 GOTO 1
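> > > > > >
> > > > > > (In rough C, that old loop shape is something like the sketch
> > > > > > below. It is not the actual red-worker.c code: apart from
> > > > > > red_process_cursor()/red_process_display(), whose signature you
> > > > > > can see in the instrumentation diff further down, the helper
> > > > > > names are invented for illustration.)
> > > > > >
> > > > > >     for (;;) {
> > > > > >         /* 1: timeout until the next expiring timer */
> > > > > >         int timeout = calculate_timers_timeout(worker);
> > > > > >         /* 2: wait for fd events or timer expiry */
> > > > > >         poll(worker->poll_fds, MAX_EVENT_SOURCES, timeout);
> > > > > >         /* 3 and 4: run expired timers, then fd watches */
> > > > > >         service_expired_timers(worker);
> > > > > >         handle_watch_events(worker);
> > > > > >         /* 5 and 6: drain the guest command rings */
> > > > > >         red_process_cursor(worker, MAX_PIPE_SIZE, &cursor_ring_is_empty);
> > > > > >         red_process_display(worker, MAX_PIPE_SIZE, &display_ring_is_empty);
> > > > > >         /* 7: flush queued pipe items out to the clients */
> > > > > >         push_queued_messages(worker);
> > > > > >     } /* 8: GOTO 1 */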
> > > > > >
> > > > > > After pushing the queued messages to the client, we did not go
> > > > > > back and handle cursor and display commands again. We went back
> > > > > > to the top of the loop and did a poll with timeout again. So I
> > > > > > fail to see why the glib loop needs to short-circuit the timeout
> > > > > > and return to step 5 when the old design didn't.
> > > > > >
> > > > > > Or is this change unrelated to the new glib loop, and the worker
> > > > > > has simply always been too slow, even with the old loop
> > > > > > implementation?
> > > > > >
> > > > >
> > > > > I wouldn't say "too slow", and I wouldn't try to mimic the old code
> > > > > either. For me the goal is simply to process the pipe/commands as
> > > > > fast as possible and, if possible, get rid of timeouts.
> > > >
> > > > I can understand that not mimicking the old code can be better, but
> > > > at the moment this is causing some regressions, and this patch was
> > > > trying to avoid some of them (I think).
> > > >
> > > > I looked more deeply at the glib code. The sources are stored
> > > > (appended) in multiple lists, one per priority, and at the moment we
> > > > are using a single priority, which is similar to the former code
> > > > anyway. Considering that the command-processing source is registered
> > > > at worker initialization and that watches are recreated whenever
> > > > their event masks are updated, the watches can end up after the
> > > > processing source, leading the loop to be:
> > > >
> > > > 1 calculate timeout for poll() function
> > > > 2 poll() fds with timeout from step 1
> > > > 3 process cursor commands from guest
> > > > 4 process display commands from guest
> > > > 5 push queued messages to clients
> > > > 6 service expired timers
> > > > 7 handle fd watch events
> > > > 8 GOTO 1
> > > > (I'm not sure about the timers, but they are not that frequent.)
> > > > This obviously makes things worse, as after step 7 you have to wait
> > > > for the poll (2) before commands get processed again.
> > > > I'm not saying that the patch doesn't also address other issues, just
> > > > that the glib code actually makes this patch more useful (that is,
> > > > the glib loop has regressions). A standalone demo of the source
> > > > ordering follows.
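> > > >
> > > > (Here is a minimal standalone GLib demo of that ordering, unrelated
> > > > to the spice code: with equal priorities, sources are dispatched in
> > > > the order they were attached to the context.)
> > > >
> > > >     /* gcc demo.c $(pkg-config --cflags --libs glib-2.0) */
> > > >     #include <glib.h>
> > > >     #include <stdio.h>
> > > >
> > > >     static GMainLoop *loop;
> > > >
> > > >     static gboolean process_commands(gpointer data)
> > > >     {
> > > >         printf("process commands\n");   /* dispatched first */
> > > >         return G_SOURCE_REMOVE;
> > > >     }
> > > >
> > > >     static gboolean handle_watch(gpointer data)
> > > >     {
> > > >         printf("handle watch\n");       /* dispatched second */
> > > >         g_main_loop_quit(loop);
> > > >         return G_SOURCE_REMOVE;
> > > >     }
> > > >
> > > >     static void attach(GSourceFunc func)
> > > >     {
> > > >         GSource *src = g_idle_source_new();
> > > >         /* same priority for both sources, like the worker uses */
> > > >         g_source_set_priority(src, G_PRIORITY_DEFAULT);
> > > >         g_source_set_callback(src, func, NULL, NULL);
> > > >         g_source_attach(src, NULL);
> > > >         g_source_unref(src);
> > > >     }
> > > >
> > > >     int main(void)
> > > >     {
> > > >         loop = g_main_loop_new(NULL, FALSE);
> > > >         attach(process_commands);  /* attached first ... */
> > > >         attach(handle_watch);      /* ... so it runs first, too */
> > > >         g_main_loop_run(loop);
> > > >         g_main_loop_unref(loop);
> > > >         return 0;
> > > >     }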
> > > >
> > > > Looking at it more broadly: what is this patch (and possibly
> > > > "worker: avoid server hanging when no client are connected.") trying
> > > > to fix? Basically, handling slow clients. With fast clients you don't
> > > > have many issues: the pipe queue is (or should be) quite small and
> > > > all commands get processed. But with slow clients we only process
> > > > until the queue reaches a given size. I honestly don't know all the
> > > > details of what happens next. When you do a push on a RedChannel you
> > > > try to send items until the network queue is full. If the client is
> > > > simply not reading, you can only add items to the queue. After a
> > > > while the limit is reached, and to avoid queuing indefinitely you
> > > > stop processing commands (see the sketch after these questions). Now
> > > > I have some questions:
> > > > - What happens in the guest? Does the screen simply stop getting
> > > >   updates after a while? And what if the client asks for an update?
> > > > - Does item collapsing occur only if no client is connected? I think
> > > >   what should happen is that if there are too many items, they should
> > > >   be collapsed (reduced to drawing on surfaces) and processing should
> > > >   continue.
> > > > - What happens if a client keeps sending requests that trigger
> > > >   appending items but never reads? Can a DoS be triggered?
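> > > >
> > > > (For context, the "stop processing commands" throttling above is the
> > > > max_pipe_size check in red_process_display(). Conceptually it behaves
> > > > like this sketch; the helper names are made up, it is not the real
> > > > code:)
> > > >
> > > >     /* Sketch: drain the QXL ring until it is empty or some client
> > > >      * pipe gets too deep. */
> > > >     while (get_next_command(worker, &ext_cmd)) {
> > > >         n += process_one_command(worker, &ext_cmd);
> > > >         if (biggest_client_pipe_size(worker) > max_pipe_size) {
> > > >             /* Slow client: stop creating pipe items. Without a
> > > >              * wakeup when items are released, we only get back here
> > > >              * after the next poll() timeout. */
> > > >             return n;
> > > >         }
> > > >     }
> > > >     *ring_is_empty = TRUE;
> > > >     return n;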
> > > >
> > > > Frediano
> > >
> > >
> > >
> > > So I did some very simplistic testing here. Basically I looked at the
> > > red_process_display() function to see how often we encountered a full
> > > pipe queue (since that is the situation that Marc-Andre thinks this
> > > patch is trying to improve). I was curious how often this happened, so
> > > I just instrumented the code to count how many times we called
> > > red_process_display() and how many of those times we returned from
> > > this function due to a full queue (rather than because there were no
> > > more commands to process). Here's the patch:
> > >
> > > diff --git a/server/red-worker.c b/server/red-worker.c
> > > index dfaf4ba..dec635d 100644
> > > --- a/server/red-worker.c
> > > +++ b/server/red-worker.c
> > > @@ -229,7 +235,10 @@ static int red_process_display(RedWorker *worker, uint32_t max_pipe_size, int *r
> > >      QXLCommandExt ext_cmd;
> > >      int n = 0;
> > >      uint64_t start = spice_get_monotonic_time_ns();
> > > +    static int total = 0;
> > > +    static int full = 0;
> > >
> > > +    total++;
> > >      if (!worker->running) {
> > >          *ring_is_empty = TRUE;
> > >          return n;
> > > @@ -335,6 +344,9 @@ static int red_process_display(RedWorker *worker, uint32_t max_pipe_size, int *r
> > >              return n;
> > >          }
> > >      }
> > > +    full++;
> > > +    fprintf(stderr, "display channel exceeded max pipe size: %i/%i (%g)\n",
> > > +            full, total, (double)full / total * 100.0);
> > >      return n;
> > >  }
> > >
> > > This was my procedure:
> > >
> > > - Boot RHEL7 guest
> > > - wait for boot sequence to finish
> > > - log into GNOME
> > > - start web browser, open a page
> > > - open another page in another tab
> > > - open video player and play video in full screen
> > > - enable second display
> > > - resize second display
> > > - disable second display
> > > - exit video player
> > > - shut down guest
> > >
> > > I did this same test with the current master branch, with the glib
> > > loop patch applied, and also with this patch on top of the glib loop
> > > patch.
> > >
> > > What I noticed is that during the boot sequence and the shutdown
> > > sequence (when we're in a VT and text is scrolling up the screen), we
> > > hit the queue limit quite frequently. In fact, nearly 30% of the time
> > > that we call red_process_display(), we return because our pipe queue
> > > is full. However, after X has started and we're in the desktop, we
> > > never hit a full-queue condition. When I shut down the guest, it
> > > briefly returns to the VT until it powers down, and we again hit some
> > > full-queue conditions.
> > >
> > > This was true of both the old loop and the new glib loop (with and
> > > without the patch).
> > >
> > > There were some differences between the old and new loop. In
> > > particular, the new glib loop called red_process_display() more times
> > > in roughly the same elapsed time. It also had a higher percentage of
> > > calls that returned because the queue was full (but again, this did
> > > not happen during "normal" operation, only during the boot and
> > > shutdown phases). The glib loop with this patch on top had roughly the
> > > same percentage as the glib loop on its own, but the total number of
> > > calls to red_process_display() was higher, unsurprisingly.
> > >
> > >
> > > Here is some example output:
> > >
> > > Old loop: http://ur1.ca/ocnyn
> > > Glib loop: http://ur1.ca/ocnyp
> > > Glib loop with this patch applied: http://ur1.ca/oco1u
> > >
> > > Notice that lines such as this one indicate that GDM has started or
> > > that I've logged into GNOME:
> > >
> > > main_channel_handle_parsed: agent start
> > >
> > > Then I use the desktop for about a minute before shutting down again.
> > > No more log messages are printed until shutdown begins.
> > >
> > > I also did another test where I used the guest "normally" (browsing
> > > the internet, watching videos, writing text in libreoffice, etc.) for
> > > quite a while, and still never triggered the full-queue condition.
> > >
> > > I'm not sure exactly what conclusions to draw from this. There is
> > > probably some better way to test this. Perhaps we should test how long
> > > a pipe item stays in the queue before it is sent.
> > >
> > > Jonathon
> > >
> >
> > That's great!
> >
> > I'm quite scared about the loop counts using Glib. It looks like the loop
> > is executed about twice as often (with a possible increase in CPU usage).
>
> Yeah, with the plain glib loop it's about 1.5x, but with the additional
> wakeup-after-releasing-a-pipe-item patch, it's closer to double.
>
>
> >
> > I agree it would be helpful to see the latency of the items.
> >
> > Did you try increasing client latency or decreasing bandwidth?
> > I think the replay utility could be useful for these tests (I didn't try).
> >
> > Frediano
>
>
>
> Nope, I didn't try any latency or bandwidth adjustments yet. This was just a
> first step to get a rough comparison and then figure out exactly what we
> should be testing. I haven't actually used the replay utility much yet, so I
> may have some questions about it when I start using it. But that's a good
> suggestion.
>
> Jonathon
OK, here are some initial test results showing the amount of time a pipe item
waits in the queue before being sent out. Basically, I save the time when a
pipe item is added to the queue, and then check the time when that item is
sent to the client (the idea is sketched below). The procedure I used to test
is basically the same as the one described above for my previous test.
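
(The measurement itself is conceptually just the sketch below. The struct and
helper names are made up for illustration; the actual instrumentation patch is
linked at the end of this mail. spice_get_monotonic_time_ns() is the same
helper that appears in the counting patch earlier in the thread.)

    /* Sketch: stamp each pipe item when it is queued, report when sent.
     * Assumes it sits inside the red-channel code next to PipeItem. */
    typedef struct TimedItem {
        PipeItem base;        /* the real pipe item */
        uint64_t queued_ns;   /* illustrative extra field */
    } TimedItem;

    static void item_queued(TimedItem *item)
    {
        item->queued_ns = spice_get_monotonic_time_ns();
    }

    static void item_sent(TimedItem *item)
    {
        double wait_ms =
            (spice_get_monotonic_time_ns() - item->queued_ns) / 1e6;
        record_queue_time(wait_ms);  /* feeds the avg/max/min and buckets */
    }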
Old loop:
=========
- First run:
avg=1.64109ms, max=56.881ms, min=5.4e-05ms (n=85801)
- Second run (with queue time distribution):
avg=1.66152ms, max=60.9958ms, min=4.1e-05ms (n=84801)
queue time distribution: <0.1ms=3495, <0.2ms=509, <0.5ms=7035, <1ms=36476,
<2ms=26316, <5ms=7399, <10ms=1285, <20ms=1365, <50ms=914, >50ms=7
Glib loop:
==========
- First run:
avg=1.78949ms, max=49.901ms, min=4.8e-05ms (n=85801)
- Second run (with queue time distribution):
avg=1.67802ms, max=65.0128ms, min=3.9e-05ms (n=87601)
queue time distribution: <0.1ms=4313, <0.2ms=500, <0.5ms=6088, <1ms=35567,
<2ms=29102, <5ms=8284, <10ms=1502, <20ms=1224, <50ms=1012, >50ms=9
Glib loop with this patch applied:
==================================
- First run:
avg=1.76456ms, max=65.865ms, min=4.8e-05ms (n=86601)
- Second run (with queue time distribution):
avg=1.80822ms, max=53.4864ms, min=5.5e-05ms (n=86001)
queue time distribution: <0.1ms=3393, <0.2ms=500, <0.5ms=6166, <1ms=31334,
<2ms=30842, <5ms=9671, <10ms=1636, <20ms=1414, <50ms=1037, >50ms=8
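
(The distribution buckets above are just fixed thresholds. A
record_queue_time() along these lines would produce them; again an
illustrative sketch, not the linked instrumentation patch:)

    #include <glib.h>   /* for G_N_ELEMENTS */

    static const double buckets_ms[] = { 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50 };
    static unsigned counts[G_N_ELEMENTS(buckets_ms) + 1];  /* last is >50ms */

    static void record_queue_time(double ms)
    {
        unsigned i;
        /* find the first threshold the sample falls below */
        for (i = 0; i < G_N_ELEMENTS(buckets_ms); i++) {
            if (ms < buckets_ms[i]) {
                break;
            }
        }
        counts[i]++;  /* i == G_N_ELEMENTS(buckets_ms) means >50ms */
    }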
As you can see, the stats vary a bit between runs. It's not obviously better or
worse between the old loop and the new glib loop. The average queue times are
just slightly higher on the glib ones, but I'd have to do more test runs to know
whether that was statistically significant.
One thing I did notice was that during the (VT-based) boot phase, the max queue
time tended to be only about 20ms, but once the graphical environment started
up, I began seeing more of the longer queue times. This could just be due to
the increased CPU load on the host, though.
Also worth noting: if the queue is full, we simply don't create pipe items. So
it's possible that qxl commands are waiting longer to be processed, but that
wouldn't show up in these statistics.
Somewhat unrelated question: Does anybody know how the value of 50 was chosen
for MAX_PIPE_SIZE? If we could increase that value a bit, it would reduce the
number of times we'd hit the full queue scenario. Of course it would also
increase memory use, but maybe we're willing to trade some additional memory
usage in the host to avoid blocking the guest regularly due to unhandled qxl
commands.
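
(If I remember right, the limit is just a compile-time constant in
red-worker.c, "#define MAX_PIPE_SIZE 50". For experimenting with the trade-off
without rebuilding, one throwaway hack would be an environment override; a
sketch, not a proposed patch:)

    #include <stdint.h>
    #include <stdlib.h>

    /* Sketch: read the pipe size limit once from the environment,
     * falling back to the current hard-coded 50. */
    static uint32_t max_pipe_size(void)
    {
        static uint32_t size;
        if (size == 0) {
            const char *env = getenv("SPICE_MAX_PIPE_SIZE");
            size = (env != NULL && atoi(env) > 0) ? (uint32_t)atoi(env) : 50;
        }
        return size;
    }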
(instrumentation patch here if you're interested: http://ur1.ca/ocsfl)