why not flow control in wl_connection_flush?

Fri Feb 23 10:12:38 UTC 2024

On Thu, 22 Feb 2024 10:26:00 -0500
jleivent <jleivent at comcast.net> wrote:

> Thanks for this response.  I am considering adding unbounded buffering
> to my Wayland middleware project, and wanted to consider the flow
> control options first.  Walking through the reasoning here is very
> helpful.  I didn't know that there was a built-in expectation that
> clients would do some of their own flow control.  I was also operating
> under the assumption that blocking flushes from the compositor to
> one client would not have an impact on other clients (was assuming an
> appropriate threading model in compositors).

I would think it to be quite difficult for a compositor to dedicate a
whole thread for each client. There would be times like repainting an
output, where you would need to freeze or snapshot all relevant
surfaces' state, blocking the handling of client requests. I'm sure it
could be done, but due to the complexity a highly threaded design would
cause, most compositors, as far as I can tell, have simply opted for an
approximately single-threaded design, not least because Wayland
explicitly aims to support that. Wayland requests are intended to be
very fast to serve, so threading them per client should not be
necessary.

> The client OOM issue, though: A malicious client can do all kinds of
> things to try to get DoS, and moving towards OOM would accomplish that
> as well on systems with sufficient speed disadvantages for thrashing.
> A buggy client that isn't trying to do anything malicious, but is
> trapped in a send loop, that would be a case where causing it to wait
> might be better than allowing it to move towards OOM (and thrash).

Where do you draw the line between being "stuck in a loop", "doing something
stupid but still workable and legit", and "doing what it simply needs
to do"?

One example of an innocent client overflowing its sending is Xwayland
where X11 clients cause wl_surface damage in an extremely fragmented
way. It might result in thousands of tiny damage rectangles, and it
could happen in multiple X11 windows simultaneously. If all that damage
was relayed as-is, it is very possible the Wayland socket would
overflow. (To work around that, there is a limit in Xwayland on how
many rects it is willing to forward, and when that is exceeded, it falls
back to a single bounding-box of damage, IIRC.)
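
The rect limit with a bounding-box fallback described above could look
roughly like the following. This is only a sketch and not Xwayland's
actual code: struct rect, limit_damage() and the max_rects parameter are
hypothetical names made up for illustration, and the real threshold
Xwayland uses is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle type, half-open: [x1, x2) x [y1, y2). */
struct rect {
	int x1, y1, x2, y2;
};

/*
 * Copy the rects to forward into 'out' and return how many there are.
 * 'out' must have room for at least max_rects entries. If the input
 * has more than max_rects rects, a single bounding box covering all
 * of them is emitted instead.
 */
static size_t
limit_damage(const struct rect *in, size_t n,
	     struct rect *out, size_t max_rects)
{
	size_t i;

	assert(max_rects >= 1);

	if (n <= max_rects) {
		/* Few enough rects: forward them unchanged. */
		for (i = 0; i < n; i++)
			out[i] = in[i];
		return n;
	}

	/* Too fragmented: grow one box until it covers every rect. */
	struct rect box = in[0];
	for (i = 1; i < n; i++) {
		if (in[i].x1 < box.x1) box.x1 = in[i].x1;
		if (in[i].y1 < box.y1) box.y1 = in[i].y1;
		if (in[i].x2 > box.x2) box.x2 = in[i].x2;
		if (in[i].y2 > box.y2) box.y2 = in[i].y2;
	}
	out[0] = box;
	return 1;
}
```

A client would then send one damage request (for example
wl_surface.damage_buffer) per returned rect, so the number of requests
per commit is bounded by max_rects no matter how fragmented the
original damage was. The cost is that the bounding box may cover
pixels that did not actually change.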

Blocking might be ok for Xwayland, perhaps, so not the best example in
that sense.

A client could also be trapped in an unthrottled repaint loop, where it
allocates pixel buffers without a limit because a compositor is not
releasing them as fast as it draws, and the general idea is that if you need to
draw and you don't have a free pixel buffer, you allocate a new one.
It's up to the client to limit itself to a reasonable number of pixel
buffers per surface, and that number is not 2. It's probably 4 usually.
A reasonable number could be even more, depending.
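
A client-side cap on pixel buffers per surface might be sketched like
this. Everything here is hypothetical and for illustration only:
struct buffer, struct pool, pool_acquire(), pool_release() and the
MAX_BUFFERS value of 4 (picked from the "probably 4 usually" above)
are assumptions, and real buffer allocation (wl_shm, dmabuf, ...) is
reduced to a flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFERS 4 /* illustrative per-surface cap */

/* Hypothetical bookkeeping for one pixel buffer. */
struct buffer {
	bool allocated; /* backing storage exists */
	bool busy;      /* handed to the compositor, not yet released */
};

struct pool {
	struct buffer bufs[MAX_BUFFERS];
};

/*
 * Return a buffer to draw into, or NULL if the cap is reached.
 * Prefers reusing a released buffer over allocating a new one.
 */
static struct buffer *
pool_acquire(struct pool *p)
{
	struct buffer *unallocated = NULL;
	size_t i;

	for (i = 0; i < MAX_BUFFERS; i++) {
		struct buffer *b = &p->bufs[i];

		if (b->allocated && !b->busy) {
			b->busy = true;
			return b;
		}
		if (!b->allocated && !unallocated)
			unallocated = b;
	}

	if (unallocated) {
		/* Real code would create the storage here. */
		unallocated->allocated = true;
		unallocated->busy = true;
		return unallocated;
	}

	/* All buffers are busy: do not allocate more, skip drawing. */
	return NULL;
}

/* Call when the compositor releases the buffer (wl_buffer.release). */
static void
pool_release(struct buffer *b)
{
	b->busy = false;
}
```

When pool_acquire() returns NULL the client would skip that repaint and
try again after the next release, instead of allocating yet another
buffer. That bounds memory use without blocking the client in a send
call.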

Thanks,
pq

> On Thu, 22 Feb 2024 11:52:28 +0200
> Pekka Paalanen <pekka.paalanen at haloniitty.fi> wrote:
> 
> > On Wed, 21 Feb 2024 11:08:02 -0500
> > jleivent <jleivent at comcast.net> wrote:
> >   
> > > Not completely blocking makes sense for the compositor, but why not
> > > block the client?    
> > 
> > Blocking in clients is indeed less of a problem, but:
> > 
> > - Clients do not usually have requests they *have to* send to the
> >   compositor even if the compositor is not responding timely, unlike
> >   input events that compositors have; a client can spam surfaces all
> >   it wants, but it is just throwing work away if it does it faster than
> >   the screen can update. So there is some built-in expectation that
> >   clients control their sending.
> > 
> > - I think the Wayland design wants to give full asynchronicity for
> >   clients as well, never blocking them unless they explicitly choose
> >   to wait for an event. A client might have semi-real-time
> >   responsibilities as well.
> > 
> > - A client's send buffer could be infinite. If a client chooses to
> >   send requests so fast it hits OOM, it is just DoS'ing itself.
> >   
> > > For the compositor, wouldn't a timeout in the sendmsg make sense?    
> > 
> > That would cause both problems: slight blocking multiplied by number of
> > (stalled) clients, and overflows. That could lead to jittery user
> > experience while not eliminating the overflow problem.
> > 
> > 
> > Thanks,
> > pq
> >   
