monitoring the contents of the screen -- bug#14648

Mon Aug 25 18:54:12 PDT 2008

(Just realized that I originally sent this reply only to Keith; how
embarassing.)

On Sat, Aug 16, 2008 at 9:00 PM, Keith Packard <keithp at keithp.com> wrote:
> At the least, a precise specification of the desired behaviour, at the
> best, code that implements both server and useful client using the new
> behaviour.

Precise specification: damageproto.txt, in specifying DamageSubtract, says:
  196         Synchronously modifies the regions in the following manner:
  197
  198             If repair is None:
  199
  200                 1)      parts = damage
  201                 2)      damage = <empty>
  202
  203             Otherwise:
  204
  205                 1)      parts = damage INTERSECT repair
  206                 2)      damage = damage - parts
  207                 3)      Generate DamageNotify for remaining damage areas
My proposed semantics are: strike line 207 from the spec.

Implementing these semantics is also very simple:
 --- damageext.c.upstream
 +++ damageext.c
 @@ -274,2 +274 @@ ProcDamageSubtract (ClientPtr client)
 -       if (DamageSubtract (pDamage, pRepair))
 -           DamageExtReport (pDamage, DamageRegion (pDamage), (void
*) pDamageExt);
 +       DamageSubtract (pDamage, pRepair);
Life is more complicated, of course, if we maintain backwards
compatibility, but that should make clear what the idea is.  (It's not
*much* more complicated, the backwards-compatible version would just
be to add another mode that is exactly like
DamageReportDeltaRectangles but disables the above re-reporting of
damaged areas.)

The useful client code is pretty straightforward as well.  The
simplest case is where we have a client that simply reacts to each
damage rectangle immediately -- perhaps a screen recorder.
Pseudo-code in Python:
      def handle_damage_event(ev):
          XDamageSubtract(display, damage_id, ev.area, None)
          copy_window_rectangle_to_video_stream(window, ev.area)
This seems simple and correct, but with the current semantics has a
nasty result -- as we're processing one damage event, there may be a
bunch more in flight from the server that we haven't received yet (or
sitting in our event queue, same thing).  When we call XDamageSubtract
here, line 207 says that all those events get re-sent.  Now we have
two copies in flight towards us.  Then the next event we process
causes *another* flurry of re-notifications, so eventually there's a
~quadratic blowup in the number of notifications we receive.

The client that I'm personally interested in suffers even more; my
code watches the contents of windows and then ships those contents off
across a network connection (it's a remote application viewer).  So
one thing it has to deal with is network congestion; instead of
processing damage immediately, it uses a client-side GdkRegion to
track which parts of a window are currently damaged, and then whenever
its socket becomes writeable, it picks a dirty rectangle, calls
DamageSubtract, and then ships the rectangle's contents off down the
socket.  This means that, depending on network conditions, we're not
just dealing with whatever DamageNotify's are in-flight down the X
connection, we may have a huge number of unhandled damaged rectangles.
 I originally stumbled onto this bug when a user asked why it was
taking multiple minutes to update a single terminal window... Real
code:
  http://partiwm.org/browser/xpra/server.py?rev=b0059e8839b2e5978bc9f8f3eadb460c95979d6d#L134

> Leaving the current semantics untouched means we don't have
> to worry about compatibility though, even if the current mechanism are
> horribly broken.
>
> And, I'm not worried about impolitic usage here. The current
> specification and implementation must both stand on their merits, if
> they have none, we should rewrite the spec to guide developers towards
> more useful designs.

Right.  Obviously, I think that the current spec is broken or I
wouldn't be raising a fuss, but choosing the best way to fix it needs
a judgment call, and I'm probably not the one to make it.  I think the
two options are pretty clear -- either disabling re-notifications
unconditionally, or adding (a) new damage mode(s) that disable them.
And the relevant data I have is that I cannot think of any way to use
Damage that is correct in the presence of re-notifications but broken
otherwise -- maybe you can, I'm really curious -- and furthermore that
all of the actual damage-using code I can find would be helped rather
than hurt by this change.  What do we conclude from this?  You tell me
:-).

-- Nathaniel