[PATCH xextproto 1/4] Document changes in XSync version 3.1

James Jones jajones at nvidia.com
Wed Aug 18 12:36:58 PDT 2010


On Wednesday 18 August 2010 8:33:49 am Adam Jackson wrote:
> 
> On Tue, 2010-08-17 at 15:54 -0700, James Jones wrote:
> > On Tuesday 17 August 2010 15:01:53 Aaron Plattner wrote:
> > > James, correct me if I'm wrong, but I don't think the server ever
> > > creates fences on its own; they're *all* client-created.  Fences
> > > trigger when the rendering for the corresponding X screen is done, for
> > > requests that were processed before the triggering request.  Rendering
> > > could still be pending on other screens, and there could be later
> > > rendering queued on the same screen for later requests.

Adam and I discussed on IRC.  I'll try to summarize a bit below for the list:

> So, a few things here:
> 
> One: Why the screen and not the drawable?  If an X screen is a single
> command queue to the hardware with no special ordering (besides
> "whatever order clients happened to be serviced") then it doesn't really
> matter.  But if you have multiple queues - or schedule rendering among
> clients into one queue for fairness - then you now need to keep track of
> actual request processing order among _all_ clients with active Fences.

The screen is somewhat arbitrarily chosen as the granularity at which fence 
triggers are inserted.  It seemed convenient because fences already have a 
screen associated with them at creation time, mainly for server/driver design 
reasons, so no additional associating object is needed at trigger time.  If 
others think there is a large benefit to adding a drawable argument to trigger 
and specifying that it only means rendering targeting that drawable is 
complete, I can modify the spec accordingly.  However, keep in mind that other 
trigger operations can be added with different granularity.  
DamageSubtractAndTrigger, for example, is specified to trigger at region 
granularity.  I still think screen is the right granularity for the default 
operation.  If there are multiple rendering queues per screen, that means 
waiting for all of them to finish pending work before triggering, but that 
shouldn't be especially complicated.

The important detail to note is that even though fences belong to a screen, 
they can be triggered at any time.  It is the individual triggering operations 
that define when the fence becomes triggered.
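
For illustration, here is a minimal sketch of the default per-screen trigger 
semantics.  The XSync* fence calls assume a libXext-style client binding of 
the CreateFence/TriggerFence/AwaitFence/ResetFence requests; the final 
library names could end up differing.

/*
 * Create a fence on a screen (via a drawable on that screen), trigger it,
 * and wait for it.  The CPU wait at the end is only for illustration; the
 * point of the design is to let other APIs wait on the GPU instead.
 */
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/extensions/sync.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    int ev, err, major, minor;

    if (!dpy || !XSyncQueryExtension(dpy, &ev, &err) ||
        !XSyncInitialize(dpy, &major, &minor)) {
        fprintf(stderr, "SYNC extension unavailable\n");
        return 1;
    }

    /* The fence is associated with the screen of the drawable at creation. */
    XSyncFence fence = XSyncCreateFence(dpy, DefaultRootWindow(dpy), False);

    /*
     * TriggerFence: the fence becomes triggered once rendering on that
     * screen from requests processed before this one has completed.
     */
    XSyncTriggerFence(dpy, fence);

    /* Block until the fence is triggered. */
    XSyncAwaitFence(dpy, &fence, 1);

    XSyncResetFence(dpy, fence);   /* fences are reusable */
    XSyncDestroyFence(dpy, fence);
    XCloseDisplay(dpy);
    return 0;
}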

> Two: I'm not clear what the mechanism is for asking that the server
> trigger a Fence once some X rendering is complete.  You have Trigger and
> Reset operations, but if I'm reading those correctly those are commands
> from the client and not stalls until a condition is reached.
> 
> In GL_NV_fence terms, I think you have:
> 
> GL                X
> GenFences         CreateFence
> DeleteFences      DestroyFence
> SetFence          ???
> N/A               TriggerFence

This is incorrect.  The corrected table should read:

...
SetFence            TriggerFence
N/A                 ResetFence
...

> N/A               ResetFence
> TestFence         QueryFence
> FinishFence       AwaitFence
>
> It's okay to not have analogs for Trigger/Reset in GL, because GL fences
> are scoped to the GL context; you couldn't use them for IPC if you
> wanted to.  But without an analog for SetFence in X, you don't have a
> way of getting the server to bind a fence to a particular point in the X
> rendering stream.
>
> But maybe you're going for something more like GL_ARB_sync?  Help me out
> here.

X sync fences are meant to interoperate with GL_ARB_sync-style fences.  
However, in contrast to GL sync objects, I've separated out the creation and 
trigger steps.  I did this because, unlike GL sync objects, X fences are 
expensive to share.  While GL fences are essentially pointers with process 
scope, X fence sync objects are cross-process X resources with X screen scope 
that can be imported, along with their hardware resources, into other APIs in 
any process.  Since X fence sync objects are meant to be reused, in contrast 
to GL sync objects, which are single-use, I added the reset operation as well.
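
To make the lifetime difference concrete, here is a rough sketch contrasting 
the reuse pattern intended for X fences with the single-use pattern of 
GL_ARB_sync objects (again assuming libXext-style names for the fence 
requests; the GL calls are standard ARB_sync):

#define GL_GLEXT_PROTOTYPES
#include <X11/Xlib.h>
#include <X11/extensions/sync.h>
#include <GL/gl.h>
#include <GL/glext.h>

/* X fence: one (relatively expensive) creation, many cheap reuses. */
static void x_fence_reuse(Display *dpy, int nframes)
{
    XSyncFence fence = XSyncCreateFence(dpy, DefaultRootWindow(dpy), False);
    for (int i = 0; i < nframes; i++) {
        XSyncTriggerFence(dpy, fence);    /* mark a point in the X stream  */
        XSyncAwaitFence(dpy, &fence, 1);  /* or wait on the GL side        */
        XSyncResetFence(dpy, fence);      /* back to untriggered, reusable */
    }
    XSyncDestroyFence(dpy, fence);
}

/* GL_ARB_sync: a new, single-use object per synchronization point. */
static void gl_sync_single_use(int nframes)
{
    for (int i = 0; i < nframes; i++) {
        GLsync s = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
        glClientWaitSync(s, GL_SYNC_FLUSH_COMMANDS_BIT, 1000000000ull /* 1s */);
        glDeleteSync(s);                  /* cannot be reset or reused */
    }
}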

> Three: XSyncSetFence could nicely resolve issue one.  Fences would then
> have basically the same scoping rules as GCs; they're per-screen, but
> they can only be used on one drawable at a time.  From X's perspective
> though, you wouldn't necessarily need them to be per-screen at creation
> time; ProcSyncSetFence would do both pScreen->RealizeFence() and
> pScreen->SetFence().  But I can see why you might want that binding
> early for GL.

You're correct.  Creation and especially GL import can be expensive.  That's 
what drove the design.

> Four: The Xinerama interaction here is potentially subtle.  There's only
> one protocol screen in Xinerama, so merely creating the Fence relative
> to a screen doesn't tell you which GPU you're bound to.  You'd probably
> need to do the standard trick of blasting the Fence across all
> ScreenRecs, and then Await would wait for completion across them all.
> Potentially inefficient, but that's par for the Xinerama course.

Yeah, it's an important detail, but not especially complicated.  Xinerama is 
what it is.  Fences are no different than a rendering operation in that 
regard.

> > I'll make another pass and try to capture more of the above in the spec,
> > in addition to making more intrusive updates to the sections you
> > reference above. I'll send those out when I send out the xserver
> > changes.  I tried a few directions with the server code, and feel the
> > current spec/implementation fell out most naturally from the existing
> > architecture, so having the code to look at side-by-side with the proto
> > patches might help.  If anyone wants to get ahead of me, the latest
> > xserver changes are available on my github repos at
> > 
> > http://github.com/cubanismo/xserver/tree/fence_sync
> 
> Nice, I'll give that a read.
> 
> > > The idea is that a client creates a Fence on a given X screen, binds it
> > > to an OpenGL sync object using a to-be-created GLX extension, then in
> > > response to an XDamageNotify, sends DamageSubtractAndTrigger, tells
> > > OpenGL to wait for the corresponding OpenGL side of the fence, and
> > > then performs the rendering using OpenGL.  This makes the GL wait (on
> > > the GPU, ideally) for the X rendering on that screen to finish without
> > > making the client itself wait on the CPU.
> > 
> > See the damageproto changes for a bit more background, but they basically
> > just say what Aaron said above.
> 
> This sounds like you're designing it just for the compositor to consume.
> I'm not really sure yet whether that's a reasonable thing.
> 
> I kind of want to eliminate the SubtractAndTrigger there, it seems like
> you should be able to go straight from a FenceNotify (which, admittedly,
> your proposal does not have) to the glFinishFence.  To elaborate:
> intelligent clients would have a window property naming their Fence, and
> would XSyncSetFence when they post a new frame; the compositor would get
> an AwaitNotify for that and then glFinishFence as usual.  (The existing
> frame sync protocol is not far off from this as it is.)
> 
> Maybe I'm overthinking this.

This is my thinking as well, but fence objects are still a necessary 
precursor.  SubtractAndTrigger is just for dumb/legacy X applications.  
New/Smart apps will use something similar to what you describe above.  Again, 
see these slides from last year for the long-term, high-level goals I'm 
working towards:

http://people.freedesktop.org/~aplattner/x-presentation-and-synchronization.pdf

In the end, it will be possible (though not a requirement) to push all 
synchronization down to the GPU/rendering backend without any CPU waits.
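
As a rough sketch of the compositor-side flow described above, with the 
not-yet-final pieces clearly stubbed out: XDamageSubtractAndTrigger() below 
stands in for the proposed damageproto request, and import_x_fence_into_gl() 
stands in for the to-be-created GLX extension that binds an X fence to a GL 
sync object.  glWaitSync() is standard GL_ARB_sync.

#define GL_GLEXT_PROTOTYPES
#include <X11/Xlib.h>
#include <X11/extensions/Xdamage.h>
#include <X11/extensions/sync.h>
#include <GL/gl.h>
#include <GL/glext.h>

/* Hypothetical: binds an X fence to a GL sync object; name and signature
 * are placeholders for the to-be-created GLX extension. */
extern GLsync import_x_fence_into_gl(Display *dpy, XSyncFence fence);

/* Placeholder for the proposed DamageSubtractAndTrigger request from the
 * in-progress damageproto changes; the real signature may differ. */
extern void XDamageSubtractAndTrigger(Display *dpy, Damage damage,
                                      XserverRegion repair,
                                      XserverRegion parts,
                                      XSyncFence fence);

/* gl_fence is assumed to have been obtained once, up front, via
 * import_x_fence_into_gl(dpy, fence). */
static void composite_damaged_window(Display *dpy, Damage damage,
                                      XSyncFence fence, GLsync gl_fence)
{
    /* Subtract the damage and ask the server to trigger the fence once the
     * rendering that produced it has completed on that screen. */
    XDamageSubtractAndTrigger(dpy, damage, None, None, fence);

    /* Make the GPU, not the client, wait for the X rendering to finish... */
    glWaitSync(gl_fence, 0, GL_TIMEOUT_IGNORED);

    /* ...then composite the window with GL as usual. */

    /* Once the GL work that waits on the fence has completed, the fence can
     * be reset with XSyncResetFence() and reused for the next frame. */
}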

Thanks,
-James

> - ajax
> 

