Updates to GLX_EXT_texture_from_pixmap
James Jones
jajones at nvidia.com
Thu Mar 23 12:34:06 PST 2006
On Thursday 23 March 2006 10:58 am, Deron Johnson wrote:
> James Jones wrote on 03/22/06 16:13:
> > Hi Deron,
> >
> > The problem here is, how do you block the rendering? It would
> > be nice if we could put clients who wanted to render to a
> > particular drawable to sleep. However, this would be very hard
> > to do. The current dispatch mechanism isn't capable of this.
>
> I don't know the details of your driver implementation but in the
> drivers I have worked with it would not be all that hard to block
> the rendering process. First of all, it's easy to block X core
> protocol and GLX rendering clients. For direct clients, they
> are required to acquire a lock to access device resources (such
> as direct access to the screen or DMA buffers). You just hold
> off granting them the lock if their current drawable is bound.
> Even if they've already grabbed a DMA buffer, don't let it be
> posted to the hardware command buffer until the drawable is
> unbound. Again, I don't know the details of your driver
> implementation, but there usually is a way to put rendering
> clients to sleep.
The problems I'm worried about are not specific to our
implementation, nor to direct rendering. I'm more worried about
good old-fashioned core X rendering. It is easy to block out
clients unconditionally. If a display connection grabs the server,
just remove all other file descriptors from your poll list until
that client ungrabs. It is my understanding that it would be much
harder to block out clients on a per-operation + per-drawable
basis. This requires that the server continue accepting commands
from all clients, parse them, and potentially save off an operation
if the client wishes to:
a) perform rendering
b) perform that rendering on particular drawables
The server must then stop accepting commands from that client until
rendering to that drawable is allowed again.
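To make this concrete, here is a rough sketch of the bookkeeping I
mean. To be clear, none of these types or helpers are real X server
internals; they are hypothetical stand-ins for whatever the real
dispatch and scheduling code provides:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for X server internals. */
typedef uint32_t XID;
typedef struct Client Client;
typedef struct {
    bool is_rendering;   /* does this request draw anything?  */
    XID  target;         /* drawable the request operates on  */
} Request;

extern bool DrawableIsBlocked(XID drawable);        /* assumed */
extern void SaveDeferredOp(Client *c, Request *r);  /* assumed */
extern void StopListening(Client *c);  /* stop polling c's fd  */
extern void ExecuteRequest(Client *c, Request *r);  /* assumed */

void DispatchOneRequest(Client *client, Request *req)
{
    if (req->is_rendering && DrawableIsBlocked(req->target)) {
        /* Save the operation, then stop reading from this
         * client's file descriptor until the drawable is
         * unblocked again. */
        SaveDeferredOp(client, req);
        StopListening(client);
        return;
    }
    ExecuteRequest(client, req);
}

The hard part is everything this sketch hides: the server still has
to read and parse every request from every client just to decide
whether to defer it, and then replay the saved operations in order
once the drawable is unblocked.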
If you think this functionality would be easy to add to the X
server, I fully support its addition as a separate X extension.
Perhaps you even know of a coarser criterion we could use that would
be easier to implement than the one I suggest above. Either way,
we could of course support this with direct rendering as you
describe. X is the hard part here.
> Andy seemed to think that there was a way to do the blocking
> in our discussion at the X developers conference. What changed?
Essentially, we came back, discussed the conference with our group,
had a meeting about it, and came up with the concerns I've laid out
here. Even after the breakout at the developers conference I
discussed some of these concerns with Adam Jackson and David
Reveman briefly, and they conceded that while the language "copy on
write semantics" had been used in the discussion, the X server
should not be required to block out rendering to specific
drawables. I came away with the thought that I was being slightly
paranoid and everyone else had this looser understanding of the
conclusion. Perhaps I was doubly mistaken there :-)
> It may be tricky, it may even be difficult, but it has to be
> achieved. Otherwise you end up implementing an extension that no
> composite manager can reasonably use.
It seems odd to argue that no composite manager can reasonably use
the extension when it is currently implemented more or less as
specified in Xgl, and compiz makes use of it quite beautifully.
Wouldn't the current semantics plus an extension that did exactly
what you propose (grab a drawable and prevent other clients from
rendering to it) be equally useful? If it is as easy to implement
as you say it is, all the better.
I'm not saying we don't need to solve the problem of simultaneous
rendering and texturing, I just want to separate the problems.
GLX and OpenGL have never been in the business of performing
complex synchronization between multiple clients (except the
inherent requirements of synchronizing direct rendering in certain
cases that you touch on above). I really don't want to drag it in
now. As I said before, this is a general problem that encompasses
more than this extension, and more than OpenGL. As such, it should
be solved at another level; as an X extension, for example.
> >>The overall desired behave was to give the effect of
> >>conceptually having bind make a copy of the texture.
> >
> > Is this really the overall desire? Others have argued that
> > Bind operations will be too slow, and they would rather the
> > contents were just updated on the fly with no need to bind more
> > than once.
>
> Yes. Arguments were made to that effect. But if we really want
> to minimize the appearance of partial updates in redirected
> GL rendering, we need these semantics. The group as a whole
> agreed to this. I don't want to see us backpedal now.
>
> > I think the best compromise is to guarantee copy on write
> > only if no other rendering occurs.
>
> This is equivalent to no guarantee at all. This is the same as
> saying that the contents during the bind are completely
> undefined.
>
> But I do know that no guarantee of stable contents is light
> years better than causing client rendering errors. Some clients
> die when they receive rendering errors. We don't want clients
> randomly dying because the composite manager happened to hold a
> lock on their drawables at the time.
>
> >>To disallow user rendering while the drawable is bound as a
> >>texture makes the entire extension unusable.
> >
> > The spec does not disallow user rendering while the drawable is
> > bound. It allows the implementors to choose whether or not to
> > support it. This is a compromise. It makes the extension
> > reasonable to specify, implement, and use. Implementations
> > that choose to not handle rendering to bound drawables won't
> > work with many existing applications. In the short term, all
> > the implementations we know of support it to some degree.
> > Again, in the long term, better synchronization solutions are
> > needed.
>
> If you are going to make the stable-while-bound semantic
> platform specific, then you will need some way for the client
> to figure out whether or not it is supported.
In the interest of keeping things as simple as possible, I think it
would be best if all users assumed all implementations were not
"stable-while-bound".
> If my composite manager encounters a device that doesn't support
> stable-while-bound, I probably will just not use the tfp
> extension on that device, choosing instead to revert to using the
> damage/copy mechanism I use now. I would rather have slower,
> artifact-free rendering than fast rendering that has artifacts.
That seems reasonable, and you will always have that option if you
want to sacrifice some speed. However, I don't believe copying is
immune from these artifacts. It's perfectly reasonable, in theory,
for some piece of hardware capable of doing blits and rendering
simultaneously to damage your drawable while you are using OpenGL
to copy data out of it into a texture. Even in current
hardware/drivers, there is nothing guaranteeing a CopyTexImage call
is an atomic operation. It could be broken up into several smaller
blits, in between which other rendering clients could be scheduled
in and render to the drawable. In practice, this would currently be
a rare if not nonexistent problem, but there isn't anything in the
specifications prohibiting it. OpenGL operations in one process
happen out of band from other OpenGL processes' operations and from
X rendering. Some other synchronization is still needed.
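To be concrete about the copy path I mean: the damage/copy fallback
boils down to something like the following. A sketch, assuming a
GLXPixmap for the redirected window is already current as the read
drawable and the texture has been set up elsewhere:

#include <GL/gl.h>

/* Copy one damaged region of the current read drawable into a
 * texture. Nothing in the GL or GLX specifications makes this copy
 * atomic with respect to other clients rendering into the same
 * drawable; an implementation may legally split it into several
 * smaller blits, between which other rendering can be scheduled. */
void copy_damaged_region(GLuint tex, int x, int y,
                         int width, int height)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0 /* level */,
                        x, y,  /* offset within the texture   */
                        x, y,  /* region of the read drawable */
                        width, height);
}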
> And, if only copy-on-bind devices can provide the
> stable-while-bound semantic, then unless they provide a
> significant speed up over the current damage/copy mechanism
> (which I doubt) then I won't end up using the extension at all.
>
> Before proceeding further, I would suggest that you implement a
> version of the extension in the nvidia driver that does not
> implement stable-while-bound and let's plug it into Looking Glass
> and see how it looks. This will tell us whether it's worth
> jumping through hoops to achieve the stable-while-bound semantic.
> Once we know how bad it is then we will know better how to
> proceed.
I think it would be best to get as close as we can to a consensus
before shipping something. Version-revving extensions can be a
nightmare. From early testing with our implementation and compiz,
things look great for the most part. Is there visible tearing
sometimes? Yes. Is it unusable? No, far from it. Is the
tearing something we should be forced to live with? No, definitely
not. Let's get this solved, but in some way other than burdening
this specification.
Thanks,
-James Jones