[RFC 0/9] gbm_surface

Tue Jan 17 01:27:02 PST 2012

On 12/21/2011 04:52 AM, Kristian Høgsberg wrote:
> On Wed, Dec 14, 2011 at 4:59 AM, Ander Conselvan de Oliveira
> <conselvan2 at gmail.com>  wrote:
>> From: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira at intel.com>
>>
>> Hi,
>>
>> So, in London we discussed moving the drm compositor to a new
>> gbm_surface interface, which would allow the buffer management
>> to be handled by the egl implementation by calling eglSwapBuffers.
>>
>> Here's a prototype implementation of such an interface in mesa and a
>> patch to wayland-demos moving compositor-drm to it. With the changes,
>> the compositor creates a gbm_surface for each output and from that it
>> creates egl surfaces.
>>
>> The compositor is still responsible for the actual flipping. In order
>> to do that, the new interface includes a function for getting a bo for
>> a gbm_surface's front buffer. This change does not impact the ability
>> of scanning out a client buffer.
>
> Hi Ander,
>
> Thanks for getting a prototype implementation up so fast.  It looks
> pretty good, but I think there are a few corner cases we need to think
> about.  Apologies for the long, rambling email.  First off, I talked to
> Robert Bragg in IRC a bit and he had a few concerns/ideas:
>
>   - renaming gbm_surface_get_bo() to gbm_surface_lock_bo()?  I like it;
> it's a little more specific and pairs better with
> gbm_surface_release_bo().

Or even gbm_surface_lock_front_buffer(). It seems that some confusion
about what this function does led to points 4 and 6 below.

> My concerns are mainly around sequence and ordering of eglSwapBuffer,
> gbm_surface_lock_buffer and gbm_surface_release_buffer:
>
> 4) What happens if we do two or more swapbuffers without locking the
> resulting bo?  Obviously we can't lock more buffers than we've
> "generated" with eglSwapBuffer, so that has to be an error.  But we
> could eglSwapBuffer a few times before we do
> gbm_surface_lock_buffer().  Is that useful and should that be legal?
> Or should we require that eglSwapBuffer must always be followed by a
> gbm_surface_lock_buffer() before you can do another eglSwapBuffer?

The way I implemented this, lock_buffer() always returns the surface's 
front buffer, i.e., there's no queuing. The user may call eglSwapBuffers 
twice without calling lock, but that means a frame will be skipped.

Off the top of my head, I can't think of any case that would benefit
from having the buffers queue up in the surface, and queuing would add
complexity.

Note that the rendering of the next frame can start independently of the
locking of the front buffer. The only constraint is that the front buffer
should be locked before the next call to eglSwapBuffers.
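
To illustrate the no-queuing behaviour, here is a toy model in C. It is
only a sketch: the toy_front_* names are hypothetical and are not part
of the prototype, and a plain frame counter stands in for real buffers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not the gbm API: the surface only remembers
 * which frame currently sits in its front buffer. */
struct toy_front {
    int next_frame;   /* id of the frame being rendered to the back buffer */
    int front_frame;  /* id of the frame in the front buffer, -1 if none */
};

void toy_front_init(struct toy_front *s)
{
    assert(s != NULL);
    s->next_frame = 0;
    s->front_frame = -1;
}

/* Stands in for eglSwapBuffers(): the finished back buffer becomes the
 * front buffer and replaces whatever was there. Nothing is queued. */
void toy_front_swap(struct toy_front *s)
{
    s->front_frame = s->next_frame++;
}

/* Stands in for the lock call: always returns the current front frame. */
int toy_front_lock(const struct toy_front *s)
{
    return s->front_frame;
}
```

Swapping twice and then locking returns frame 1 in this model; frame 0
is never seen by the caller, which is the skipped frame mentioned above.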

> 5) We can't block rendering if there are no buffers to render to - we
> just don't have a mechanism to do that.  If we try to render and all
> the buffers are locked, we can either allocate a new buffer, overwrite
> a locked buffer, just discard the rendering, but it's probably
> actually just an error case, really.  So to avoid the client hitting
> this error case, we need a gbm_surface_can_render() type of function.
> If we knew how many buffers the surface was using, we could keep track
> ourselves, but the point is to avoid that.  So before rendering, the
> client can check gbm_surface_can_render() (it really needs a better
> name) and if that's true, it knows it can render a new frame.  And I
> think we should require that even after releasing a bo.  You could
> argue that the client knows there's a buffer available just after
> releasing one back to the surface, but I think we need a little
> flexibility there to avoid assumption about the implementation and the
> number of buffers.

Perhaps gbm_surface_is_back_buffer_locked()? This way the gbm_surface
interface assumes there's always at least a front and a back buffer,
but the implementation is free to use more than that.
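
A minimal sketch of what such a check could look like, assuming the
implementation keeps a per-buffer lock flag. The toy_locks struct and
function name are hypothetical and only model the idea:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BUFFERS 3

/* Hypothetical model: the caller only sees "front" and "back", while
 * the implementation may keep two or three buffers behind that. */
struct toy_locks {
    int  num_buffers;               /* 2 = double, 3 = triple buffered */
    bool locked[TOY_MAX_BUFFERS];   /* true while the compositor holds it */
};

/* True when every buffer is held by the compositor, so nothing is left
 * to serve as a back buffer and rendering must not start. */
bool toy_is_back_buffer_locked(const struct toy_locks *s)
{
    assert(s->num_buffers >= 2 && s->num_buffers <= TOY_MAX_BUFFERS);
    for (int i = 0; i < s->num_buffers; i++) {
        if (!s->locked[i])
            return false;   /* at least one buffer can be rendered to */
    }
    return true;
}
```

The caller never learns how many buffers exist; it only asks whether
rendering can start, which leaves the buffer count to the implementation.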

> 6) Ordering of client buffers and gbm_surface buffers.  If we require
> that eglSwapBuffer must be followed by gbm_surface_lock_buffer()
> before you can render again (and I think we should), this isn't an
> issue.  So this is probably mostly hypothetical, but here it goes: if
> we allowed the sequence eglSwapBuffer, eglSwapBuffer,
> gbm_surface_lock_bo(), then there would still be a bo queued up in the
> surface that we would get on the next lock bo call.  As we mix the gbm
> surface with pageflipping to client buffers, we need to make sure that
> the sequence of buffers that we flip to is what we want it to be.  If
> we pageflip to a client buffer with a stale buffer sitting in the
> gbm_surface, we'll get an out-of-order frame as we eventually go back
> to pageflipping to the gbm surface buffers.  This is basically the
> reason I think we should disallow this and mandate that you must lock the
> buffer you just rendered before starting a new one.  An alternative
> solution would be to have a way to push client buffers into the
> gbm_surface, so that they appear in the same stream of buffers, in the
> expected order, but that seems much more complicated than just
> requiring that buffers can't queue up in the surface.

I agree that there should be no queuing up in the surface.

> 7) Finally, we need to change the rendering loop to actually make
> triple buffering work the way we want.  Right now we always render in
> response to the pageflip event.  Which means that when we can't render
> a frame in less than the frame time, we end up dropping to half the
> refresh rate.  Triple buffering can help us in two ways, depending on
> the hardware: either it will just let us render as fast as possible
> (ie, we get 50fps when the render time is 20ms), or if the gl stack
> has a deep pipeline that lets us overlap rendering of two frames (eg,
> a frame first takes 10ms of cpu processing, then 10ms on the gpu), in
> which case triple buffering could let us hit 60fps (ie, it takes 20ms
> to render one frame, but because of pipelining, we can produce one
> every 10ms).  Either way, from the compositor's point of view, the two
> cases are the same: instead of rendering whenever the pageflip is
> done, we need to render whenever we can (thus
> gbm_surface_can_render()).  So essentially, when we're done with one
> frame, we ask can_render() right away, and if true, start rendering
> the next frame.  If we can't render, we have to wait until a pageflip
> finishes and we release a buffer, at which point we ask can_render()
> again, and render the next frame if true.  For a double buffered gbm
> surface, this is the same behavior as we have now.

When we draw a frame we consume all the accumulated damage, so we can't
just start rendering one frame after the other. We need to wait until
more damage comes in. To triple buffer we need to change
schedule_repaint to trigger the rendering of a second frame while we are
still waiting for the current one to be flipped.
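
Roughly, the scheduling logic could look like the sketch below. This is
a toy model, not the compositor code: toy_output and its functions are
hypothetical names, and "rendering" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-output state for the repaint loop. */
struct toy_output {
    bool triple_buffer;   /* allow one frame to be rendered ahead */
    bool damage_pending;  /* damage accumulated since the last repaint */
    bool flip_pending;    /* a pageflip was queued and has not completed */
    bool frame_queued;    /* a finished frame is waiting for that flip */
    int  frames_rendered;
};

/* Render only when there is damage and a buffer to render into. */
bool toy_schedule_repaint(struct toy_output *o)
{
    if (!o->damage_pending)
        return false;               /* nothing new to draw */
    if (o->flip_pending) {
        if (!o->triple_buffer || o->frame_queued)
            return false;           /* no free buffer until a flip completes */
        o->frame_queued = true;     /* render ahead into the third buffer */
    } else {
        o->flip_pending = true;     /* render and flip right away */
    }
    o->damage_pending = false;      /* a repaint consumes all the damage */
    o->frames_rendered++;
    return true;
}

/* New damage arrived: record it and try to repaint. */
void toy_add_damage(struct toy_output *o)
{
    o->damage_pending = true;
    toy_schedule_repaint(o);
}

/* Pageflip completed: flip to the queued frame, if any, then repaint. */
void toy_flip_done(struct toy_output *o)
{
    assert(o->flip_pending);
    o->flip_pending = o->frame_queued;
    o->frame_queued = false;
    toy_schedule_repaint(o);
}
```

With triple_buffer off, a second toy_add_damage() while a flip is
pending renders nothing until toy_flip_done(); with it on, the second
frame is rendered immediately and waits for the flip.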

There's also something I missed before. We can be sure that the rbo's
contents don't change, but with the gbm surface we need to set
EGL_SWAP_BEHAVIOR to EGL_BUFFER_PRESERVED (for the double buffered
case), which requires a config with EGL_SWAP_BEHAVIOR_PRESERVED_BIT.
For triple buffering I think we want/need Robert Bragg's
EGL_INTEL_buffer_age extension.

> However, if we unconditionally enable triple buffering, we'll render
> an extra frame ahead for the fast case (ie when we could do 60fps
> without triple buffering), so there's a latency penalty and we're
> wasting memory on a third bo we don't need.  So we need to be able to
> enable and disable triple buffering as needed: in some cases we'll
> never need it, other cases we need it all the time, and some cases
> we'll need to turn it on and off as the load changes.  We can either
> have the compositor detect when it's dropping frames and provide an
> explicit gbm API call to enable/disable triple buffering or perhaps to
> increase and decrease the number of buffers in use.  Or we can provide
> the gbm surface with the bo presentation time when we release it back
> to the surface and have the surface figure out when we're dropping
> frames and automatically manage the buffer count.  However for that to
> work, the surface also needs to know the refresh rate and maybe more.

With the changes needed in schedule_repaint and friends, I think we
could decide whether or not to triple buffer in the compositor. If we
don't want triple buffering, we simply don't start rendering before
the current flip is complete. We only allocate a third buffer on the EGL
side if one is actually needed, and we free the third buffer if we detect
that both the back and third buffers are free, so this should not incur
any memory overhead.
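
The allocation policy could be sketched as below. This is a counting
model only, under the assumption that two buffers are always allocated;
toy_pool and its functions are hypothetical and no real buffers are
created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer accounting for one surface. */
struct toy_pool {
    int allocated;  /* buffers currently allocated: 2 or 3 */
    int in_use;     /* buffers on screen or waiting for a flip */
};

void toy_pool_init(struct toy_pool *p)
{
    assert(p != NULL);
    p->allocated = 2;   /* front and back always exist */
    p->in_use = 0;
}

/* Take a buffer to render into. A third buffer is allocated only when
 * both existing ones are still held; a fourth is never allocated. */
bool toy_pool_acquire(struct toy_pool *p)
{
    if (p->in_use == p->allocated) {
        if (p->allocated == 3)
            return false;
        p->allocated++;
    }
    p->in_use++;
    return true;
}

/* Give a buffer back. If two buffers are now free while three are
 * allocated, the third one is not needed any more and is dropped. */
void toy_pool_release(struct toy_pool *p)
{
    assert(p->in_use > 0);
    p->in_use--;
    if (p->allocated == 3 && p->allocated - p->in_use >= 2)
        p->allocated--;
}
```

In this model a compositor that always waits for the flip before
rendering never has more than two buffers in use, so the third one is
never allocated.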

Regards,
Ander