[Mesa-dev] [PATCH 3/5] st/mesa: use asynchronous flushes

Nicolai Hähnle nhaehnle at gmail.com
Fri Nov 3 16:34:13 UTC 2017


On 31.10.2017 18:59, Marek Olšák wrote:
> On Sun, Oct 22, 2017 at 9:18 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>
>> ---
>>   src/mesa/state_tracker/st_cb_flush.c   |  4 ++--
>>   src/mesa/state_tracker/st_cb_syncobj.c | 26 ++++++++++++++++++++++++--
>>   2 files changed, 26 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_cb_flush.c b/src/mesa/state_tracker/st_cb_flush.c
>> index c8452d0e6f7..5f4e2ac3cc1 100644
>> --- a/src/mesa/state_tracker/st_cb_flush.c
>> +++ b/src/mesa/state_tracker/st_cb_flush.c
>> @@ -56,21 +56,21 @@ void st_flush(struct st_context *st,
>>   }
>>
>>
>>   /**
>>    * Flush, and wait for completion.
>>    */
>>   void st_finish( struct st_context *st )
>>   {
>>      struct pipe_fence_handle *fence = NULL;
>>
>> -   st_flush(st, &fence, 0);
>> +   st_flush(st, &fence, PIPE_FLUSH_ASYNC | PIPE_FLUSH_HINT_FINISH);
>>
>>      if(fence) {
>>         st->pipe->screen->fence_finish(st->pipe->screen, NULL, fence,
>>                                        PIPE_TIMEOUT_INFINITE);
>>         st->pipe->screen->fence_reference(st->pipe->screen, &fence, NULL);
>>      }
>>
>>      st_manager_flush_swapbuffers();
>>   }
>>
>> @@ -81,21 +81,21 @@ void st_finish( struct st_context *st )
>>    */
>>   static void st_glFlush(struct gl_context *ctx)
>>   {
>>      struct st_context *st = st_context(ctx);
>>
>>      /* Don't call st_finish() here.  It is not the state tracker's
>>       * responsibilty to inject sleeps in the hope of avoiding buffer
>>       * synchronization issues.  Calling finish() here will just hide
>>       * problems that need to be fixed elsewhere.
>>       */
>> -   st_flush(st, NULL, 0);
>> +   st_flush(st, NULL, PIPE_FLUSH_ASYNC);
>>
>>      st_manager_flush_frontbuffer(st);
>>   }
>>
>>
>>   /**
>>    * Called via ctx->Driver.Finish()
>>    */
>>   static void st_glFinish(struct gl_context *ctx)
>>   {
>> diff --git a/src/mesa/state_tracker/st_cb_syncobj.c b/src/mesa/state_tracker/st_cb_syncobj.c
>> index 637fbe3b73a..44323b4750a 100644
>> --- a/src/mesa/state_tracker/st_cb_syncobj.c
>> +++ b/src/mesa/state_tracker/st_cb_syncobj.c
>> @@ -123,22 +123,44 @@ static void st_client_wait_sync(struct gl_context *ctx,
>>
>>   static void st_check_sync(struct gl_context *ctx, struct gl_sync_object *obj)
>>   {
>>      st_client_wait_sync(ctx, obj, 0, 0);
>>   }
>>
>>   static void st_server_wait_sync(struct gl_context *ctx,
>>                                   struct gl_sync_object *obj,
>>                                   GLbitfield flags, GLuint64 timeout)
>>   {
>> -   /* NO-OP.
>> -    * Neither Gallium nor DRM interfaces support blocking on the GPU. */
>> +   struct pipe_context *pipe = st_context(ctx)->pipe;
>> +   struct pipe_screen *screen = pipe->screen;
>> +   struct st_sync_object *so = (struct st_sync_object*)obj;
>> +   struct pipe_fence_handle *fence = NULL;
>> +
>> +   /* Nothing needs to be done here if the driver does not support async
>> +    * flushes. */
>> +   if (!pipe->fence_server_sync)
>> +      return;
>> +
>> +   /* If the fence doesn't exist, assume it's signalled. */
>> +   mtx_lock(&so->mutex);
>> +   if (!so->fence) {
>> +      mtx_unlock(&so->mutex);
>> +      so->b.StatusFlag = GL_TRUE;
>> +      return;
>> +   }
>> +
>> +   /* We need a local copy of the fence pointer. */
>> +   screen->fence_reference(screen, &fence, so->fence);
>> +   mtx_unlock(&so->mutex);
>> +
>> +   pipe->fence_server_sync(pipe, fence);
>> +   screen->fence_reference(screen, &fence, NULL);
>>   }
> 
> What's the reason for implementing st_server_wait_sync?

We could have a program with two threads, each with its own context, 
doing the following:

  Thread 1                     Thread 2
  --------                     --------
  sync = glFenceSync(...)
  glFlush(...)
                               glWaitSync(sync, ...)

With synchronous flushes, glWaitSync(...) could be a no-op: by the time 
glFlush returns, any command buffers for commands prior to glFenceSync 
would already be in flight, and there is no way for thread 2 to 
"overtake" them.

With asynchronous flushes, the command buffers are not necessarily in 
flight yet when glFlush returns. The only guarantee is that the driver 
thread for context 1 is about to process those commands, but it could be 
overtaken by the driver thread for context 2.

Now that I think about this again, si_fence_server_sync still isn't 
strict enough for this case: it waits for the fence to become ready 
(which happens in the deferred flush that was queued by glFenceSync), 
but it doesn't wait for the command buffer submission (which only 
happens with the non-deferred flush that was queued by glFlush).

So si_fence_server_sync does have to properly wait for unflushed fences 
as well.

This actually looks like it might be an old bug. Here's another possibility:

  Thread 1                     Thread 2
  --------                     --------
  sync = glFenceSync(...)
                               glWaitSync(sync, ...)
  glFlush(...)

I believe this is allowed, and it would fail in current Git master. We 
could fix this (both cases) by checking for unflushed fences:

- if the current context is the same as the unflushed context, we can 
return immediately
- otherwise, we need to wait for the unflushed context to be flushed

We could try to be even fancier and add the unflushed context's future 
fence as a dependency to the waiting context's command submission, but 
that seems overly complicated at least for now.

Cheers,
Nicolai
-- 
Learn what the world really is,
but never forget what it ought to be.
