[Mesa-dev] [PATCH 09/17] mesa/st: add support for waiting for semaphore objects
Marek Olšák
maraeo at gmail.com
Fri Nov 3 18:15:12 UTC 2017
FLUSH_VERTICES should be called by mesa/main if it's inside a GL call.
st/mesa should only flush bitmaps.
Marek
On Fri, Nov 3, 2017 at 6:36 PM, Andres Rodriguez <andresx7 at gmail.com> wrote:
>
>
> On 2017-11-03 05:17 AM, Nicolai Hähnle wrote:
>>
>> On 02.11.2017 04:57, Andres Rodriguez wrote:
>>>
>>> Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject
>>>
>>> Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
>>> ---
>>> src/mesa/state_tracker/st_cb_semaphoreobjects.c | 28
>>> +++++++++++++++++++++++++
>>> 1 file changed, 28 insertions(+)
>>>
>>> diff --git a/src/mesa/state_tracker/st_cb_semaphoreobjects.c
>>> b/src/mesa/state_tracker/st_cb_semaphoreobjects.c
>>> index 47ece47..4cff3fd 100644
>>> --- a/src/mesa/state_tracker/st_cb_semaphoreobjects.c
>>> +++ b/src/mesa/state_tracker/st_cb_semaphoreobjects.c
>>> @@ -1,5 +1,7 @@
>>> +
>>> #include "main/imports.h"
>>> #include "main/mtypes.h"
>>> +#include "main/context.h"
>>> #include "main/externalobjects.h"
>>> @@ -47,10 +49,36 @@ st_import_semaphoreobj_fd(struct gl_context *ctx,
>>> #endif
>>> }
>>> +static void
>>> +st_server_wait_semaphore(struct gl_context *ctx,
>>> + struct gl_semaphore_object *semObj)
>>> +{
>>> + struct st_semaphore_object *st_obj = st_semaphore_object(semObj);
>>> + struct st_context *st = st_context(ctx);
>>> + struct pipe_context *pipe = st->pipe;
>>> +
>>> + _mesa_flush(ctx);
>>
>>
>> What's the justification of this flush?
>
>
> Thanks for all the review feedback :)
>
> Technically according to the spec we shouldn't need to flush here. But the
> flush is required due to how amdgpu handles syncobjs.
>
> For amdgpu, the wait and signal operations happens on a submission boundary.
> I.e, we can say "wait for this syncobj before executing this submission", or
> "signal this syncobj after this submission completes". But we can't really
> introduce a wait/signal in the middle of a commandstream as the spec
> requires us to do.
>
> This flush (and the one in st_server_signal_semaphore) exist to adjust the
> submission boundaries. If we align the submission boundary with the
> semaphore operation boundary we can accurately emulate the timing of the
> semaphore operations.
>
> In this case, when st_server_wait_semaphore() is called, the user expects
> that any work queued up until this point will not wait for the semaphore.
> Any work submitted after the wait call returns should wait for the semaphore
> to signal. Hence, we call flush so that the amdgpu winsys will submit all
> the current work. Then we queue up the syncobj for future submissions to
> wait on.
>
> In the signal case, we flush so that the syncobj will signal when the
> current work finishes (instead of at "current work + N"). That way we don't
> have any unnecessary latency.
>
> However, I'm not 100% certain my approach here is the best solution. My main
> concern is that an amdgpu specific detail is surfaced in what should be a
> pipe agnostic layer. If you have any recommendations on how to move this
> code into the pipe layer, or achieve the same results with a different
> strategy I'd love to take the advice.
>
> From the performance point of view, I'm not as concerned. Wait/signal tend
> to be aligned at frame boundaries so flushing there shouldn't be a big deal.
> But I do think that improving the code structure from my current iteration
> would have value.
>
>
>>
>>
>>> + pipe->semobj_wait(pipe, st_obj->semaphore);
>>> +}
>>> +
>>> +static void
>>> +st_server_signal_semaphore(struct gl_context *ctx,
>>> + struct gl_semaphore_object *semObj)
>>> +{
>>> + struct st_semaphore_object *st_obj = st_semaphore_object(semObj);
>>> + struct st_context *st = st_context(ctx);
>>> + struct pipe_context *pipe = st->pipe;
>>> +
>>
>>
>> You have to flush state-tracker internal state here. This means
>> FLUSH_VERTICES, st_flush_bitmap, possibly others. I don't think we have this
>> factored out well yet, but see below.
>
>
> Thanks for pointing this out, this actually gives me an idea for getting rid
> of the full flushes, see below.
>
>>
>>
>>> + pipe->semobj_signal(pipe, st_obj->semaphore);
>>> + _mesa_flush(ctx);
>>
>>
>> What's the justification of this flush? Does EXT_external_objects contain
>> any language to guarantee that SignalSemaphoreEXT will "finish in finite
>> time"? If no, the flush should probably not be there. If yes, please add a
>> spec quote.
>
>
> See above for the flush reasoning.
>
>>
>> Furthermore, this whole code section is a strong hint that merging fences
>> and semaphores at the Gallium level is indeed a good idea. Basically, the
>> idea is that you'd end up calling
>>
>> pipe->flush(pipe, &st_obj->semaphore, flags);
>>
>> where flags is PIPE_FLUSH_DEFERRED | PIPE_FLUSH_REUSE_FENCE or
>> PIPE_FLUSH_ASYNC | PIPE_FLUSH_REUSE_FENCE, depending on how the previous
>> paragraph is resolved. In one of my recent series that includes deferred
>> fences for threaded contexts, I'm already doing the equivalent of adding a
>> "PIPE_FLUSH_REUSE_FENCE". We could formalize this as a more broadly
>> applicable feature in the Gallium interface.
>>
>
> Thanks for pointing out PIPE_FLUSH_REUSE_FENCE, I wasn't aware of this flag.
>
> My reasoning for keeping semaphore operations separate is that a different
> pipe implementation might be able to emit a wait/signal packet into the
> middle of the commandstream, with no flushing required.
>
> However, that argument has a giant hole in it. I have a _mesa_flush() which
> completely negates any benefits of a "pipe agnostic" implementation. So in
> this case, as you suggested, I think merging semaphores with re-usable
> fences would be a good idea.
>
> I do have one alternative proposal try to keep the option of
> mid-commandstream semaphores open. How about something like:
>
> flush_state_tracker_internal_state() {
> flush_vertices();
> flush_bitmaps();
> }
>
> st_server_wait_semaphore() {
> flush_state_tracker_internal_state();
> pipe->semobj_wait(pipe, st_obj->semaphore);
>
> // The amdgpu semobj_wait() implementation would
> // then be responsible for adjusting the submission
> // windows internally, i.e. calling flush()
> }
>
>
> st_server_signal_semaphore() {
> flush_state_tracker_internal_state();
> pipe->semobj_signal(pipe, st_obj->semaphore);
>
> // The amdgpu semobj_signal() implementation would
> // then be responsible for adjusting the submission
> // windows internally, i.e. calling flush()
> }
>
> Let me know what you think of this slightly more agnostic implementation.
>
> Sorry for the long reply!
>
> Regards,
> Andres
>
>
>> Cheers,
>> Nicolai
>>
>>
>>> +}
>>> +
>>> void
>>> st_init_semaphoreobject_functions(struct dd_function_table *functions)
>>> {
>>> functions->NewSemaphoreObject = st_semaphoreobj_alloc;
>>> functions->DeleteSemaphoreObject = st_semaphoreobj_free;
>>> functions->ImportSemaphoreFd = st_import_semaphoreobj_fd;
>>> + functions->ServerWaitSemaphoreObject = st_server_wait_semaphore;
>>> + functions->ServerSignalSemaphoreObject = st_server_signal_semaphore;
>>> }
>>>
>>
>>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list