[Mesa-dev] [PATCH 16/21] anv: Implement VK_KHX_external_semaphore_fd

Chad Versace chadversary at chromium.org
Wed May 3 19:04:59 UTC 2017


On Wed 03 May 2017, Jason Ekstrand wrote:
> On Tue, May 2, 2017 at 5:15 PM, Chad Versace <[1]chadversary at chromium.org>
> wrote:
> 
>     On Fri 14 Apr 2017, Jason Ekstrand wrote:
>     > This implementation allocates a 4k BO for each semaphore that can be
>     > exported using OPAQUE_FD and uses the kernel's already-existing
>     > synchronization mechanism on BOs.
>     > ---
>     >  src/intel/vulkan/anv_batch_chain.c      |  53 ++++++++++--
>     >  src/intel/vulkan/anv_device.c           |   4 +
>     >  src/intel/vulkan/anv_entrypoints_gen.py |   1 +
>     >  src/intel/vulkan/anv_private.h          |  16 +++-
>     >  src/intel/vulkan/anv_queue.c            | 141
>     ++++++++++++++++++++++++++++++--
>     >  5 files changed, 199 insertions(+), 16 deletions(-)
> 
>     [snip]
> 
>     > @@ -513,14 +533,33 @@ VkResult anv_CreateSemaphore(
>     >      VkExternalSemaphoreHandleTypeFlagsKHX handleTypes =
>     >        export ? export->handleTypes : 0;
>     >
>     > -   /* External semaphores are not yet supported */
>     > -   assert(handleTypes == 0);
>     > +   if (handleTypes == 0) {
>     > +      /* The DRM execbuffer ioctl always execute in-oder so long as you
>     stay
>     > +       * on the same ring.  Since we don't expose the blit engine as a
>     DMA
>     > +       * queue, a dummy no-op semaphore is a perfectly valid
>     implementation.
>     > +       */
>     > +      semaphore->permanent.type = ANV_SEMAPHORE_TYPE_DUMMY;
>     > +   } else if (handleTypes & VK_EXTERNAL_SEMAPHORE_HANDLE_
>     TYPE_OPAQUE_FD_BIT_KHX) {
>     > +      assert(handleTypes == VK_EXTERNAL_SEMAPHORE_HANDLE_
>     TYPE_OPAQUE_FD_BIT_KHX);
>     > +
>     > +      semaphore->permanent.type = ANV_SEMAPHORE_TYPE_BO;
>     > +      VkResult result = anv_bo_cache_alloc(device, &device->bo_cache,
>     > +                                           4096, &semaphore->[2]
>     permanent.bo);
>     > +      if (result != VK_SUCCESS) {
>     > +         vk_free2(&device->alloc, pAllocator, semaphore);
>     > +         return result;
>     > +      }
>     > +
>     > +      /* If we're going to use this as a fence, we need to *not* have
>     the
>     > +       * EXEC_OBJECT_ASYNC bit set.
>     > +       */
>     > +      semaphore->permanent.bo->flags &= ~EXEC_OBJECT_ASYNC;
>     > +   } else {
>     > +      assert(!"Unknown handle type");
>     > +      vk_free2(&device->alloc, pAllocator, semaphore);
>     > +      return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> 
>     The Vulkan 1.0.49 spec does not list
>     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX as a possible error code for
>     vkCreateSemaphore. The reason is that
>     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX indicates that an external handle
>     does not match the expected handle type. Since vkCreateSemaphore has no
>     parameter that's an external handle, it can't return the error.
>     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX only makes sense for functions that
>     import a handle or query the properties of a handle.
> 
>     This 'else' branch is capturing invalid application usage the API. Since
>     invalid usages leads to undefined behavior, we could claim that Mesa's
>     undefined behavior is to return a helpful error code. But...
> 
>     I much prefer disastrous undefined behavior here instead of helpful
>     undefined behavior. If we return VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX,
>     then appsy may accidentally rely on our out-of-spec behavior. It would
>     be better to just abort() or, like anvil does in most cases, silently
>     ignore the invalid case and continue blindly ahead.
> 
> 
> They do get disastrous behavior in debug builds because of the assert.  In
> release builds, however, I'm not sure what I think.  I don't know that I like
> the idea of silently continuing because it would most likely give them a
> perfectly "valid" dummy semaphore and they would get an error much later when
> they try to export from it.  We could abort().  I don't know.  I really don't
> think apps will rely on that error.

After sleeping on this, I also don't know what the best behavior is.
I still *prefer* that vkCreateSemaphore fail with disaster rather than
fail with a helpful error code, but I'm no longer *confident* that's the
right choice.

More below.

>     [snip]

>     > +static void
>     > +anv_semaphore_impl_cleanup(struct anv_device *device,
>     > +                           struct anv_semaphore_impl *impl)
>     > +{
>     > +   switch (impl->type) {
>     > +   case ANV_SEMAPHORE_TYPE_NONE:
>     > +   case ANV_SEMAPHORE_TYPE_DUMMY:
>     > +      /* Dummy.  Nothing to do */
>     > +      break;
>     > +
>     > +   case ANV_SEMAPHORE_TYPE_BO:
>     > +      anv_bo_cache_release(device, &device->bo_cache, impl->bo);
>     > +      break;
>     > +
>     > +   default:
>     > +      unreachable("Invalid semaphore type");
>     > +   }
> 
>     This switch statement is a good candidate for exhaustive switch warnings
>     (-Wswitch).  But we don't get the warnings unless the default is dropped
>     and the unreachable is moved to after the switch statement.
> 
> 
> Sure, I can do that.

Cool.

>     >
>     > @@ -548,9 +609,73 @@ void anv_GetPhysicalDeviceExternalSemap
>     horePropertiesKHX(
>     >      VkExternalSemaphorePropertiesKHX*         
>      pExternalSemaphoreProperties)
>     >  {
>     >     switch (pExternalSemaphoreInfo->handleType) {
>     > +   case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX:
>     > +      pExternalSemaphoreProperties->exportFromImportedHandleTypes = 0;
> 
>     I expected exportFromImportedHandleTypes to be
>     VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX.  Why is it 0?
>     I admit I don't fully understand the VK_KHX_external_semaphore spec. (To
>     be fair, I doubt anyone fully understands it).
> 
> 
> I didn't figure there was any reason to allow re-export but there's no harm in
> it for OPAQUE_FD.

I doesn't matter to me either way. It seems odd to me to not allow
re-export, but I don't think it really matters unless we discover apps
that we really want it.

>     > +      pExternalSemaphoreProperties->compatibleHandleTypes =
>     > +         VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX;
>     > +      pExternalSemaphoreProperties->externalSemaphoreFeatures =
>     > +         VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHX |
>     > +         VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT_KHX;
>     > +      break;
>     > +
>     >     default:
>     >        pExternalSemaphoreProperties->exportFromImportedHandleTypes = 0;
>     >        pExternalSemaphoreProperties->compatibleHandleTypes = 0;
>     >        pExternalSemaphoreProperties->externalSemaphoreFeatures = 0;
>     >     }
>     >  }
>     > +
>     > +VkResult anv_ImportSemaphoreFdKHX(
>     > +    VkDevice                                    _device,
>     > +    const VkImportSemaphoreFdInfoKHX*           pImportSemaphoreFdInfo)
>     > +{
>     > +   ANV_FROM_HANDLE(anv_device, device, _device);
>     > +   ANV_FROM_HANDLE(anv_semaphore, semaphore, pImportSemaphoreFdInfo->
>     semaphore);
>     > +
>     > +   switch (pImportSemaphoreFdInfo->handleType) {
>     > +   case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX: {
>     > +      struct anv_bo *bo;
>     > +      VkResult result = anv_bo_cache_import(device, &device->bo_cache,
>     > +                                            pImportSemaphoreFdInfo->fd,
>     4096,
>     > +                                            &bo);
>     > +      if (result != VK_SUCCESS)
>     > +         return result;
>     > +
>     > +      /* If we're going to use this as a fence, we need to *not* have
>     the
>     > +       * EXEC_OBJECT_ASYNC bit set.
>     > +       */
>     > +      bo->flags &= ~EXEC_OBJECT_ASYNC;
>     > +
>     > +      anv_semaphore_impl_cleanup(device, &semaphore->permanent);
>     > +
>     > +      semaphore->permanent.type = ANV_SEMAPHORE_TYPE_BO;
>     > +      semaphore->[3]permanent.bo = bo;
>     > +
>     > +      return VK_SUCCESS;
>     > +   }
>     > +
>     > +   default:
>     > +      return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> 
>     Yep. VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX is required here.
>    
>     > +   }
>     > +}
> 
>     > +
>     > +VkResult anv_GetSemaphoreFdKHX(
>     > +    VkDevice                                    _device,
>     > +    VkSemaphore                                 _semaphore,
>     > +    VkExternalSemaphoreHandleTypeFlagBitsKHX    handleType,
>     > +    int*                                        pFd)
>     > +{
>     > +   ANV_FROM_HANDLE(anv_device, device, _device);
>     > +   ANV_FROM_HANDLE(anv_semaphore, semaphore, _semaphore);
>     > +
>     > +   switch (semaphore->permanent.type) {
>     > +   case ANV_SEMAPHORE_TYPE_BO:
>     > +      return anv_bo_cache_export(device, &device->bo_cache,
>     > +                                 semaphore->[4]permanent.bo, pFd);
>     > +
>     > +   default:
>     > +      return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> 
>     Like vkCreateSemaphore, for the same reasons,
>     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX is not a valid error code for
>     vkGetSemaphoreFdKHX. And like vkCreateSemaphore, I think disastrous
>     undefined behavior here is better than helpful undefined behavior.
> 
> 
> Hrm... Maybe?  We could abort(). What I don't want to do is just get rid of the
> case and silently return a random return value (which may be VK_SUCCES!)
> 
> As a side-note, I'm beginning to come to the conclusion that we may want to
> make the undefined behavior in the driver a bit more contained.  While our
> driver shouldn't be robust against API misuse, I would like it to be
> deterministic and try to crash early rather than just corrupting something and
> maybe crashing later in a weird way.

I was thinking the same thing this morning during the Khronos telco.

Out of curiosity, I looked up the spec language about undefined
behavior:

    The core layer assumes applications are using the API correctly. Except
    as documented elsewhere in the Specification, the behavior of the core
    layer to an application using the API incorrectly is undefined, and may
    include program termination. However, implementations must ensure that
    incorrect usage by an application does not affect the integrity of the
    operating system, the Vulkan implementation, or other Vulkan client
    applications in the system, and does not allow one application to access
    data belonging to another application.

Interestingly, the spec requires that "implementations must ensure that
incorrect usage [...] does not affect the integrity of [...] the Vulkan
implementation". Interpreting that text strictly, I believe that Anvil's
current method of handling undefined behavior (pretend it didn't happen
and probably scribble onto out-of-bound memory eventually) violates the
spec. However, perhaps the spec author's intent differs from the strict
interpretation. I'm unsure.

Undefined behavior is difficult to debug and reason about, not only in
running code, but also in Khronos specs :)

How do *you* think we should handle the invalid usage? I've stated my
preference, but it's just a preference.

...

Kill the default case for -Wswitch, and this patch is
Reviewed-by: Chad Versace <chadversary at chromium.org>
regardless of how you decide to handle the invalid usage.

Changing exportFromImportedHandleTypes does not affect any other hunk in
the patch, so my r-b is also independent of what value you choose for
that.


More information about the mesa-dev mailing list