[Mesa-dev] [PATCH 16/21] anv: Implement VK_KHX_external_semaphore_fd

Jason Ekstrand jason at jlekstrand.net
Wed May 3 19:42:45 UTC 2017


On Wed, May 3, 2017 at 12:04 PM, Chad Versace <chadversary at chromium.org>
wrote:

> On Wed 03 May 2017, Jason Ekstrand wrote:
> > On Tue, May 2, 2017 at 5:15 PM, Chad Versace <[1]
> chadversary at chromium.org>
> > wrote:
> >
> >     On Fri 14 Apr 2017, Jason Ekstrand wrote:
> >     > This implementation allocates a 4k BO for each semaphore that can
> be
> >     > exported using OPAQUE_FD and uses the kernel's already-existing
> >     > synchronization mechanism on BOs.
> >     > ---
> >     >  src/intel/vulkan/anv_batch_chain.c      |  53 ++++++++++--
> >     >  src/intel/vulkan/anv_device.c           |   4 +
> >     >  src/intel/vulkan/anv_entrypoints_gen.py |   1 +
> >     >  src/intel/vulkan/anv_private.h          |  16 +++-
> >     >  src/intel/vulkan/anv_queue.c            | 141
> >     ++++++++++++++++++++++++++++++--
> >     >  5 files changed, 199 insertions(+), 16 deletions(-)
> >
> >     [snip]
> >
> >     > @@ -513,14 +533,33 @@ VkResult anv_CreateSemaphore(
> >     >      VkExternalSemaphoreHandleTypeFlagsKHX handleTypes =
> >     >        export ? export->handleTypes : 0;
> >     >
> >     > -   /* External semaphores are not yet supported */
> >     > -   assert(handleTypes == 0);
> >     > +   if (handleTypes == 0) {
> >     > +      /* The DRM execbuffer ioctl always execute in-oder so long
> as you
> >     stay
> >     > +       * on the same ring.  Since we don't expose the blit engine
> as a
> >     DMA
> >     > +       * queue, a dummy no-op semaphore is a perfectly valid
> >     implementation.
> >     > +       */
> >     > +      semaphore->permanent.type = ANV_SEMAPHORE_TYPE_DUMMY;
> >     > +   } else if (handleTypes & VK_EXTERNAL_SEMAPHORE_HANDLE_
> >     TYPE_OPAQUE_FD_BIT_KHX) {
> >     > +      assert(handleTypes == VK_EXTERNAL_SEMAPHORE_HANDLE_
> >     TYPE_OPAQUE_FD_BIT_KHX);
> >     > +
> >     > +      semaphore->permanent.type = ANV_SEMAPHORE_TYPE_BO;
> >     > +      VkResult result = anv_bo_cache_alloc(device,
> &device->bo_cache,
> >     > +                                           4096, &semaphore->[2]
> >     permanent.bo);
> >     > +      if (result != VK_SUCCESS) {
> >     > +         vk_free2(&device->alloc, pAllocator, semaphore);
> >     > +         return result;
> >     > +      }
> >     > +
> >     > +      /* If we're going to use this as a fence, we need to *not*
> have
> >     the
> >     > +       * EXEC_OBJECT_ASYNC bit set.
> >     > +       */
> >     > +      semaphore->permanent.bo->flags &= ~EXEC_OBJECT_ASYNC;
> >     > +   } else {
> >     > +      assert(!"Unknown handle type");
> >     > +      vk_free2(&device->alloc, pAllocator, semaphore);
> >     > +      return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> >
> >     The Vulkan 1.0.49 spec does not list
> >     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX as a possible error code for
> >     vkCreateSemaphore. The reason is that
> >     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX indicates that an external
> handle
> >     does not match the expected handle type. Since vkCreateSemaphore has
> no
> >     parameter that's an external handle, it can't return the error.
> >     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX only makes sense for functions
> that
> >     import a handle or query the properties of a handle.
> >
> >     This 'else' branch is capturing invalid application usage the API.
> Since
> >     invalid usages leads to undefined behavior, we could claim that
> Mesa's
> >     undefined behavior is to return a helpful error code. But...
> >
> >     I much prefer disastrous undefined behavior here instead of helpful
> >     undefined behavior. If we return VK_ERROR_INVALID_EXTERNAL_
> HANDLE_KHX,
> >     then appsy may accidentally rely on our out-of-spec behavior. It
> would
> >     be better to just abort() or, like anvil does in most cases, silently
> >     ignore the invalid case and continue blindly ahead.
> >
> >
> > They do get disastrous behavior in debug builds because of the assert.
> In
> > release builds, however, I'm not sure what I think.  I don't know that I
> like
> > the idea of silently continuing because it would most likely give them a
> > perfectly "valid" dummy semaphore and they would get an error much later
> when
> > they try to export from it.  We could abort().  I don't know.  I really
> don't
> > think apps will rely on that error.
>
> After sleeping on this, I also don't know what the best behavior is.
> I still *prefer* that vkCreateSemaphore fail with disaster rather than
> fail with a helpful error code, but I'm no longer *confident* that's the
> right choice.
>
> More below.
>
> >     [snip]
>
> >     > +static void
> >     > +anv_semaphore_impl_cleanup(struct anv_device *device,
> >     > +                           struct anv_semaphore_impl *impl)
> >     > +{
> >     > +   switch (impl->type) {
> >     > +   case ANV_SEMAPHORE_TYPE_NONE:
> >     > +   case ANV_SEMAPHORE_TYPE_DUMMY:
> >     > +      /* Dummy.  Nothing to do */
> >     > +      break;
> >     > +
> >     > +   case ANV_SEMAPHORE_TYPE_BO:
> >     > +      anv_bo_cache_release(device, &device->bo_cache, impl->bo);
> >     > +      break;
> >     > +
> >     > +   default:
> >     > +      unreachable("Invalid semaphore type");
> >     > +   }
> >
> >     This switch statement is a good candidate for exhaustive switch
> warnings
> >     (-Wswitch).  But we don't get the warnings unless the default is
> dropped
> >     and the unreachable is moved to after the switch statement.
> >
> >
> > Sure, I can do that.
>
> Cool.
>
> >     >
> >     > @@ -548,9 +609,73 @@ void anv_GetPhysicalDeviceExternalSemap
> >     horePropertiesKHX(
> >     >      VkExternalSemaphorePropertiesKHX*
> >      pExternalSemaphoreProperties)
> >     >  {
> >     >     switch (pExternalSemaphoreInfo->handleType) {
> >     > +   case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX:
> >     > +      pExternalSemaphoreProperties->exportFromImportedHandleTypes
> = 0;
> >
> >     I expected exportFromImportedHandleTypes to be
> >     VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX.  Why is it 0?
> >     I admit I don't fully understand the VK_KHX_external_semaphore spec.
> (To
> >     be fair, I doubt anyone fully understands it).
> >
> >
> > I didn't figure there was any reason to allow re-export but there's no
> harm in
> > it for OPAQUE_FD.
>
> I doesn't matter to me either way. It seems odd to me to not allow
> re-export, but I don't think it really matters unless we discover apps
> that we really want it.
>
> >     > +      pExternalSemaphoreProperties->compatibleHandleTypes =
> >     > +         VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX;
> >     > +      pExternalSemaphoreProperties->externalSemaphoreFeatures =
> >     > +         VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHX |
> >     > +         VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT_KHX;
> >     > +      break;
> >     > +
> >     >     default:
> >     >        pExternalSemaphoreProperties->exportFromImportedHandleTypes
> = 0;
> >     >        pExternalSemaphoreProperties->compatibleHandleTypes = 0;
> >     >        pExternalSemaphoreProperties->externalSemaphoreFeatures =
> 0;
> >     >     }
> >     >  }
> >     > +
> >     > +VkResult anv_ImportSemaphoreFdKHX(
> >     > +    VkDevice                                    _device,
> >     > +    const VkImportSemaphoreFdInfoKHX*
>  pImportSemaphoreFdInfo)
> >     > +{
> >     > +   ANV_FROM_HANDLE(anv_device, device, _device);
> >     > +   ANV_FROM_HANDLE(anv_semaphore, semaphore,
> pImportSemaphoreFdInfo->
> >     semaphore);
> >     > +
> >     > +   switch (pImportSemaphoreFdInfo->handleType) {
> >     > +   case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHX: {
> >     > +      struct anv_bo *bo;
> >     > +      VkResult result = anv_bo_cache_import(device,
> &device->bo_cache,
> >     > +
> pImportSemaphoreFdInfo->fd,
> >     4096,
> >     > +                                            &bo);
> >     > +      if (result != VK_SUCCESS)
> >     > +         return result;
> >     > +
> >     > +      /* If we're going to use this as a fence, we need to *not*
> have
> >     the
> >     > +       * EXEC_OBJECT_ASYNC bit set.
> >     > +       */
> >     > +      bo->flags &= ~EXEC_OBJECT_ASYNC;
> >     > +
> >     > +      anv_semaphore_impl_cleanup(device, &semaphore->permanent);
> >     > +
> >     > +      semaphore->permanent.type = ANV_SEMAPHORE_TYPE_BO;
> >     > +      semaphore->[3]permanent.bo = bo;
> >     > +
> >     > +      return VK_SUCCESS;
> >     > +   }
> >     > +
> >     > +   default:
> >     > +      return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> >
> >     Yep. VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX is required here.
> >
> >     > +   }
> >     > +}
> >
> >     > +
> >     > +VkResult anv_GetSemaphoreFdKHX(
> >     > +    VkDevice                                    _device,
> >     > +    VkSemaphore                                 _semaphore,
> >     > +    VkExternalSemaphoreHandleTypeFlagBitsKHX    handleType,
> >     > +    int*                                        pFd)
> >     > +{
> >     > +   ANV_FROM_HANDLE(anv_device, device, _device);
> >     > +   ANV_FROM_HANDLE(anv_semaphore, semaphore, _semaphore);
> >     > +
> >     > +   switch (semaphore->permanent.type) {
> >     > +   case ANV_SEMAPHORE_TYPE_BO:
> >     > +      return anv_bo_cache_export(device, &device->bo_cache,
> >     > +                                 semaphore->[4]permanent.bo,
> pFd);
> >     > +
> >     > +   default:
> >     > +      return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> >
> >     Like vkCreateSemaphore, for the same reasons,
> >     VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX is not a valid error code for
> >     vkGetSemaphoreFdKHX. And like vkCreateSemaphore, I think disastrous
> >     undefined behavior here is better than helpful undefined behavior.
> >
> >
> > Hrm... Maybe?  We could abort(). What I don't want to do is just get rid
> of the
> > case and silently return a random return value (which may be VK_SUCCES!)
> >
> > As a side-note, I'm beginning to come to the conclusion that we may want
> to
> > make the undefined behavior in the driver a bit more contained.  While
> our
> > driver shouldn't be robust against API misuse, I would like it to be
> > deterministic and try to crash early rather than just corrupting
> something and
> > maybe crashing later in a weird way.
>
> I was thinking the same thing this morning during the Khronos telco.
>
> Out of curiosity, I looked up the spec language about undefined
> behavior:
>
>     The core layer assumes applications are using the API correctly. Except
>     as documented elsewhere in the Specification, the behavior of the core
>     layer to an application using the API incorrectly is undefined, and may
>     include program termination. However, implementations must ensure that
>     incorrect usage by an application does not affect the integrity of the
>     operating system, the Vulkan implementation, or other Vulkan client
>     applications in the system, and does not allow one application to
> access
>     data belonging to another application.
>
> Interestingly, the spec requires that "implementations must ensure that
> incorrect usage [...] does not affect the integrity of [...] the Vulkan
> implementation". Interpreting that text strictly, I believe that Anvil's
> current method of handling undefined behavior (pretend it didn't happen
> and probably scribble onto out-of-bound memory eventually) violates the
> spec. However, perhaps the spec author's intent differs from the strict
> interpretation. I'm unsure.
>

Assert-failing, null pointer dereferences, and the like I think are
perfectly reasonable along with GPU hangs.  Less desirable is out-of-bounds
array access, reading uninitialized memory, and other non-deterministic
things.  Writing out-of-bounds memory and thereby corrupting driver
internals is probably something we should try to avoid.


> Undefined behavior is difficult to debug and reason about, not only in
> running code, but also in Khronos specs :)
>
> How do *you* think we should handle the invalid usage? I've stated my
> preference, but it's just a preference.
>

In this particular case, I went through most of the same mental gymnastics
that you're going through right now and came up with what's in the patch.
This is one of those cases where we can easily bail with very little cost
and keep everything reasonably safe.

--Jason
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170503/65941be8/attachment-0001.html>


More information about the mesa-dev mailing list