[Mesa-dev] [PATCH 1/3] anv: Stop racing relocation offsets

Chris Wilson chris at chris-wilson.co.uk
Thu May 11 16:56:16 UTC 2017


On Thu, May 11, 2017 at 09:27:05AM -0700, Jason Ekstrand wrote:
>    On Thu, May 11, 2017 at 12:41 AM, Chris Wilson
>    <[1]chris at chris-wilson.co.uk> wrote:
> 
>      On Wed, May 10, 2017 at 04:08:53PM -0700, Jason Ekstrand wrote:
>      > One of the key invariants of the relocation system is the
>      > presumed_offset field.  The assumption is made that the value
>      currently
>      > in the address to be relocated agrees with the presumed_offset field.
>      > If presumed_offset is equal to the offset of the BO, the kernel will
>      > skip the relocation assuming that the value is already correct.
>      >
>      > Our initial implementation of relocation handling had a race where we
>      > would read bo->offset once when we wrote the relocation entry and
>      again
>      > when we filled out actual address.
>      >
>      > Found with helgrind
>      >
>      > Cc: "17.0 17.1" <[2]mesa-stable at lists.freedesktop.org>
>      > ---
>      >  src/intel/vulkan/anv_batch_chain.c | 21 +++++++++++++++++----
>      >  src/intel/vulkan/anv_private.h     |  2 +-
>      >  src/intel/vulkan/genX_blorp_exec.c |  5 ++++-
>      >  src/intel/vulkan/genX_cmd_buffer.c |  7 +++++--
>      >  4 files changed, 27 insertions(+), 8 deletions(-)
>      >
>      > diff --git a/src/intel/vulkan/anv_batch_chain.c
>      b/src/intel/vulkan/anv_batch_chain.c
>      > index 9def174..13303b1 100644
>      > --- a/src/intel/vulkan/anv_batch_chain.c
>      > +++ b/src/intel/vulkan/anv_batch_chain.c
>      > @@ -143,7 +143,8 @@ anv_reloc_list_grow(struct anv_reloc_list *list,
>      >  VkResult
>      >  anv_reloc_list_add(struct anv_reloc_list *list,
>      >                     const VkAllocationCallbacks *alloc,
>      > -                   uint32_t offset, struct anv_bo *target_bo,
>      uint32_t delta)
>      > +                   uint32_t offset, struct anv_bo *target_bo,
>      uint32_t delta,
>      > +                   uint64_t *bo_offset_out)
>      >  {
>      >     struct drm_i915_gem_relocation_entry *entry;
>      >     int index;
>      > @@ -155,6 +156,14 @@ anv_reloc_list_add(struct anv_reloc_list *list,
>      >     if (result != VK_SUCCESS)
>      >        return result;
>      >
>      > +   /* Read the BO offset once.  This same value will be used in the
>      relocation
>      > +    * entry and passed back to the caller for it to use when it
>      writes the
>      > +    * actual value.  This guarantees that the two values match even
>      if there
>      > +    * is a data race between now and when the caller gets around to
>      writing
>      > +    * the address into the BO.
>      > +    */
>      > +   uint64_t presumed_offset = target_bo->offset;
> 
>      If it really can be written in parallel and if helgrind says it is, it
>      is, then you also need a compiler barrier to prevent a reload.
> 
>      #define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
>      uint64_t presumed_offset = READ_ONCE(target_bo->offset);
> 
>    I thought about that but wasn't sure if it was actually an issue.  Happy
>    to add it.

It's more about documenting what the compiler isn't allowed to do,
anything not explicitly outlawed sooner or later it will try.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the mesa-dev mailing list