[Mesa-dev] [PATCH v3] anv: handle failures when growing reloc lists

Pohjolainen, Topi topi.pohjolainen at gmail.com
Thu Mar 16 09:27:47 UTC 2017


On Thu, Mar 16, 2017 at 08:56:30AM +0100, Iago Toral wrote:
> On Wed, 2017-03-15 at 16:56 +0200, Pohjolainen, Topi wrote:
> > On Wed, Mar 15, 2017 at 01:03:37PM +0100, Iago Toral Quiroga wrote:
> > > 
> > > Growing the reloc list happens through calling anv_reloc_list_add()
> > > or
> > > anv_reloc_list_append(). Make sure that we call these through
> > > helpers
> > > that check the result and set the batch error status if needed.
> > > 
> > > v2:
> > >   - Handling the crashes is not good enough, we need to keep track
> > > of
> > >     the error, for that, keep track of the errors in the batch
> > > instead (Jason).
> > >   - Make reloc list growth go through helpers so we can have a
> > > central
> > >     place where we can do error tracking (Jason).
> > > 
> > > v3:
> > >   - Callers that need the offset returned by anv_reloc_list_add()
> > > can
> > >     compute it themselves since it is extracted from the inputs to
> > > the
> > >     function, so change the function to return a VkResult, make
> > >     anv_batch_emit_reloc() also return a VkResult and let their
> > > callers
> > >     do the error management (Topi)
> > > ---
> > >  src/intel/vulkan/anv_batch_chain.c | 40 +++++++++++++++++++++++++-
> > > ------------
> > >  src/intel/vulkan/anv_private.h     | 17 ++++++++++++----
> > >  src/intel/vulkan/genX_blorp_exec.c | 20 +++++++++++++++----
> > >  src/intel/vulkan/genX_cmd_buffer.c | 27 +++++++++++++------------
> > >  4 files changed, 71 insertions(+), 33 deletions(-)
> > > 
> > > diff --git a/src/intel/vulkan/anv_batch_chain.c
> > > b/src/intel/vulkan/anv_batch_chain.c
> > > index c6fdfe5..95df0c9 100644
> > > --- a/src/intel/vulkan/anv_batch_chain.c
> > > +++ b/src/intel/vulkan/anv_batch_chain.c
> > > @@ -140,7 +140,7 @@ anv_reloc_list_grow(struct anv_reloc_list
> > > *list,
> > >     return VK_SUCCESS;
> > >  }
> > >  
> > > -uint64_t
> > > +VkResult
> > >  anv_reloc_list_add(struct anv_reloc_list *list,
> > >                     const VkAllocationCallbacks *alloc,
> > >                     uint32_t offset, struct anv_bo *target_bo,
> > > uint32_t delta)
> > > @@ -151,8 +151,9 @@ anv_reloc_list_add(struct anv_reloc_list *list,
> > >     const uint32_t domain =
> > >        target_bo->is_winsys_bo ? I915_GEM_DOMAIN_RENDER : 0;
> > >  
> > > -   anv_reloc_list_grow(list, alloc, 1);
> > > -   /* TODO: Handle failure */
> > > +   VkResult result = anv_reloc_list_grow(list, alloc, 1);
> > > +   if (result != VK_SUCCESS)
> > > +      return result;
> > >  
> > >     /* XXX: Can we use I915_EXEC_HANDLE_LUT? */
> > >     index = list->num_relocs++;
> > > @@ -166,16 +167,17 @@ anv_reloc_list_add(struct anv_reloc_list
> > > *list,
> > >     entry->write_domain = domain;
> > >     VG(VALGRIND_CHECK_MEM_IS_DEFINED(entry, sizeof(*entry)));
> > >  
> > > -   return target_bo->offset + delta;
> > > +   return VK_SUCCESS;
> > >  }
> > >  
> > > -static void
> > > +static VkResult
> > >  anv_reloc_list_append(struct anv_reloc_list *list,
> > >                        const VkAllocationCallbacks *alloc,
> > >                        struct anv_reloc_list *other, uint32_t
> > > offset)
> > >  {
> > > -   anv_reloc_list_grow(list, alloc, other->num_relocs);
> > > -   /* TODO: Handle failure */
> > > +   VkResult result = anv_reloc_list_grow(list, alloc, other-
> > > >num_relocs);
> > > +   if (result != VK_SUCCESS)
> > > +      return result;
> > >  
> > >     memcpy(&list->relocs[list->num_relocs], &other->relocs[0],
> > >            other->num_relocs * sizeof(other->relocs[0]));
> > > @@ -186,6 +188,7 @@ anv_reloc_list_append(struct anv_reloc_list
> > > *list,
> > >        list->relocs[i + list->num_relocs].offset += offset;
> > >  
> > >     list->num_relocs += other->num_relocs;
> > > +   return VK_SUCCESS;
> > >  }
> > >  
> > >  /*--------------------------------------------------------------
> > > ---------*
> > > @@ -211,12 +214,19 @@ anv_batch_emit_dwords(struct anv_batch
> > > *batch, int num_dwords)
> > >     return p;
> > >  }
> > >  
> > > -uint64_t
> > > +VkResult
> > >  anv_batch_emit_reloc(struct anv_batch *batch,
> > > -                     void *location, struct anv_bo *bo, uint32_t
> > > delta)
> > > +                     void *location, struct anv_bo *bo, uint32_t
> > > delta,
> > > +                     uint64_t *offset)
> > >  {
> > > -   return anv_reloc_list_add(batch->relocs, batch->alloc,
> > > -                             location - batch->start, bo, delta);
> > > +   VkResult result = anv_reloc_list_add(batch->relocs, batch-
> > > >alloc,
> > > +                                        location - batch->start,
> > > bo, delta);
> > > +   if (result != VK_SUCCESS)
> > > +      return result;
> > I'm really sorry for the endless churn, but it looks to me that only
> > anv_reloc_list_add() needs to report the error. I for some reason
> > thought the
> > callers of anv_batch_emit_reloc() need to take individual steps upon
> > failure.
> > All they really do is set the error state which we can do already
> > here as
> > you had in the original.
> > 
> > So here just:
> > 
> >       if (result != VK_SUCCESS) {
> >          anv_batch_set_error(batch, result);
> >          return 0;
> >       }
> > 
> >       return bo->offset + delta;
> > 
> > And leave _anv_combine_address() and blorp_emit_reloc() as they were.
> > Now that
> > you have written it out it becomes clearer to me.
> 
> Sure, no worries.
> 
> > > 
> > > +
> > > +   *offset = bo->offset + delta;
> > > +
> > > +   return VK_SUCCESS;
> > >  }
> > >  
> > >  void
> > > @@ -240,8 +250,12 @@ anv_batch_emit_batch(struct anv_batch *batch,
> > > struct anv_batch *other)
> > >     memcpy(batch->next, other->start, size);
> > >  
> > >     offset = batch->next - batch->start;
> > > -   anv_reloc_list_append(batch->relocs, batch->alloc,
> > > -                         other->relocs, offset);
> > > +   VkResult result = anv_reloc_list_append(batch->relocs, batch-
> > > >alloc,
> > > +                                           other->relocs, offset);
> > > +   if (result != VK_SUCCESS) {
> > > +      anv_batch_set_error(batch, result);
> > > +      return;
> > > +   }
> > >  
> > >     batch->next += size;
> > >  }
> > > diff --git a/src/intel/vulkan/anv_private.h
> > > b/src/intel/vulkan/anv_private.h
> > > index 87538de..081bfb6 100644
> > > --- a/src/intel/vulkan/anv_private.h
> > > +++ b/src/intel/vulkan/anv_private.h
> > > @@ -673,7 +673,7 @@ VkResult anv_reloc_list_init(struct
> > > anv_reloc_list *list,
> > >  void anv_reloc_list_finish(struct anv_reloc_list *list,
> > >                             const VkAllocationCallbacks *alloc);
> > >  
> > > -uint64_t anv_reloc_list_add(struct anv_reloc_list *list,
> > > +VkResult anv_reloc_list_add(struct anv_reloc_list *list,
> > >                              const VkAllocationCallbacks *alloc,
> > >                              uint32_t offset, struct anv_bo
> > > *target_bo,
> > >                              uint32_t delta);
> > > @@ -717,8 +717,9 @@ struct anv_batch {
> > >  
> > >  void *anv_batch_emit_dwords(struct anv_batch *batch, int
> > > num_dwords);
> > >  void anv_batch_emit_batch(struct anv_batch *batch, struct
> > > anv_batch *other);
> > > -uint64_t anv_batch_emit_reloc(struct anv_batch *batch,
> > > -                              void *location, struct anv_bo *bo,
> > > uint32_t offset);
> > > +VkResult anv_batch_emit_reloc(struct anv_batch *batch,
> > > +                              void *location, struct anv_bo *bo,
> > > +                              uint32_t delta, uint64_t *offset);
> > >  VkResult anv_device_submit_simple_batch(struct anv_device *device,
> > >                                          struct anv_batch *batch);
> > >  
> > > @@ -751,7 +752,15 @@ _anv_combine_address(struct anv_batch *batch,
> > > void *location,
> > >     } else {
> > >        assert(batch->start <= location && location < batch->end);
> > >  
> > > -      return anv_batch_emit_reloc(batch, location, address.bo,
> > > address.offset + delta);
> > > +      uint64_t offset;
> > > +      VkResult result = anv_batch_emit_reloc(batch, location,
> > > address.bo,
> > > +                                             address.offset +
> > > delta, &offset);
> > > +      if (result != VK_SUCCESS) {
> > > +         anv_batch_set_error(batch, result);
> > > +         return 0;
> > > +      }
> > > +
> > > +      return offset;
> > >     }
> > >  }
> > >  
> > > diff --git a/src/intel/vulkan/genX_blorp_exec.c
> > > b/src/intel/vulkan/genX_blorp_exec.c
> > > index 7084735..f775ce8 100644
> > > --- a/src/intel/vulkan/genX_blorp_exec.c
> > > +++ b/src/intel/vulkan/genX_blorp_exec.c
> > > @@ -48,8 +48,17 @@ blorp_emit_reloc(struct blorp_batch *batch,
> > >     struct anv_cmd_buffer *cmd_buffer = batch->driver_batch;
> > >     assert(cmd_buffer->batch.start <= location &&
> > >            location < cmd_buffer->batch.end);
> > > -   return anv_batch_emit_reloc(&cmd_buffer->batch, location,
> > > -                               address.buffer, address.offset +
> > > delta);
> > > +
> > > +   uint64_t offset;
> > > +   VkResult result =
> > > +      anv_batch_emit_reloc(&cmd_buffer->batch, location,
> > > +                           address.buffer, address.offset + delta,
> > > &offset);
> > > +   if (result != VK_SUCCESS) {
> > > +      anv_batch_set_error(&cmd_buffer->batch, result);
> > > +      return 0;
> > > +   }
> > > +
> > > +   return offset;
> > >  }
> > >  
> > >  static void
> > > @@ -57,8 +66,11 @@ blorp_surface_reloc(struct blorp_batch *batch,
> > > uint32_t ss_offset,
> > >                      struct blorp_address address, uint32_t delta)
> > >  {
> > >     struct anv_cmd_buffer *cmd_buffer = batch->driver_batch;
> > > -   anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > -                      ss_offset, address.buffer, address.offset +
> > > delta);
> > > +   VkResult result =
> > > +      anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > +                         ss_offset, address.buffer, address.offset
> > > + delta);
> > > +   if (result != VK_SUCCESS)
> > > +      anv_batch_set_error(&cmd_buffer->batch, result);
> > >  }
> > >  
> > >  static void *
> > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > > b/src/intel/vulkan/genX_cmd_buffer.c
> > > index e3faa17..6a759f4 100644
> > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > @@ -155,12 +155,19 @@
> > > genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer
> > > *cmd_buffer)
> > >  static void
> > >  add_surface_state_reloc(struct anv_cmd_buffer *cmd_buffer,
> > >                          struct anv_state state,
> > > +                        bool is_aux,
> > >                          struct anv_bo *bo, uint32_t offset)
> > >  {
> > >     const struct isl_device *isl_dev = &cmd_buffer->device-
> > > >isl_dev;
> > >  
> > > -   anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > -                      state.offset + isl_dev->ss.addr_offset, bo,
> > > offset);
> > > +   uint32_t total_offset = state.offset +
> > > +      (is_aux ? isl_dev->ss.aux_addr_offset : isl_dev-
> > > >ss.addr_offset);
> > > +
> > > +   VkResult result =
> > > +      anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > +                         total_offset, bo, offset);
> > > +   if (result != VK_SUCCESS)
> > > +      anv_batch_set_error(&cmd_buffer->batch, result);
> > >  }
> > >  
> > >  static void
> > > @@ -171,9 +178,7 @@ add_image_view_relocs(struct anv_cmd_buffer
> > > *cmd_buffer,
> > >  {
> > >     const struct isl_device *isl_dev = &cmd_buffer->device-
> > > >isl_dev;
> > >  
> > > -   anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > -                      state.offset + isl_dev->ss.addr_offset,
> > > -                      iview->bo, iview->offset);
> > > +   add_surface_state_reloc(cmd_buffer, state, false, iview->bo,
> > > iview->offset);
> > I also think we could avoid changes to add_surface_state_reloc()
> > altogether
> > if handled the special case of aux below. So here just:
> 
> I did this change because Jason suggested that we had helpers to call
> anv_reloc_list_add() so we can centralize error management a bit in
> those helpers, so in this case it looked like add_surface_state_reloc()
> was the natural place for it. That said, I don't think there is a
> strong reason to do it, however:
> 
> >       VkResult result = anv_reloc_list_add(
> >                            &cmd_buffer->surface_relocs,
> >                            &cmd_buffer->pool->alloc,
> >                            state.offset + isl_dev->ss.addr_offset,
> >                            iview->bo, iview->offset);
> >       if (result != VK_SUCCESS) {
> >          anv_batch_set_error(&cmd_buffer->batch, result);
> >          return;
> >       }
> 
> 
> I think we want to do error management in add_surface_state_reloc()
> because that is called from various other places and it calls
> anv_reloc_list_add, which can fail. In that case, this call to
> anv_reloc_list_add() here is exactly normal surface_reloc, so I think
> it makes sense to call add_surface_state_reloc() directly instead of
> pretty much exactly replicating it here.

Yeah, I forgot there are those other callers. Makes sense.

> 
> > > 
> > >  
> > >     if (aux_usage != ISL_AUX_USAGE_NONE) {
> > >        uint32_t aux_offset = iview->offset + iview->image-
> > > >aux_surface.offset;
> > > @@ -186,9 +191,7 @@ add_image_view_relocs(struct anv_cmd_buffer
> > > *cmd_buffer,
> > >        uint32_t *aux_addr_dw = state.map + isl_dev-
> > > >ss.aux_addr_offset;
> > >        aux_offset += *aux_addr_dw & 0xfff;
> > >  
> > > -      anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > -                         state.offset + isl_dev-
> > > >ss.aux_addr_offset,
> > > -                         iview->bo, aux_offset);
> > And here:
> > 
> >       if (anv_reloc_list_add(&cmd_buffer->surface_relocs,
> >                              &cmd_buffer->pool->alloc,
> >                              state.offset + isl_dev-
> > >ss.aux_addr_offset,
> >                              iview->bo, aux_offset) != VK_SUCCESS)
> >          anv_batch_set_error(&cmd_buffer->batch, result);
> 
> Yes, we can do this. We would remove the "is_aux" parameter from
> add_surface_state_reloc() which was a bit forced maybe and since this
> is the only place where we needed it I don't think it is a big deal to
> handle it separately.
> 
> Does this sound reasonable?

Sounds good, thanks :)


More information about the mesa-dev mailing list