[Mesa-dev] [PATCH v3] anv: handle failures when growing reloc lists
Pohjolainen, Topi
topi.pohjolainen at gmail.com
Thu Mar 16 09:27:47 UTC 2017
On Thu, Mar 16, 2017 at 08:56:30AM +0100, Iago Toral wrote:
> On Wed, 2017-03-15 at 16:56 +0200, Pohjolainen, Topi wrote:
> > On Wed, Mar 15, 2017 at 01:03:37PM +0100, Iago Toral Quiroga wrote:
> > >
> > > Growing the reloc list happens through calling anv_reloc_list_add()
> > > or
> > > anv_reloc_list_append(). Make sure that we call these through
> > > helpers
> > > that check the result and set the batch error status if needed.
> > >
> > > v2:
> > > - Handling the crashes is not good enough, we need to keep track
> > > of
> > > the error, for that, keep track of the errors in the batch
> > > instead (Jason).
> > > - Make reloc list growth go through helpers so we can have a
> > > central
> > > place where we can do error tracking (Jason).
> > >
> > > v3:
> > > - Callers that need the offset returned by anv_reloc_list_add()
> > > can
> > > compute it themselves since it is extracted from the inputs to
> > > the
> > > function, so change the function to return a VkResult, make
> > > anv_batch_emit_reloc() also return a VkResult and let their
> > > callers
> > > do the error management (Topi)
> > > ---
> > > src/intel/vulkan/anv_batch_chain.c | 40 +++++++++++++++++++++++++-
> > > ------------
> > > src/intel/vulkan/anv_private.h | 17 ++++++++++++----
> > > src/intel/vulkan/genX_blorp_exec.c | 20 +++++++++++++++----
> > > src/intel/vulkan/genX_cmd_buffer.c | 27 +++++++++++++------------
> > > 4 files changed, 71 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_batch_chain.c
> > > b/src/intel/vulkan/anv_batch_chain.c
> > > index c6fdfe5..95df0c9 100644
> > > --- a/src/intel/vulkan/anv_batch_chain.c
> > > +++ b/src/intel/vulkan/anv_batch_chain.c
> > > @@ -140,7 +140,7 @@ anv_reloc_list_grow(struct anv_reloc_list
> > > *list,
> > > return VK_SUCCESS;
> > > }
> > >
> > > -uint64_t
> > > +VkResult
> > > anv_reloc_list_add(struct anv_reloc_list *list,
> > > const VkAllocationCallbacks *alloc,
> > > uint32_t offset, struct anv_bo *target_bo,
> > > uint32_t delta)
> > > @@ -151,8 +151,9 @@ anv_reloc_list_add(struct anv_reloc_list *list,
> > > const uint32_t domain =
> > > target_bo->is_winsys_bo ? I915_GEM_DOMAIN_RENDER : 0;
> > >
> > > - anv_reloc_list_grow(list, alloc, 1);
> > > - /* TODO: Handle failure */
> > > + VkResult result = anv_reloc_list_grow(list, alloc, 1);
> > > + if (result != VK_SUCCESS)
> > > + return result;
> > >
> > > /* XXX: Can we use I915_EXEC_HANDLE_LUT? */
> > > index = list->num_relocs++;
> > > @@ -166,16 +167,17 @@ anv_reloc_list_add(struct anv_reloc_list
> > > *list,
> > > entry->write_domain = domain;
> > > VG(VALGRIND_CHECK_MEM_IS_DEFINED(entry, sizeof(*entry)));
> > >
> > > - return target_bo->offset + delta;
> > > + return VK_SUCCESS;
> > > }
> > >
> > > -static void
> > > +static VkResult
> > > anv_reloc_list_append(struct anv_reloc_list *list,
> > > const VkAllocationCallbacks *alloc,
> > > struct anv_reloc_list *other, uint32_t
> > > offset)
> > > {
> > > - anv_reloc_list_grow(list, alloc, other->num_relocs);
> > > - /* TODO: Handle failure */
> > > + VkResult result = anv_reloc_list_grow(list, alloc, other-
> > > >num_relocs);
> > > + if (result != VK_SUCCESS)
> > > + return result;
> > >
> > > memcpy(&list->relocs[list->num_relocs], &other->relocs[0],
> > > other->num_relocs * sizeof(other->relocs[0]));
> > > @@ -186,6 +188,7 @@ anv_reloc_list_append(struct anv_reloc_list
> > > *list,
> > > list->relocs[i + list->num_relocs].offset += offset;
> > >
> > > list->num_relocs += other->num_relocs;
> > > + return VK_SUCCESS;
> > > }
> > >
> > > /*--------------------------------------------------------------
> > > ---------*
> > > @@ -211,12 +214,19 @@ anv_batch_emit_dwords(struct anv_batch
> > > *batch, int num_dwords)
> > > return p;
> > > }
> > >
> > > -uint64_t
> > > +VkResult
> > > anv_batch_emit_reloc(struct anv_batch *batch,
> > > - void *location, struct anv_bo *bo, uint32_t
> > > delta)
> > > + void *location, struct anv_bo *bo, uint32_t
> > > delta,
> > > + uint64_t *offset)
> > > {
> > > - return anv_reloc_list_add(batch->relocs, batch->alloc,
> > > - location - batch->start, bo, delta);
> > > + VkResult result = anv_reloc_list_add(batch->relocs, batch-
> > > >alloc,
> > > + location - batch->start,
> > > bo, delta);
> > > + if (result != VK_SUCCESS)
> > > + return result;
> > I'm really sorry for the endless churn, but it looks to me that only
> > anv_reloc_list_add() needs to report the error. I for some reason
> > thought the
> > callers of anv_batch_emit_reloc() need to take individual steps upon
> > failure.
> > All they really do is set the error state which we can do already
> > here as
> > you had in the original.
> >
> > So here just:
> >
> > if (result != VK_SUCCESS) {
> > anv_batch_set_error(batch, result);
> > return 0;
> > }
> >
> > return bo->offset + delta;
> >
> > And leave _anv_combine_address() and blorp_emit_reloc() as they were.
> > Now that
> > you have written it out it becomes clearer to me.
>
> Sure, no worries.
>
> > >
> > > +
> > > + *offset = bo->offset + delta;
> > > +
> > > + return VK_SUCCESS;
> > > }
> > >
> > > void
> > > @@ -240,8 +250,12 @@ anv_batch_emit_batch(struct anv_batch *batch,
> > > struct anv_batch *other)
> > > memcpy(batch->next, other->start, size);
> > >
> > > offset = batch->next - batch->start;
> > > - anv_reloc_list_append(batch->relocs, batch->alloc,
> > > - other->relocs, offset);
> > > + VkResult result = anv_reloc_list_append(batch->relocs, batch-
> > > >alloc,
> > > + other->relocs, offset);
> > > + if (result != VK_SUCCESS) {
> > > + anv_batch_set_error(batch, result);
> > > + return;
> > > + }
> > >
> > > batch->next += size;
> > > }
> > > diff --git a/src/intel/vulkan/anv_private.h
> > > b/src/intel/vulkan/anv_private.h
> > > index 87538de..081bfb6 100644
> > > --- a/src/intel/vulkan/anv_private.h
> > > +++ b/src/intel/vulkan/anv_private.h
> > > @@ -673,7 +673,7 @@ VkResult anv_reloc_list_init(struct
> > > anv_reloc_list *list,
> > > void anv_reloc_list_finish(struct anv_reloc_list *list,
> > > const VkAllocationCallbacks *alloc);
> > >
> > > -uint64_t anv_reloc_list_add(struct anv_reloc_list *list,
> > > +VkResult anv_reloc_list_add(struct anv_reloc_list *list,
> > > const VkAllocationCallbacks *alloc,
> > > uint32_t offset, struct anv_bo
> > > *target_bo,
> > > uint32_t delta);
> > > @@ -717,8 +717,9 @@ struct anv_batch {
> > >
> > > void *anv_batch_emit_dwords(struct anv_batch *batch, int
> > > num_dwords);
> > > void anv_batch_emit_batch(struct anv_batch *batch, struct
> > > anv_batch *other);
> > > -uint64_t anv_batch_emit_reloc(struct anv_batch *batch,
> > > - void *location, struct anv_bo *bo,
> > > uint32_t offset);
> > > +VkResult anv_batch_emit_reloc(struct anv_batch *batch,
> > > + void *location, struct anv_bo *bo,
> > > + uint32_t delta, uint64_t *offset);
> > > VkResult anv_device_submit_simple_batch(struct anv_device *device,
> > > struct anv_batch *batch);
> > >
> > > @@ -751,7 +752,15 @@ _anv_combine_address(struct anv_batch *batch,
> > > void *location,
> > > } else {
> > > assert(batch->start <= location && location < batch->end);
> > >
> > > - return anv_batch_emit_reloc(batch, location, address.bo,
> > > address.offset + delta);
> > > + uint64_t offset;
> > > + VkResult result = anv_batch_emit_reloc(batch, location,
> > > address.bo,
> > > + address.offset +
> > > delta, &offset);
> > > + if (result != VK_SUCCESS) {
> > > + anv_batch_set_error(batch, result);
> > > + return 0;
> > > + }
> > > +
> > > + return offset;
> > > }
> > > }
> > >
> > > diff --git a/src/intel/vulkan/genX_blorp_exec.c
> > > b/src/intel/vulkan/genX_blorp_exec.c
> > > index 7084735..f775ce8 100644
> > > --- a/src/intel/vulkan/genX_blorp_exec.c
> > > +++ b/src/intel/vulkan/genX_blorp_exec.c
> > > @@ -48,8 +48,17 @@ blorp_emit_reloc(struct blorp_batch *batch,
> > > struct anv_cmd_buffer *cmd_buffer = batch->driver_batch;
> > > assert(cmd_buffer->batch.start <= location &&
> > > location < cmd_buffer->batch.end);
> > > - return anv_batch_emit_reloc(&cmd_buffer->batch, location,
> > > - address.buffer, address.offset +
> > > delta);
> > > +
> > > + uint64_t offset;
> > > + VkResult result =
> > > + anv_batch_emit_reloc(&cmd_buffer->batch, location,
> > > + address.buffer, address.offset + delta,
> > > &offset);
> > > + if (result != VK_SUCCESS) {
> > > + anv_batch_set_error(&cmd_buffer->batch, result);
> > > + return 0;
> > > + }
> > > +
> > > + return offset;
> > > }
> > >
> > > static void
> > > @@ -57,8 +66,11 @@ blorp_surface_reloc(struct blorp_batch *batch,
> > > uint32_t ss_offset,
> > > struct blorp_address address, uint32_t delta)
> > > {
> > > struct anv_cmd_buffer *cmd_buffer = batch->driver_batch;
> > > - anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > - ss_offset, address.buffer, address.offset +
> > > delta);
> > > + VkResult result =
> > > + anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > + ss_offset, address.buffer, address.offset
> > > + delta);
> > > + if (result != VK_SUCCESS)
> > > + anv_batch_set_error(&cmd_buffer->batch, result);
> > > }
> > >
> > > static void *
> > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > > b/src/intel/vulkan/genX_cmd_buffer.c
> > > index e3faa17..6a759f4 100644
> > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > @@ -155,12 +155,19 @@
> > > genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer
> > > *cmd_buffer)
> > > static void
> > > add_surface_state_reloc(struct anv_cmd_buffer *cmd_buffer,
> > > struct anv_state state,
> > > + bool is_aux,
> > > struct anv_bo *bo, uint32_t offset)
> > > {
> > > const struct isl_device *isl_dev = &cmd_buffer->device-
> > > >isl_dev;
> > >
> > > - anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > - state.offset + isl_dev->ss.addr_offset, bo,
> > > offset);
> > > + uint32_t total_offset = state.offset +
> > > + (is_aux ? isl_dev->ss.aux_addr_offset : isl_dev-
> > > >ss.addr_offset);
> > > +
> > > + VkResult result =
> > > + anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > + total_offset, bo, offset);
> > > + if (result != VK_SUCCESS)
> > > + anv_batch_set_error(&cmd_buffer->batch, result);
> > > }
> > >
> > > static void
> > > @@ -171,9 +178,7 @@ add_image_view_relocs(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > {
> > > const struct isl_device *isl_dev = &cmd_buffer->device-
> > > >isl_dev;
> > >
> > > - anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > - state.offset + isl_dev->ss.addr_offset,
> > > - iview->bo, iview->offset);
> > > + add_surface_state_reloc(cmd_buffer, state, false, iview->bo,
> > > iview->offset);
> > I also think we could avoid changes to add_surface_state_reloc()
> > altogether
> > if handled the special case of aux below. So here just:
>
> I did this change because Jason suggested that we had helpers to call
> anv_reloc_list_add() so we can centralize error management a bit in
> those helpers, so in this case it looked like add_surface_state_reloc()
> was the natural place for it. That said, I don't think there is a
> strong reason to do it, however:
>
> > VkResult result = anv_reloc_list_add(
> > &cmd_buffer->surface_relocs,
> > &cmd_buffer->pool->alloc,
> > state.offset + isl_dev->ss.addr_offset,
> > iview->bo, iview->offset);
> > if (result != VK_SUCCESS) {
> > anv_batch_set_error(&cmd_buffer->batch, result);
> > return;
> > }
>
>
> I think we want to do error management in add_surface_state_reloc()
> because that is called from various other places and it calls
> anv_reloc_list_add, which can fail. In that case, this call to
> anv_reloc_list_add() here is exactly normal surface_reloc, so I think
> it makes sense to call add_surface_state_reloc() directly instead of
> pretty much exactly replicating it here.
Yeah, I forgot there are those other callers. Makes sense.
>
> > >
> > >
> > > if (aux_usage != ISL_AUX_USAGE_NONE) {
> > > uint32_t aux_offset = iview->offset + iview->image-
> > > >aux_surface.offset;
> > > @@ -186,9 +191,7 @@ add_image_view_relocs(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > uint32_t *aux_addr_dw = state.map + isl_dev-
> > > >ss.aux_addr_offset;
> > > aux_offset += *aux_addr_dw & 0xfff;
> > >
> > > - anv_reloc_list_add(&cmd_buffer->surface_relocs, &cmd_buffer-
> > > >pool->alloc,
> > > - state.offset + isl_dev-
> > > >ss.aux_addr_offset,
> > > - iview->bo, aux_offset);
> > And here:
> >
> > if (anv_reloc_list_add(&cmd_buffer->surface_relocs,
> > &cmd_buffer->pool->alloc,
> > state.offset + isl_dev-
> > >ss.aux_addr_offset,
> > iview->bo, aux_offset) != VK_SUCCESS)
> > anv_batch_set_error(&cmd_buffer->batch, result);
>
> Yes, we can do this. We would remove the "is_aux" parameter from
> add_surface_state_reloc() which was a bit forced maybe and since this
> is the only place where we needed it I don't think it is a big deal to
> handle it separately.
>
> Does this sound reasonable?
Sounds good, thanks :)
More information about the mesa-dev
mailing list