[Mesa-dev] [PATCH v2 10/24] anv: handle failures when growing reloc lists

Iago Toral itoral at igalia.com
Wed Mar 15 07:39:48 UTC 2017


On Wed, 2017-03-15 at 09:35 +0200, Pohjolainen, Topi wrote:
> On Wed, Mar 15, 2017 at 08:23:35AM +0100, Iago Toral wrote:
> > 
> > On Tue, 2017-03-14 at 16:14 +0200, Pohjolainen, Topi wrote:
> > > 
> > > On Tue, Mar 14, 2017 at 04:02:17PM +0200, Pohjolainen, Topi
> > > wrote:
> > > > 
> > > > 
> > > > On Tue, Mar 14, 2017 at 02:08:10PM +0100, Iago Toral wrote:
> > > > > 
> > > > > 
> > > > > On Tue, 2017-03-14 at 14:04 +0200, Pohjolainen, Topi wrote:
> > > > > > 
> > > > > > 
> > > > > > On Tue, Mar 14, 2017 at 12:06:16PM +0100, Iago Toral wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > On Tue, 2017-03-14 at 12:01 +0100, Iago Toral wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Tue, 2017-03-14 at 11:25 +0200, Pohjolainen, Topi
> > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Fri, Mar 10, 2017 at 01:38:23PM +0100, Iago Toral
> > > > > > > > > Quiroga
> > > > > > > > > wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Growing the reloc list happens through calling
> > > > > > > > > > anv_reloc_list_add()
> > > > > > > > > > or
> > > > > > > > > > anv_reloc_list_append(). Make sure that we call
> > > > > > > > > > these
> > > > > > > > > > through
> > > > > > > > > > helpers
> > > > > > > > > > that check the result and set the batch error
> > > > > > > > > > status if
> > > > > > > > > > needed.
> > > > > > > > > > 
> > > > > > > > > > v2:
> > > > > > > > > >   - Handling the crashes is not good enough, we
> > > > > > > > > > need to
> > > > > > > > > > keep
> > > > > > > > > > track
> > > > > > > > > > of
> > > > > > > > > >     the error, for that, keep track of the errors
> > > > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > batch
> > > > > > > > > > instead (Jason).
> > > > > > > > > >   - Make reloc list growth go through helpers so we
> > > > > > > > > > can
> > > > > > > > > > have
> > > > > > > > > > a
> > > > > > > > > > central
> > > > > > > > > >     place where we can do error tracking (Jason).
> > > > > > > > > > ---
> > > > > > > > > >  src/intel/vulkan/anv_batch_chain.c | 29
> > > > > > > > > > ++++++++++++++++++++----
> > > > > > > > > > -----
> > > > > > > > > >  src/intel/vulkan/genX_blorp_exec.c |  7 +++++--
> > > > > > > > > >  src/intel/vulkan/genX_cmd_buffer.c | 25
> > > > > > > > > > ++++++++++++++----
> > > > > > > > > > ----
> > > > > > > > > > ---
> > > > > > > > > >  3 files changed, 39 insertions(+), 22 deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/src/intel/vulkan/anv_batch_chain.c
> > > > > > > > > > b/src/intel/vulkan/anv_batch_chain.c
> > > > > > > > > > index 3774172..2add5bd 100644
> > > > > > > > > > --- a/src/intel/vulkan/anv_batch_chain.c
> > > > > > > > > > +++ b/src/intel/vulkan/anv_batch_chain.c
> > > > > > > > > > @@ -151,8 +151,9 @@ anv_reloc_list_add(struct
> > > > > > > > > > anv_reloc_list
> > > > > > > > > > *list,
> > > > > > > > > >     const uint32_t domain =
> > > > > > > > > >        target_bo->is_winsys_bo ?
> > > > > > > > > > I915_GEM_DOMAIN_RENDER
> > > > > > > > > > : 0;
> > > > > > > > > >  
> > > > > > > > > > -   anv_reloc_list_grow(list, alloc, 1);
> > > > > > > > > > -   /* TODO: Handle failure */
> > > > > > > > > > +   VkResult result = anv_reloc_list_grow(list,
> > > > > > > > > > alloc,
> > > > > > > > > > 1);
> > > > > > > > > > +   if (result != VK_SUCCESS)
> > > > > > > > > > +      return 0;
> > > > > > > > > None of the callers of anv_reloc_list_add() or
> > > > > > > > > anv_reloc_list_append() are
> > > > > > > > > currently interested of the return value.
> > > > > > > > and_reloc_list_append() already returns a VkResult.
> > > > > > > > 
> > > > > > > > As for the callers of anv_reloc_list_add(),
> > > > > > > > anv_batch_emit_reloc()
> > > > > > > > needs the offset computed by anv_reloc_list_add(), but
> > > > > > > > that
> > > > > > > > seems
> > > > > > > > to
> > > > > > > > be
> > > > > > > > the only one, so we could do what you suggest below.
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > >  And if they were, they could
> > > > > > > > > easily calculate it themselves: both target_bo and
> > > > > > > > > delta
> > > > > > > > > are
> > > > > > > > > inputs.
> > > > > > > > > So instead of relying on zero offset indicating error
> > > > > > > > > we
> > > > > > > > > could
> > > > > > > > > simply
> > > > > > > > > change the return type to VkResult. Or did I miss
> > > > > > > > > something?
> > > > > > > > I think this should be fine.
> > > > > > > Well, except that anv_batch_emit_reloc needs to return
> > > > > > > that
> > > > > > > offset
> > > > > > > to
> > > > > > > its callers, so even if receives a VkResult and computes
> > > > > > > the
> > > > > > > offset
> > > > > > > itself, in the case of an error we still need to return
> > > > > > > an
> > > > > > > offset,
> > > > > > > so I
> > > > > > > don't think there is much gain in doing this, right?
> > > > > > Oh, I missed that. Yeah, then it doesn't work. I'm just not
> > > > > > super
> > > > > > happy
> > > > > > relying on offset == 0 indicating failure. Does the diff
> > > > > > get
> > > > > > out hand
> > > > > > if 
> > > > > > provided offset as outgoing argument (and returning
> > > > > > VkResult)?
> > > > > Right I don't particularly like this either, however, I am
> > > > > not
> > > > > sure we
> > > > > can do much about it. This is called from two places:
> > > > > 
> > > > > 1) _blorp_combine_address() -> blorp_emit_reloc()
> > > > > -> anv_batch_emit_reloc()
> > > > > 
> > > > > 2) _anv_combine_address() -> anv_batch_emit_reloc()
> > > > > 
> > > > > _blorp_combine_address and _anv_combine_address are wrapped
> > > > > by
> > > > > the
> > > > > _gen_combine_address macros that are used all over the place
> > > > > in
> > > > > auto-
> > > > > generated files to fill dwords in various state packets.
> > > Couldn't we still make anv_batch_emit_reloc() return VkResult and
> > > to
> > > provide
> > > the offset as argument. Those two callers, _anv_combine_address()
> > > and
> > > blorp_emit_reloc(), could set the error state and like you said
> > > they
> > > could
> > > return zeros as it doesn't matter what gets filled into the
> > > batch.
> > > 
> > > Would that work?
> > Yes, we can do that,  although I don't see why that would make
> > things
> > better: we still need to do the same, only that now we do it in
> > those
> > two callers rather than in the function they call, I understand
> > that
> > you see a benefit in this that I am missing?
> I suppose it gets subtle. This way we wouldn't rely on offset == 0
> indicating
> error. Error handlers, blorp_emit_reloc() and _anv_combine_address(),
> both
> get VkResult instead of deducing the error from offset == 0. In case
> of error
> they can pass on whatever they like. Their callers don't care about
> the value
> of the offset - they just write it to batch.

Right, I see your point and I agree we make the error handling more
explicit this way. I'll do this change, thanks!

> I just don't want to end up in a situation later on when somebody has
> a case
> where zero offset is valid and we need to re-structure the error
> handling.
> 
> > 
> > 
> > In any case I have no objections, so if you like that better I am
> > happy
> > to do it that way.
> > 
> > > 
> > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > So the thing is, these state packets need to be filled one
> > > > > way or
> > > > > another. If we don't they will just have garbage, so in case
> > > > > of
> > > > > failure, writing 0 does not sound particularly bad to me, the
> > > > > focus
> > > > > needs to be on avoiding to execute things that are known to
> > > > > be
> > > > > broken,
> > > > > which is what we are trying to do with this series.
> > > > > 
> > > > > In other words, we could modify the functions in those chains
> > > > > above to
> > > > > return a VkResult, that should not be a big change, however,
> > > > > eventually, the generator script would have to do something
> > > > > depending
> > > > > on the result: if it is a failure there is not much we can
> > > > > do, we
> > > > > could
> > > > > skip writing the state packet, but that would only leave
> > > > > garbage,
> > > > > or we
> > > > > could decide to write it to 0, but then all this change would
> > > > > be
> > > > > for
> > > > > nothing anyway.
> > > > > 
> > > > > Also, I count  ~380 call sites of the _gen_combine_address
> > > > > macro
> > > > > in the
> > > > > autogenerated files, so for each call site, the generator
> > > > > would
> > > > > generate code that:
> > > > > 
> > > > > 1. declares a variable so it can pass a pointer through that
> > > > > chain of
> > > > > functions.
> > > > > 2. Checks the result
> > > > > 3. Depending on the result, decides what to write to the
> > > > > state
> > > > > packet.
> > > > > 
> > > > > to me it doesn't look worth the trouble when there is not a
> > > > > clear
> > > > > gain...
> > > > I agree, it gets totally out of hand. I'll take one more look
> > > > if
> > > > there are
> > > > any alternatives.
> > > > 


More information about the mesa-dev mailing list