[Intel-gfx] [PATCH 11/15] drm/i915/execlists: Cancel banned contexts on schedule-out

Chris Wilson chris at chris-wilson.co.uk
Mon Oct 14 13:23:02 UTC 2019


Quoting Tvrtko Ursulin (2019-10-14 14:19:19)
> 
> On 14/10/2019 14:13, Chris Wilson wrote:
> > Quoting Chris Wilson (2019-10-14 13:34:35)
> >> Quoting Tvrtko Ursulin (2019-10-14 13:25:58)
> >>>
> >>> On 14/10/2019 13:06, Chris Wilson wrote:
> >>>> Quoting Tvrtko Ursulin (2019-10-14 13:00:01)
> >>>>>
> >>>>> On 14/10/2019 10:07, Chris Wilson wrote:
> >>>>>> +static void cancel_active(struct i915_request *rq,
> >>>>>> +                       struct intel_engine_cs *engine)
> >>>>>> +{
> >>>>>> +     struct intel_context * const ce = rq->hw_context;
> >>>>>> +
> >>>>>> +     /*
> >>>>>> +      * The executing context has been cancelled. Fixup the context so that
> >>>>>> +      * it will be marked as incomplete [-EIO] upon resubmission and not
> >>>
> >>> (read below first)
> >>>
> >>> ... and not misleadingly say "Fixup the context so that it will be
> >>> marked as incomplete" because there is nothing in this function which
> >>> does that. It mostly happens by the virtual of context already being
> >>> marked as banned somewhere else. This comment should just explain the
> >>> decision to rewind the ring->head for more determinism. It can still
> >>> mention canceling of user payload and -EIO. Just needs to be clear of
> >>> the single extra thing achieved here by the head rewind and context edit.
> >>
> >> I thought I was clear: "upon resubmission". So use a more active voice to
> >> reduce ambiguity, gotcha.
> > 
> >          /*
> >           * The executing context has been cancelled. We want to prevent
> >           * further execution along this context and propagate the error on
> >           * to anything depending on its results.
> >           *
> >           * In __i915_request_submit(), we apply the -EIO and remove the
> >           * requests payload for any banned requests. But first, we must
> >           * rewind the context back to the start of the incomplete request so
> >           * that we don't jump back into the middle of the batch.
> >           *
> >           * We preserve the breadcrumbs and semaphores of the incomplete
> >           * requests so that inter-timeline dependencies (i.e other timelines)
> >           * remain correctly ordered.
> >           */
> > 
> 
> Sounds good.
> 
> Btw.. does this work? :) If context was preempted in the middle of a 
> batch buffer there must be some other state saved (other than RING_HEAD) 
> which on context restore enables it to continue from the right place 
> *within* the batch. Is this code zapping that state as well so GPU will 
> fully forget it was inside the batch?

Yes. We are resetting the context image back to vanilla, and then
restore the ring registers to restart from this request. The selftests
are using spinning batches to simulate the banned context.
-Chris


More information about the Intel-gfx mailing list