[Intel-gfx] [PATCH 07/10] drm/i915/execlists: Cancel banned contexts on schedule-out

Chris Wilson chris at chris-wilson.co.uk
Fri Oct 11 10:15:58 UTC 2019


Quoting Tvrtko Ursulin (2019-10-11 10:47:26)
> > +static void
> > +mark_complete(struct i915_request *rq, struct intel_engine_cs *engine)
> > +{
> > +     const struct intel_timeline * const tl = rcu_dereference(rq->timeline);
> > +
> > +     *(u32 *)tl->hwsp_seqno = rq->fence.seqno;
> > +     GEM_BUG_ON(!i915_request_completed(rq));
> > +
> > +     list_for_each_entry_from_reverse(rq, &tl->requests, link) {
> > +             if (i915_request_signaled(rq))
> > +                     break;
> > +
> > +             mark_eio(rq);
> 
> This would -EIO requests which have potentially been completed but not
> retired yet? If so, why?

Hmm. That's a bit of an oversight, yes.
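
Something along these lines should cover it (a rough sketch, untested;
assuming mark_eio() stays a local helper in this patch):

static void mark_eio(struct i915_request *rq)
{
        /* Leave requests that have already completed untouched */
        if (i915_request_completed(rq))
                return;

        GEM_BUG_ON(i915_request_signaled(rq));

        dma_fence_set_error(&rq->fence, -EIO);
        i915_request_mark_complete(rq);
}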

> > +     }
> > +
> > +     intel_engine_queue_breadcrumbs(engine);
> > +}
> > +
> > +static void cancel_active(struct i915_request *rq,
> > +                       struct intel_engine_cs *engine)
> > +{
> > +     struct intel_context * const ce = rq->hw_context;
> > +     u32 *regs = ce->lrc_reg_state;
> > +
> > +     if (i915_request_completed(rq))
> > +             return;
> > +
> > +     GEM_TRACE("%s(%s): { rq=%llx:%lld }\n",
> > +               __func__, engine->name, rq->fence.context, rq->fence.seqno);
> > +     __context_pin_acquire(ce);
> > +
> > +     /* Scrub the context image to prevent replaying the previous batch */
> > +     memcpy(regs, /* skip restoring the vanilla PPHWSP */
> > +            engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
> > +            engine->context_size - PAGE_SIZE);
> 
> context_size - LRC_STATE_PN * PAGE_SIZE ?

context_size excludes the GuC header pages, so it's a bit of a kerfuffle.
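
To spell out the layout as I read it (illustration only; page indices
per the LRC_*_PN defines of the day):

/*
 * Context object layout, one page per slot:
 *
 *   page 0:  GuC shared data    (LRC_GUCSHR_PN)
 *   page 1:  PPHWSP             (LRC_PPHWSP_PN)
 *   page 2+: register state     (LRC_STATE_PN)
 *
 * engine->context_size counts only the PPHWSP + register state,
 * not the GuC page. The register state is therefore
 * context_size - PAGE_SIZE bytes long, but it begins at
 * LRC_STATE_PN * PAGE_SIZE from the start of the object, which is
 * why subtracting LRC_STATE_PN * PAGE_SIZE from context_size would
 * shortchange the copy.
 */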
 
> > +     execlists_init_reg_state(regs, ce, engine, ce->ring, false);
> > +
> > +     /* Ring will be advanced on retire; here we need to reset the context */
> > +     ce->ring->head = intel_ring_wrap(ce->ring, rq->wa_tail);
> > +     __execlists_update_reg_state(ce, engine);
> > +
> > +     /* We've switched away, so this should be a no-op, but intent matters */
> > +     ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
> > +
> > +     /* Let everyone know that the request may now be retired */
> > +     rcu_read_lock();
> > +     mark_complete(rq, engine);
> > +     rcu_read_unlock();
> > +
> > +     __context_pin_release(ce);
> > +}
> > +
> >   static inline void
> >   __execlists_schedule_out(struct i915_request *rq,
> >                        struct intel_engine_cs * const engine)
> > @@ -1032,6 +1087,9 @@ __execlists_schedule_out(struct i915_request *rq,
> >       execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> >       intel_gt_pm_put(engine->gt);
> >   
> > +     if (unlikely(i915_gem_context_is_banned(ce->gem_context)))
> > +             cancel_active(rq, engine);
> 
> Or are you counting on this already being the last runnable request
> from this context due to coalescing? That wouldn't work if coalescing
> were prevented for any reason -- either by GVT, or by some ideas I had
> for preventing coalescing on contexts with the watchdog enabled in the
> future. In that case this would be a hidden gotcha. Maybe all that's
> needed in mark_complete is also to look towards the end of the list?

I'm not following. We are looking at the context here, which is tracked
by the last request submitted for that context.
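
To make that concrete (hypothetical requests a1..a3 on one context,
illustration only):

/*
 * With coalescing, a single ELSP submission for the context may
 * cover a1..a3; schedule-out then sees a3, and mark_complete(a3)
 * walks the timeline backwards over a2 and a1. Without coalescing,
 * each request is submitted and scheduled out individually, so a
 * banned schedule-out still operates on whatever was the context's
 * last submitted request at that point.
 */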
-Chris

