[Intel-gfx] [PATCH] drm/i915/preemption: Allow preemption between submission ports

Chris Wilson chris at chris-wilson.co.uk
Fri Feb 23 14:23:08 UTC 2018


Quoting Mika Kuoppala (2018-02-23 14:06:06)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> >  static inline bool is_high_priority(struct intel_guc_client *client)
> >  {
> >       return (client->priority == GUC_CLIENT_PRIORITY_KMD_HIGH ||
> > @@ -682,15 +687,12 @@ static void guc_dequeue(struct intel_engine_cs *engine)
> >       rb = execlists->first;
> >       GEM_BUG_ON(rb_first(&execlists->queue) != rb);
> >  
> > -     if (!rb)
> > -             goto unlock;
> > -
> >       if (port_isset(port)) {
> >               if (engine->i915->preempt_context) {
> >                       struct guc_preempt_work *preempt_work =
> >                               &engine->i915->guc.preempt_work[engine->id];
> >  
> > -                     if (rb_entry(rb, struct i915_priolist, node)->priority >
> > +                     if (execlists->queue_priority >
> >                           max(port_request(port)->priotree.priority, 0)) {
> >                               execlists_set_active(execlists,
> >                                                    EXECLISTS_ACTIVE_PREEMPT);
> 
> This is the priority inversion part? We preempt and clear the ports
> to rearrange if the last port has a higher priority request?

Yes, along with a strong kick from

> > @@ -1050,8 +1068,9 @@ static void execlists_schedule(struct i915_request *request, int prio)
> >               pt->priority = prio;
> >               if (!list_empty(&pt->link)) {
> >                       __list_del_entry(&pt->link);
> > -                     insert_request(engine, pt, prio);
> > +                     queue_request(engine, pt, prio);
> >               }
> > +             submit_queue(engine, prio);

So that we re-evaluate ELSP if the active prio changes.

> > @@ -264,7 +281,7 @@ lookup_priolist(struct intel_engine_cs *engine,
> >       if (first)
> >               execlists->first = &p->node;
> >  
> > -     return ptr_pack_bits(p, first, 1);
> > +     return p;
> 
> Hmm, there is no need to decode first as we always resubmit
> the queue depending on the prio?

Right, the check of the first bit is replaced by a comparison against
queue_priority instead. If we are higher priority than the queue, we
must rerun the tasklet. Whereas before we knew we only had to do so if
we inserted at the start of the queue.

> > @@ -453,12 +467,17 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
> >                  _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT |
> >                                     CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT));
> >  
> > +     /*
> > +      * Switch to our empty preempt context so
> > +      * the state of the GPU is known (idle).
> > +      */
> >       GEM_TRACE("%s\n", engine->name);
> >       for (n = execlists_num_ports(&engine->execlists); --n; )
> >               elsp_write(0, engine->execlists.elsp);
> >  
> >       elsp_write(ce->lrc_desc, engine->execlists.elsp);
> >       execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
> > +     execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT);
> 
> Surely a better place. Now looking at this would it be more prudent to
> move both the clear and set right before the last elsp write?
> 
> Well I guess it really doesn't matter as we hold the timeline lock.

And we only process the ELSP from a single CPU, so it's all sequential,
right.
-Chris
