[Intel-gfx] [PATCH 1/6] drm/i915: Limit C-states when waiting for the active request

Chris Wilson chris at chris-wilson.co.uk
Mon Aug 6 09:59:40 UTC 2018


Quoting Tvrtko Ursulin (2018-08-06 10:34:54)
> 
> On 06/08/2018 09:30, Chris Wilson wrote:
> > If we are waiting for the currently executing request, we have a good
> > idea that it will be completed in the very near future and so want to
> > cap the CPU_DMA_LATENCY to ensure that we wake up the client quickly.
> 
> I cannot shake the opinion that we shouldn't be doing this. For instance, 
> what if the client has been re-niced (down), or it has re-niced itself? 
> It would obviously be wrong to apply this in those cases.

Niceness only affects the task's position on the scheduler runqueue; it
doesn't actually have any cpufreq implications (give or take RT
heuristics). So I don't think we need a tsk->prio restriction.
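
For reference, the mechanism here is just to hold a CPU_DMA_LATENCY PM QoS
request for the duration of the wait, roughly along these lines (a
simplified sketch, not the actual patch):

	struct pm_qos_request qos;

	/* Ask for zero wakeup latency, keeping the CPU out of deep
	 * C-states while we wait on the currently executing request.
	 */
	pm_qos_add_request(&qos, PM_QOS_CPU_DMA_LATENCY, 0);

	/* ... sleep until the request signals ... */

	pm_qos_remove_request(&qos);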

> Or when you say we have a good idea something will be completed in the 
> very near future: say there is a 60fps workload which is sending 5ms 
> batches and waiting on them. That would be 30% of the time spent outside 
> of low C-states for a workload which doesn't need it.

Quite frankly, they shouldn't be waiting on the current frame. For
example, in mesa you wait for the end of the previous frame, which should
be roughly complete by then, and since it is a stall before computing the
next frame, latency is still important.
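
For illustration, that userspace pattern looks roughly like this (a sketch
only, assuming a GL 3.2 context; draw_frame()/swap_buffers() are
hypothetical stand-ins, not mesa's actual code):

	GLsync prev = NULL;

	for (;;) {
		if (prev) {
			/* Frame N-1 should be (nearly) complete by now,
			 * so the stall is short but wakeup latency still
			 * matters.
			 */
			glClientWaitSync(prev, GL_SYNC_FLUSH_COMMANDS_BIT,
					 16ull * 1000 * 1000);
			glDeleteSync(prev);
		}

		draw_frame();	/* build and submit frame N */
		prev = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
		swap_buffers();
	}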
 
> Also, having read what the OpenCL driver does, where they want to apply 
> different wait optimisations for different call-sites, the idea that we 
> should instead be introducing a low-latency flag to the wait ioctl sounds 
> more appropriate.

I'm not impressed by what I've heard there yet. There's also the
dilemma of what to do with dma-fence poll().

> > +             if (!qos &&
> > +                 i915_seqno_passed(intel_engine_get_seqno(rq->engine),
> > +                                   wait.seqno - 1))
> 
> I also realized that this will get incorrectly applied when there is 
> preemption. If a low-priority request gets preempted after we applied 
> the PM QoS, it will persist for much longer than intended. (Until the 
> high-prio request completes and then the low-prio one.) And the explicit 
> low-latency wait flag would have the same problem. We could perhaps go 
> with removing the PM QoS request if preempted. It should not be frequent 
> enough to cause issues with too much traffic on the API. But

Sure, I didn't think it was worth worrying about. We could cancel it and
reset it on next execution.
 
> Another side note - a quick grep shows there are a few other "seqno - 1" 
> callsites, so perhaps we should add a helper for this with a more 
> self-explanatory name like __i915_seqno_is_executing(engine, seqno) or something?

I briefly considered something along those lines,
intel_engine_has_signaled(), intel_engine_has_started(). I also noticed
that I didn't kill i915_request_started even though I thought we had.
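
Something minimal along those lines might look like (a sketch only; name
and placement to be decided):

	static inline bool
	__i915_seqno_is_executing(struct intel_engine_cs *engine, u32 seqno)
	{
		/* The engine has started (or already completed) the
		 * request once it has passed the preceding seqno.
		 */
		return i915_seqno_passed(intel_engine_get_seqno(engine),
					 seqno - 1);
	}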
-Chris
