[Intel-gfx] [PATCH 1/5] drm/i915: Track per-context engine busyness
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Jan 30 18:05:03 UTC 2020
On 16/12/2019 13:09, Tvrtko Ursulin wrote:
>
> On 16/12/2019 12:40, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2019-12-16 12:07:00)
>>> @@ -1389,6 +1415,9 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>>                 write_desc(execlists,
>>>                            rq ? execlists_update_context(rq) : 0,
>>>                            n);
>>> +
>>> +               if (n == 0)
>>> +                       intel_context_stats_start(&rq->hw_context->stats);
>>
>> Too early? (Think preemption requests that may not begin for a few
>> hundred ms.) Mark it as started on promotion instead (should be within a
>> few microseconds, if not ideally a few tens of ns)? Then you will also
>> have better symmetry in process_csb, suggesting that we can have a
>> routine that takes the current *execlists->active with fewer code changes.
>
> Good point, I was disliking the csb latencies and completely missed the
> preemption side of things. Symmetry will be much better in more than one
> aspect.
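
For reference, this is roughly the shape of the accounting being discussed, as a heavily simplified userspace model rather than actual i915 code (the names, the clock source and the surrounding structure are all made up for illustration): record a timestamp when the CSB says the context was promoted to the hardware, and accumulate the delta when it switches out.

/* Simplified model of per-context busyness tracking; not i915 code. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

struct ctx_stats {
	uint64_t start_ns;	/* last promotion to HW, 0 when not running */
	uint64_t total_ns;	/* accumulated busy time */
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* CSB reports the context actually started executing (promotion). */
static void ctx_stats_start(struct ctx_stats *s)
{
	s->start_ns = now_ns();
}

/* CSB reports the context switched out (completed or preempted). */
static void ctx_stats_stop(struct ctx_stats *s)
{
	if (s->start_ns) {
		s->total_ns += now_ns() - s->start_ns;
		s->start_ns = 0;
	}
}

int main(void)
{
	struct ctx_stats s = {};

	ctx_stats_start(&s);
	/* ... context runs on the hardware ... */
	ctx_stats_stop(&s);
	printf("busy: %llu ns\n", (unsigned long long)s.total_ns);
	return 0;
}

Starting the clock from the promotion event rather than from submit_ports() is what keeps time spent queued behind a pending preemption out of the busy figure.
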
The downside of having it in process_csb() is really bad accuracy with
short batches like gem_exec_nop. :( I think it is the process_csb()
latency. It gets a little better for this particular workload if I move
the start point to submit_ports(), but that has the other problem with
preemption.

After these woes I was hopeful the pphwsp context runtime could have an
advantage here, but then I discovered it is occasionally not monotonic.
At least with the spammy gem_exec_nop it occasionally, but regularly,
jumps a tiny bit backward:
[ 8802.082980] (new=7282101 old=7282063 d=38)
[ 8802.083007] (new=7282139 old=7282101 d=38)
[ 8802.083051] (new=7282250 old=7282139 d=111)
[ 8802.083077] (new=7282214 old=7282250 d=-36)
[ 8802.083103] (new=7282255 old=7282214 d=41)
[ 8802.083129] (new=7282293 old=7282255 d=38)
[ 8802.083155] (new=7282331 old=7282293 d=38)
Ouch. Time to sleep on it.
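
If the backward jumps stay as small as above, one obvious band-aid would be to never report a value lower than the previous one. A minimal sketch of such a clamp (hypothetical helper, not part of this series):

/* Hypothetical monotonic clamp for a counter that wobbles slightly backwards. */
#include <stdint.h>

struct runtime_sample {
	uint64_t last_reported;
};

/* Return a monotonic view of 'raw'; small backward jumps are hidden. */
static uint64_t runtime_monotonic(struct runtime_sample *s, uint64_t raw)
{
	if (raw > s->last_reported)
		s->last_reported = raw;

	return s->last_reported;
}

That only hides the symptom, of course; it does not explain why the pphwsp value goes backwards in the first place.
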
Regards,
Tvrtko