[Intel-gfx] [PATCH 14/19] drm/i915: Reconstruct active state on starting busy-stats

Chris Wilson chris at chris-wilson.co.uk
Mon Jan 8 14:52:24 UTC 2018


Quoting Tvrtko Ursulin (2018-01-08 14:44:27)
> 
> On 08/01/2018 14:29, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-08 14:18:53)
> >>
> >> On 02/01/2018 15:12, Chris Wilson wrote:
> >>> We have a hole in our busy-stat accounting if the pmu is enabled during
> >>> a long running batch, the pmu will not start accumulating busy-time
> >>> until the next context switch. This then fails tests that are only
> >>> sampling a single batch.
> >>
> >> Oops, something in recent perf_pmu rework uncovered this?
> > 
> > Yeah, the tweaks reduced some tests to submitting a single batch started
> > before the counters. Previously all tests started the counters then
> > submitted the workload. I had a test planned, but obviously forgot to write
> > it. I couldn't think of a way to guarantee hitting the lite-restore case,
> > though.
> 
> A spin batch from ctx 1 and another from ctx 2, after a short delay, 
> wouldn't that do it? Or at least reliably?

You need:

spin batch ctx1
short batch ctx2
spin batch ctx2

Terminate spin_batch_ctx1 and then you have a very small window where we
get the CS interrupt from ctx1 and we resubmit ctx2 with both short+spin
batches, up until we then get the lite-restore CS interrupt.

That's the window you have to hit, a few microseconds.
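For reference, a minimal IGT-style sketch of that sequence (helper names such as
igt_spin_batch_new/end/free and gem_context_create are assumed to match the IGT
library of the time, and the PMU-open step is only indicated in a comment; treat
this as an illustration, not a finished test):

static void enable_in_lite_restore_window(int fd, unsigned int engine)
{
	uint32_t ctx1 = gem_context_create(fd);
	uint32_t ctx2 = gem_context_create(fd);
	igt_spin_t *spin1, *short2, *spin2;

	spin1 = igt_spin_batch_new(fd, ctx1, engine, 0);  /* spin batch ctx1 */
	short2 = igt_spin_batch_new(fd, ctx2, engine, 0); /* short batch ctx2... */
	igt_spin_batch_end(short2);			  /* ...ended so it completes quickly */
	spin2 = igt_spin_batch_new(fd, ctx2, engine, 0);  /* spin batch ctx2 */

	igt_spin_batch_end(spin1);	/* terminate spin_batch_ctx1 ... */

	/*
	 * ... then race the PMU enable (which ends up in
	 * intel_enable_engine_stats()) against the few-microsecond window
	 * between the ctx1 CS interrupt and the lite-restore CS interrupt.
	 * Opening I915_PMU_ENGINE_BUSY here is left as a comment since the
	 * exact perf-open helper is test-local.
	 */

	igt_spin_batch_free(fd, spin2);
	igt_spin_batch_free(fd, short2);
	igt_spin_batch_free(fd, spin1);
	gem_context_destroy(fd, ctx1);
	gem_context_destroy(fd, ctx2);
}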

> 
> >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/intel_engine_cs.c | 9 ++++++++-
> >>>    1 file changed, 8 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> >>> index ebdcbcbacb3c..c2f01ff96ce1 100644
> >>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> >>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> >>> @@ -1946,8 +1946,15 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
> >>>        spin_lock_irqsave(&engine->stats.lock, flags);
> >>>        if (engine->stats.enabled == ~0)
> >>>                goto busy;
> >>> -     if (engine->stats.enabled++ == 0)
> >>> +     if (engine->stats.enabled++ == 0) {
> >>>                engine->stats.enabled_at = ktime_get();
> >>> +
> >>> +             /* XXX submission method oblivious */
> >>
> >> You mean once busy stats support for the GuC gets added?
> > 
> > I was just thinking that we shouldn't be poking into execlists here and
> > that we need an agnostic method for execlists to tell busy_stats that it
> > is starting active. Or something.
> 
> Okay, yes. We would need to define explicitly what the API returns, perhaps 
> the number of contexts submitted to the hardware. But that could be trouble 
> for the GuC backend. Will think about it.
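To make that idea concrete, one hypothetical shape for such a hook (the names
and placement below are invented for illustration and do not exist in the
driver) would be a per-backend callback that reports how many contexts are
currently in flight on the hardware, so busy-stats never touches execlists
state directly:

/* Hypothetical sketch of a submission-method-agnostic query. */
struct intel_engine_cs;

struct intel_engine_stats_ops {
	/* Number of contexts currently submitted to the hardware. */
	unsigned int (*active_contexts)(struct intel_engine_cs *engine);
};

/*
 * The enable path would then do something like
 *	engine->stats.active = engine->stats_ops->active_contexts(engine);
 * with execlists and, eventually, the GuC backend each providing their
 * own implementation.
 */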
> 
> >>> +             engine->stats.active = port_count(&engine->execlists.port[1]);
> >>> +             engine->stats.active += port_count(&engine->execlists.port[0]);
> >>
> >> Hm, wouldn't this have the potential to over-account the active counter,
> >> resulting in a permanent 100%? The port count is [0, 2], where 2 means a
> >> re-submission on port 0, while engine->stats.active is [0, 2], where
> >> this 2 is the number of ports. In other words, I think the correct method
> >> would be:
> >>
> >> if (port_count(0))
> >>          active++;
> >> if (port_count(1))
> >>          active++;
> > 
> > I thought we needed stats.active to reflect the number of pending
> > context_out events we will report. So for the lite-restore case, I thought
> > 2 was the correct value.
> 
> Hm, so am I mistaken that we would still get only one context_out 
> (port->count to zero transition) from port0 in the lite-restore case? I 
> can't spot it in the code at the moment, but who knows.

Nope, my mistake. It is just the transitions to/from count==0 that are
relevant. More reason to try to hit the lite-restore window :)
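For completeness, since only the port->count to/from-zero transitions generate
context_in/context_out events, the enable path would presumably count occupied
ports rather than summing port_count(). A sketch of what the corrected hunk
might look like (an assumption about the follow-up, not the posted patch):

	/*
	 * Count ports that have a context on the hardware; ignore the
	 * lite-restore resubmission count, as only port->count
	 * to/from-zero transitions produce context_in/out events.
	 */
	engine->stats.active = 0;
	if (port_count(&engine->execlists.port[0]))
		engine->stats.active++;
	if (port_count(&engine->execlists.port[1]))
		engine->stats.active++;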
-Chris

