[Intel-gfx] [PATCH 14/19] drm/i915: Reconstruct active state on starting busy-stats

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Jan 8 14:44:27 UTC 2018


On 08/01/2018 14:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-08 14:18:53)
>>
>> On 02/01/2018 15:12, Chris Wilson wrote:
>>> We have a hole in our busy-stat accounting if the pmu is enabled during
>>> a long running batch, the pmu will not start accumulating busy-time
>>> until the next context switch. This then fails tests that are only
>>> sampling a single batch.
>>
>> Oops, something in the recent perf_pmu rework uncovered this?
> 
> Yeah, the tweaks reduced some tests to submitting a single batch started
> before the counters. Previously all tests started the counters then
> submitted the workload. I had a test planned, obviously forgot to write
> it. I couldn't think of a way to guarantee hitting the lite-restore case
> though.

A spin batch from ctx 1 and another from ctx 2 after a short delay, 
wouldn't that do it? Or at least reliably?
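
Roughly, in perf_pmu terms (untested sketch; assuming the
igt_spin_batch_new()/igt_spin_batch_free() helpers from igt_dummyload,
the usual gem_context_create()/open_pmu() helpers, and an engine
description e carrying the exec flags plus class/instance):

   uint32_t ctx[2] = { gem_context_create(fd), gem_context_create(fd) };
   igt_spin_t *spin[2];
   int pmu;

   spin[0] = igt_spin_batch_new(fd, ctx[0], e->flags, 0);
   usleep(100); /* let ctx[0] reach ELSP port 0 first */
   spin[1] = igt_spin_batch_new(fd, ctx[1], e->flags, 0);

   /* Both ports should be occupied before the counter is enabled. */
   pmu = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));

   /* ... sample busy time here, expect ~100% ... */

   igt_spin_batch_free(fd, spin[1]);
   igt_spin_batch_free(fd, spin[0]);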

>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/intel_engine_cs.c | 9 ++++++++-
>>>    1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index ebdcbcbacb3c..c2f01ff96ce1 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -1946,8 +1946,15 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>>>        spin_lock_irqsave(&engine->stats.lock, flags);
>>>        if (engine->stats.enabled == ~0)
>>>                goto busy;
>>> -     if (engine->stats.enabled++ == 0)
>>> +     if (engine->stats.enabled++ == 0) {
>>>                engine->stats.enabled_at = ktime_get();
>>> +
>>> +             /* XXX submission method oblivious */
>>
>> You mean once busy stats support for the GuC gets added?
> 
> I was just thinking that we shouldn't be poking into execlists here and
> that we need an agnostic method for execlists to tell busy_stats that it
> is starting active. Or something.

Okay, yes. We would need to define explicitly what the API returns, 
perhaps the number of contexts submitted to the hardware. But that 
could be trouble for the GuC backend. Will think about it.
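
Something like this maybe (hypothetical names, sketch only):

   /* in struct intel_engine_cs, next to the other submission vfuncs */
   unsigned int (*stats_active)(struct intel_engine_cs *engine);

   /* execlists backend implementation */
   static unsigned int
   execlists_stats_active(struct intel_engine_cs *engine)
   {
           struct intel_engine_execlists *execlists = &engine->execlists;

           return port_count(&execlists->port[0]) +
                  port_count(&execlists->port[1]);
   }

Then intel_enable_engine_stats() would call engine->stats_active(engine)
instead of poking into execlists directly, and the GuC backend could
later supply its own version. (Whether to sum port counts or only count
occupied ports is exactly the question below.)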

>>> +             engine->stats.active = port_count(&engine->execlists.port[1]);
>>> +             engine->stats.active += port_count(&engine->execlists.port[0]);
>>
>> Hm, wouldn't this have the potential to over-account the active counter
>> resulting in permanent 100%? Port count is [0, 2], where 2 is
>> re-submission of port 0, while engine->stats.active is [0, 2], where
>> this 2 is the number of ports. In other words I think the correct method
>> would be:
>>
>> if (port_count(0))
>>          active++;
>> if (port_count(1))
>>          active++;
> 
> I thought we needed for stats.active to reflect the number of pending
> context_out we will report. So for the lite-restore case, I thought 2
> was the correct value.

Hm, so am I mistaken that we would still get only one context_out 
(the port->count to zero transition) from port0 in the lite-restore 
case? I can't spot it in the code at the moment, but who knows.
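
For reference, what I mean in terms of the actual helpers (untested):

   engine->stats.active = 0;
   if (port_count(&engine->execlists.port[0]))
           engine->stats.active++;
   if (port_count(&engine->execlists.port[1]))
           engine->stats.active++;

That is, count occupied ports rather than summing submissions, so a
lite-restore (port[0] count of 2) contributes only the single
context_out transition we would eventually see.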

Regards,

Tvrtko

