[Intel-gfx] [RFC 05/12] drm/i915: Track runtime spent in unreachable intel_contexts
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue Mar 10 20:00:18 UTC 2020
On 10/03/2020 18:25, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-03-09 18:31:22)
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> As contexts are abandoned we want to remember how much GPU time they used
>> (per class) so later we can use it for smarter purposes.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> ---
>> drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++++++++++-
>> drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 5 +++++
>> 2 files changed, 17 insertions(+), 1 deletion(-)
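[Note: the i915_gem_context_types.h hunk is snipped below. What it adds
is roughly the following per-class accumulator; this is only a sketch,
with the MAX_ENGINE_CLASS + 1 bound inferred from the ARRAY_SIZE() check
in the loop rather than quoted from the patch:]

	/*
	 * New member of struct i915_gem_context: runtime accumulated
	 * from freed engine sets, indexed by engine->uabi_class.
	 * (Array bound inferred, not quoted from the patch.)
	 */
	atomic64_t past_runtime[MAX_ENGINE_CLASS + 1];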
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> index abc3a3e2fcf1..5f6861a36655 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> @@ -257,7 +257,19 @@ static void free_engines_rcu(struct rcu_head *rcu)
>> {
>> struct i915_gem_engines *engines =
>> container_of(rcu, struct i915_gem_engines, rcu);
>> + struct i915_gem_context *ctx = engines->ctx;
>> + struct i915_gem_engines_iter it;
>> + struct intel_context *ce;
>> +
>> + /* Transfer accumulated runtime to the parent GEM context. */
>> + for_each_gem_engine(ce, engines, it) {
>> + unsigned int class = ce->engine->uabi_class;
>> +
>> + GEM_BUG_ON(class >= ARRAY_SIZE(ctx->past_runtime));
>> + atomic64_add(ce->runtime.total, &ctx->past_runtime[class]);
>> + }
>
> -> give this its own routine.
Ack.
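Something like this, then (untested sketch; accumulate_runtime() is a
placeholder name, the body is lifted from the hunk above):

static void accumulate_runtime(struct i915_gem_engines *engines)
{
	struct i915_gem_context *ctx = engines->ctx;
	struct i915_gem_engines_iter it;
	struct intel_context *ce;

	/* Transfer accumulated runtime to the parent GEM context. */
	for_each_gem_engine(ce, engines, it) {
		unsigned int class = ce->engine->uabi_class;

		GEM_BUG_ON(class >= ARRAY_SIZE(ctx->past_runtime));
		atomic64_add(ce->runtime.total, &ctx->past_runtime[class]);
	}
}

static void free_engines_rcu(struct rcu_head *rcu)
{
	struct i915_gem_engines *engines =
		container_of(rcu, struct i915_gem_engines, rcu);

	accumulate_runtime(engines);

	i915_gem_context_put(engines->ctx);
	i915_sw_fence_fini(&engines->fence);
	free_engines(engines);
}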
>> +
>> + i915_gem_context_put(ctx);
>> i915_sw_fence_fini(&engines->fence);
>> free_engines(engines);
>> }
>> @@ -540,7 +552,6 @@ static int engines_notify(struct i915_sw_fence *fence,
>> list_del(&engines->link);
>> spin_unlock_irqrestore(&ctx->stale.lock, flags);
>> }
>> - i915_gem_context_put(engines->ctx);
>
> Or accumulate here? Here we know the engines are idle and released,
> albeit there is the delay in accumulating after the swap. I'm not going
> to worry about that; with live replacement of engines I don't expect
> anyone to notice the busy stats being off for a bit. Worst case is that
> they see a sudden jump, but typical practice will be to set engines up
> before they begin activity. We only have to worry about whether the
> transient misleading stats can be exploited.
It was even here initially, but then I started fearing it may not be the
last unpin of the intel_context (a context save/completion could still
be pending), so it sounded safer to make it run really, really last.

But I guess you are right in saying that a small error when replacing
engines should not be a large concern. If I move the accumulation back
here I don't need the intel_context->closed patch any more, so that's a
plus.

Then again, the error could be large if the context ran for quite some
time. Hm... I think I still prefer to be safe and accumulate as late as
possible.
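For reference, the alternative placement under discussion would look
roughly like this. Again only a sketch: the body of engines_notify() is
paraphrased from memory rather than quoted, and accumulate_runtime() is
the placeholder helper from the sketch above.

static int __i915_sw_fence_call
engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
{
	struct i915_gem_engines *engines =
		container_of(fence, typeof(*engines), fence);

	switch (state) {
	case FENCE_COMPLETE:
		if (!list_empty(&engines->link)) {
			struct i915_gem_context *ctx = engines->ctx;
			unsigned long flags;

			spin_lock_irqsave(&ctx->stale.lock, flags);
			list_del(&engines->link);
			spin_unlock_irqrestore(&ctx->stale.lock, flags);
		}

		/*
		 * Accumulating here is earlier than the RCU callback, so
		 * it may miss a final context save still in flight - the
		 * concern discussed above.
		 */
		accumulate_runtime(engines);
		i915_gem_context_put(engines->ctx);
		break;

	case FENCE_FREE:
		init_rcu_head(&engines->rcu);
		call_rcu(&engines->rcu, free_engines_rcu);
		break;
	}

	return NOTIFY_DONE;
}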
Regards,
Tvrtko