[Intel-gfx] [RFC 08/12] drm/i915: Expose per-engine client busyness
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Mar 11 10:17:21 UTC 2020
On 10/03/2020 20:12, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-03-10 20:04:23)
>>
>> On 10/03/2020 18:32, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-03-09 18:31:25)
>>>> +static ssize_t
>>>> +show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
>>>> +{
>>>> +	struct i915_engine_busy_attribute *i915_attr =
>>>> +		container_of(attr, typeof(*i915_attr), attr);
>>>> +	unsigned int class = i915_attr->engine_class;
>>>> +	struct i915_drm_client *client = i915_attr->client;
>>>> +	u64 total = atomic64_read(&client->past_runtime[class]);
>>>> +	struct list_head *list = &client->ctx_list;
>>>> +	struct i915_gem_context *ctx;
>>>> +
>>>> +	rcu_read_lock();
>>>> +	list_for_each_entry_rcu(ctx, list, client_link) {
>>>> +		total += atomic64_read(&ctx->past_runtime[class]);
>>>> +		total += pphwsp_busy_add(ctx, class);
>>>> +	}
>>>> +	rcu_read_unlock();
>>>> +
>>>> +	total *= RUNTIME_INFO(i915_attr->i915)->cs_timestamp_period_ns;
>>>
>>> Planning early retirement? In 600 years, they'll have forgotten how to
>>> email ;)
>>
>> Shruggety shrug. :) I am guessing you would prefer both internal
>> representations (sw and pphwsp runtimes) to be consistently in
>> nanoseconds? I thought why multiply at various places when once at the
>> readout time is enough.
>
> It's fine. I was just double checking overflow, and then remembered the
> end result is 64b nanoseconds.
>
> Keep the internal representation convenient for accumulation, and the
> conversion at the boundary.
>
>> And I should mention again how I am not sure at the moment how to meld
>> the two stats into one more "perfect" output.
>
> One of the things that crossed my mind was wondering if it was possible
> to throw in a pulse before reading the stats (if active etc). Usual
> dilemma with non-preemptible contexts, so probably not worth it as those
> hogs will remain hogs.
>
> And I worry about the disparity between sw busy and hw runtime.
How about I stop tracking accumulated sw runtime and use it only for the
active portion? The report would then be hw runtime + sw active runtime.
In other words, sw tracking would cover only the portion between
context_in and context_out. Sounds worth a try.
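
A rough userspace sketch of the idea, to make the accounting concrete. All
names here (struct ctx_stats, stats_context_in, stats_context_out,
stats_busy_ns) are hypothetical and not from the patch. It assumes the hw
runtime only advances when the context is switched out, so the sw timestamp
covers just the currently running slice. The hw part stays in timestamp ticks
and is converted to nanoseconds only at readout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-context stats; not the structures used in the patch. */
struct ctx_stats {
	uint64_t hw_runtime_ticks;   /* accumulated hw runtime, in CS timestamp ticks */
	uint64_t sw_active_since_ns; /* sw timestamp taken at context_in */
	bool active;                 /* true between context_in and context_out */
};

/* Context scheduled in: start sw tracking of the active portion. */
static void stats_context_in(struct ctx_stats *s, uint64_t now_ns)
{
	s->sw_active_since_ns = now_ns;
	s->active = true;
}

/*
 * Context scheduled out: assume the hw has now accounted for the slice,
 * so fold its ticks into the total and drop the sw portion.
 */
static void stats_context_out(struct ctx_stats *s, uint64_t hw_ticks)
{
	s->hw_runtime_ticks += hw_ticks;
	s->active = false;
}

/* Readout: hw runtime converted at the boundary, plus sw active runtime. */
static uint64_t stats_busy_ns(const struct ctx_stats *s, uint64_t now_ns,
			      uint32_t period_ns)
{
	uint64_t total = s->hw_runtime_ticks * period_ns;

	if (s->active)
		total += now_ns - s->sw_active_since_ns;

	return total;
}
```

With this split, an idle context reports pure hw runtime, and a running one
adds only the time since its last context_in, so the two sources should not
overlap as long as the assumption about when hw runtime updates holds.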
Regards,
Tvrtko