[Intel-gfx] [RFC v2 0/6] Queued/runnable/running engine stats
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Jan 24 18:01:14 UTC 2018
On 22/01/2018 18:52, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-22 18:43:52)
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Per-engine queue depths are an interesting metric for analyzing the system load
>> and also for users who wish to load balance their submissions based on them.
>>
>> In this version I have split the metrics into three separate counters:
>>
>> 1. QUEUED - From execbuf time until the request is runnable, that is, until its
>> dependencies have been resolved and fences signaled.
>> 2. RUNNABLE - From runnable to running on the GPU.
>> 3. RUNNING - Running on the GPU.
>>
>> When inspected with perf stat the output looks roughly like this:
>>
>> #          time   counts unit events
>>   201.160490145     0.01      i915/rcs0-queued/
>>   201.160490145    19.13      i915/rcs0-runnable/
>>   201.160490145     2.39      i915/rcs0-running/
>>
>> The reported numbers are average queue depths for the last query period.
>>
>> Having the metrics split out should be more flexible for all users, and it is
>> still possible to fetch an atomic snapshot of all three using perf groups, for
>> those wanting to combine them.
>>
>> For users wanting instantaneous numbers instead of averaged, we could potentially
>> expose them using the query API Lionel is working on.
>> (https://patchwork.freedesktop.org/series/36622/)
>>
>> For instance a query packet could look like:
>>
>> #define DRM_I915_QUERY_ENGINE_QUEUES 0x04
>>
>> struct drm_i915_query_engine_queues {
>>         __u8 class;
>>         __u8 instance;
>>
>>         __u8 pad[2];
>>
>>         __u32 queued;
>>         __u32 runnable;
>>         __u32 running;
>> };
>>
>> I also have patches to expose this via intel-gpu-top, using the perf API.
>
> Can you stick a ewma loadavg just after the hostname in intel-gpu-overlay,
> pretty please? :)
Sure, just one period and all three counters aggregated?
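
To make the question concrete, here is a minimal sketch of what "one period, all three counters aggregated" could mean. It is only an illustration, not code from the series or from intel-gpu-overlay: the struct and function names are hypothetical, the averaging period is an assumed parameter, and the smoothing weight dt / period is a simple first-order approximation rather than an exact exponential decay.

```c
#include <assert.h>

/* Hypothetical state for a single-period EWMA load average. */
struct ewma_load {
	double value;	/* current smoothed load */
	double period;	/* averaging period in seconds (assumed parameter) */
	int primed;	/* zero until the first sample has been taken */
};

/*
 * Fold one sample into the average. The sample is the sum of the three
 * average queue depths (queued + runnable + running) reported for the
 * last query interval of dt seconds.
 */
static double ewma_load_update(struct ewma_load *la, double queued,
			       double runnable, double running, double dt)
{
	double sample = queued + runnable + running;
	double alpha;

	assert(la->period > 0.0 && dt >= 0.0);

	/* First-order weight, clamped so long intervals just take the sample. */
	alpha = dt / la->period;
	if (alpha > 1.0)
		alpha = 1.0;

	if (!la->primed) {
		/* Seed with the first sample instead of ramping up from zero. */
		la->value = sample;
		la->primed = 1;
	} else {
		la->value += alpha * (sample - la->value);
	}

	return la->value;
}
```

The overlay would call ewma_load_update() once per sampling interval with the three per-engine values and print the returned number next to the hostname.
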
Regards,
Tvrtko