[Intel-gfx] [RFC v2 0/6] Queued/runnable/running engine stats
Chris Wilson
chris at chris-wilson.co.uk
Mon Jan 22 18:52:54 UTC 2018
Quoting Tvrtko Ursulin (2018-01-22 18:43:52)
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> Per-engine queue depths are an interesting metric for analyzing the system load
> and also for users who wish to use it to load balance their submissions based
> on it.
>
> In this version I have split the metrics into three separate counters:
>
> 1. QUEUED - From execbuf time to request being runnable - runnable meaning until
> dependencies have been resolved and fences signaled.
> 2. RUNNABLE - From runnable to running on the GPU.
> 3. RUNNING - Running on the GPU.
>
> When inspected with perf stat the output looks roughly like this:
>
> #           time             counts unit events
>    201.160490145               0.01      i915/rcs0-queued/
>    201.160490145              19.13      i915/rcs0-runnable/
>    201.160490145               2.39      i915/rcs0-running/
>
> The reported numbers are average queue depths for the last query period.
>
> Having split out metrics should be more flexible for all users, and it is still
> possible to fetch an atomic snapshot of all using the perf groups for those
> wanting to combine them.
>
> For users wanting instantaneous numbers instead of averaged, we could potentially
> expose them using the query API Lionel is working on.
> (https://patchwork.freedesktop.org/series/36622/)
>
> For instance a query packet could look like:
>
> #define DRM_I915_QUERY_ENGINE_QUEUES 0x04
>
> struct drm_i915_query_engine_queues {
>         __u8 class;
>         __u8 instance;
>
>         __u8 pad[2];
>
>         __u32 queued;
>         __u32 runnable;
>         __u32 running;
> };
>
> I also have patches to expose this via intel-gpu-top, using the perf API.
Can you stick a ewma loadavg just after the hostname in intel-gpu-overlay,
pretty please? :)
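
For reference, a minimal sketch of what such an EWMA load figure could look
like, assuming the overlay samples the three per-engine averages once per
fixed period. The names ewma_load, ewma_load_init and ewma_load_update are
hypothetical and exist in neither i915 nor intel-gpu-overlay, and treating
"load" as the sum of the queued, runnable and running depths is an assumption
made for illustration, not something the series defines.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of i915 or intel-gpu-overlay. */
struct ewma_load {
	double alpha;	/* smoothing factor in (0, 1]; larger reacts faster */
	double value;	/* current smoothed load */
	bool primed;	/* false until the first sample has been folded in */
};

static void ewma_load_init(struct ewma_load *e, double alpha)
{
	e->alpha = alpha;
	e->value = 0.0;
	e->primed = false;
}

/*
 * Fold one per-period sample into the moving average and return the new
 * value. Assumption: the load sample is queued + runnable + running for
 * a single engine, taken at a fixed sampling interval.
 */
static double ewma_load_update(struct ewma_load *e, double queued,
			       double runnable, double running)
{
	double sample = queued + runnable + running;

	if (!e->primed) {
		/* Seed with the first sample to avoid a slow ramp from zero. */
		e->value = sample;
		e->primed = true;
	} else {
		/* Standard EWMA step: move a fraction alpha toward the sample. */
		e->value += e->alpha * (sample - e->value);
	}

	return e->value;
}
```

A caller would keep one struct ewma_load per engine (or one for the whole
device), call ewma_load_update() with each new set of perf readings, and
print the returned value next to the hostname. A fixed alpha only makes sense
with a fixed sampling period; with irregular sampling the factor would have
to be derived from the elapsed time instead.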
-Chris