[Intel-gfx] [PATCH v4 0/7] Queued/runnable/running engine stats
tursulin at ursulin.net
Mon Mar 19 18:16:18 UTC 2018
From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Per-engine queue depths are a useful metric for analyzing system load, and also
for users who wish to load balance their submissions based on them.
In this version I have split the metrics into three separate counters:
1. QUEUED - From execbuf time until the request becomes runnable, that is, until
its dependencies have been resolved and fences signaled.
2. RUNNABLE - From runnable to running on the GPU.
3. RUNNING - Running on the GPU.
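To illustrate the three states above, here is a minimal sketch of how a request might move between per-engine counters. The struct and function names are hypothetical and chosen for illustration only; they are not taken from the patches, and the real driver would need atomic or locked updates.

```c
#include <assert.h>

/* Hypothetical per-engine counters; names are illustrative, not the driver's. */
struct engine_queue_stats {
	unsigned int queued;   /* submitted via execbuf, dependencies pending */
	unsigned int runnable; /* dependencies resolved, waiting for the GPU */
	unsigned int running;  /* currently executing on the GPU */
};

/* A request enters the driver at execbuf time. */
void request_submit(struct engine_queue_stats *s)
{
	s->queued++;
}

/* Dependencies resolved and fences signaled: QUEUED -> RUNNABLE. */
void request_runnable(struct engine_queue_stats *s)
{
	assert(s->queued > 0);
	s->queued--;
	s->runnable++;
}

/* The request starts executing on the GPU: RUNNABLE -> RUNNING. */
void request_start(struct engine_queue_stats *s)
{
	assert(s->runnable > 0);
	s->runnable--;
	s->running++;
}

/* The request has finished executing and leaves all three counters. */
void request_complete(struct engine_queue_stats *s)
{
	assert(s->running > 0);
	s->running--;
}
```

In this model each request is counted in exactly one of the three counters at any time, so their sum is the number of requests in flight on the engine.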
When inspected with perf stat the output looks roughly like this:
#           time             counts unit events
   201.160490145               0.01      i915/rcs0-queued/
   201.160490145              19.13      i915/rcs0-runnable/
   201.160490145               2.39      i915/rcs0-running/
The reported numbers are average queue depths for the last query period.
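As a rough sketch of what "average queue depth for the last query period" can mean, the following assumes a time-weighted accumulator that is sampled periodically and read as a delta between two snapshots. This is an assumption made for illustration; the accumulator layout and function names are hypothetical and may differ from how the i915 PMU actually scales its counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accumulator: each sample adds depth weighted by elapsed time. */
struct depth_accum {
	uint64_t depth_ns; /* sum of (depth * nanoseconds spent at that depth) */
	uint64_t total_ns; /* total time covered by all samples */
};

/* Record that the queue held 'depth' requests for 'elapsed_ns' nanoseconds. */
void depth_accum_sample(struct depth_accum *a, unsigned int depth,
			uint64_t elapsed_ns)
{
	a->depth_ns += (uint64_t)depth * elapsed_ns;
	a->total_ns += elapsed_ns;
}

/* Average depth over the period between two snapshots of the accumulator. */
double depth_average(const struct depth_accum *prev,
		     const struct depth_accum *now)
{
	uint64_t dt = now->total_ns - prev->total_ns;

	if (dt == 0)
		return 0.0; /* empty period: report zero rather than divide by 0 */

	return (double)(now->depth_ns - prev->depth_ns) / (double)dt;
}
```

For example, a depth of 2 held for 1000 ns followed by a depth of 4 held for 3000 ns gives (2*1000 + 4*3000) / 4000 = 3.5 over that period.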
* Review feedback (see patch changelogs).
* Renamed the counters and re-ordered some patches.
* Review feedback and rebase.
* Addition of last patch in the series, which supports a customer requirement
to expose instantaneous queue values via the i915 query API.
Tvrtko Ursulin (7):
drm/i915/pmu: Fix enable count array size and bounds checking
drm/i915: Keep a count of requests waiting for a slot on GPU
drm/i915: Keep a count of requests submitted from userspace
drm/i915/pmu: Add queued counter
drm/i915/pmu: Add runnable counter
drm/i915/pmu: Add running counter
drm/i915: Engine queues query
drivers/gpu/drm/i915/i915_pmu.c | 81 +++++++++++++++++++++++++++++----
drivers/gpu/drm/i915/i915_query.c | 43 +++++++++++++++++
drivers/gpu/drm/i915/i915_request.c | 10 ++++
drivers/gpu/drm/i915/intel_engine_cs.c | 6 ++-
drivers/gpu/drm/i915/intel_lrc.c | 1 +
drivers/gpu/drm/i915/intel_ringbuffer.h | 21 ++++++++-
include/uapi/drm/i915_drm.h | 45 +++++++++++++++++-
7 files changed, 194 insertions(+), 13 deletions(-)