[Intel-gfx] [RFC 4/6] drm/i915/pmu: Add queued counter

Chris Wilson chris at chris-wilson.co.uk
Mon Jan 22 18:56:49 UTC 2018


Quoting Tvrtko Ursulin (2018-01-22 18:43:56)
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> We add a PMU counter to expose the number of requests which have been
> submitted from userspace but are not yet runnable due to dependencies and
> unsignaled fences.
> 
> This is useful to analyze the overall load of the system.
> 
> v2:
>  * Rebase for name change and re-order.
>  * Drop floating point constant. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pmu.c         | 40 +++++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
>  include/uapi/drm/i915_drm.h             |  9 +++++++-
>  3 files changed, 45 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index cbfca4a255ab..8eefdf09a30a 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -36,7 +36,8 @@
>  #define ENGINE_SAMPLE_MASK \
>         (BIT(I915_SAMPLE_BUSY) | \
>          BIT(I915_SAMPLE_WAIT) | \
> -        BIT(I915_SAMPLE_SEMA))
> +        BIT(I915_SAMPLE_SEMA) | \
> +        BIT(I915_SAMPLE_QUEUED))
>  
>  #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
>  
> @@ -220,6 +221,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
>  
>                 update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA],
>                               PERIOD, !!(val & RING_WAIT_SEMAPHORE));
> +
> +               if (engine->pmu.enable & BIT(I915_SAMPLE_QUEUED))
> +                       update_sample(&engine->pmu.sample[I915_SAMPLE_QUEUED],
> +                                     I915_SAMPLE_QUEUED_DIVISOR,
> +                                     atomic_read(&engine->request_stats.queued));

engine->request_stats.foo works for me, and reads quite nicely.

> +/* No brackets or quotes below please. */
> +#define I915_SAMPLE_QUEUED_SCALE 0.01

> + /* Divide counter value by divisor to get the real value. */
> +#define I915_SAMPLE_QUEUED_DIVISOR (100)

I'm just thinking of favouring the sampler arithmetic by using 128. As
far as userspace is concerned, the difference is not going to be that
noticeable, even less if you chose 256.
-Chris
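
To illustrate the power-of-two suggestion above, here is a minimal sketch
of the fixed-point arithmetic involved. It assumes the sampler accumulates
"queued * divisor" on every tick and that the reader divides the total back
out; all names below (DIVISOR_DEC, DIVISOR_POW2_SHIFT, accumulate_*,
real_value_*) are hypothetical and not taken from the patch. With a divisor
of 128 both steps reduce to shifts, whereas 100 needs a multiply and a
64-bit division.

```c
#include <stdint.h>

/* Hypothetical scale factors, for illustration only. */
#define DIVISOR_DEC        100u /* decimal divisor, as in the quoted patch */
#define DIVISOR_POW2_SHIFT 7u   /* 1 << 7 == 128, the suggested alternative */

/* Add one sample of 'queued' requests, scaled by the decimal divisor. */
static uint64_t accumulate_dec(uint64_t cur, uint32_t queued)
{
	return cur + (uint64_t)queued * DIVISOR_DEC;
}

/* Same, scaled by 128: the multiply becomes a left shift. */
static uint64_t accumulate_pow2(uint64_t cur, uint32_t queued)
{
	return cur + ((uint64_t)queued << DIVISOR_POW2_SHIFT);
}

/* Recover the unscaled value: a 64-bit division for the decimal divisor. */
static uint64_t real_value_dec(uint64_t counter)
{
	return counter / DIVISOR_DEC;
}

/* Recover the unscaled value: a right shift for the power-of-two divisor. */
static uint64_t real_value_pow2(uint64_t counter)
{
	return counter >> DIVISOR_POW2_SHIFT;
}
```

Either scale gives userspace two or more extra fractional digits of
resolution once the counter is averaged over many samples; the choice only
changes the constant the reader divides by.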

