[Intel-gfx] [PATCH v4 1/3] drm/i915: Get active pending request for given context
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Mar 14 10:39:07 UTC 2019
On 14/03/2019 08:36, Ankit Navik wrote:
> From: Praveen Diwakar <praveen.diwakar at intel.com>
>
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU
>
> V2:
> * Change 64-bit to atomic for request count. (Tvrtko Ursulin)
>
> V3:
> * Remove mutex for request count.
> * Rebase.
> * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)
>
> V4:
> * Rebase.
>
> Cc: Aravindan Muthukumar <aravindan.muthukumar at intel.com>
> Cc: Kedar J Karanje <kedar.j.karanje at intel.com>
> Cc: Yogesh Marathe <yogesh.marathe at intel.com>
> Signed-off-by: Praveen Diwakar <praveen.diwakar at intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik at intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_context.c | 1 +
> drivers/gpu/drm/i915/i915_gem_context.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++
> drivers/gpu/drm/i915/intel_lrc.c | 3 +++
> 4 files changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 280813a..a5876fe 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -453,6 +453,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
> }
>
> trace_i915_context_create(ctx);
> + atomic_set(&ctx->req_cnt, 0);
>
> return ctx;
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index ca150a7..c940168 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -227,6 +227,11 @@ struct i915_gem_context {
> * context close.
> */
> struct list_head handles_list;
> +
> + /** req_cnt: tracks the pending commands, based on which we decide to
> + * go for low/medium/high load configuration of the GPU.
> + */
> + atomic_t req_cnt;
> };
>
> static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 02adcaf..a38963b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2475,6 +2475,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> */
> eb.request->batch = eb.batch;
>
> + atomic_inc(&eb.ctx->req_cnt);
> +
> trace_i915_request_queue(eb.request, eb.batch_flags);
> err = eb_submit(&eb);
> err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 34a0866..d0af37d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -780,6 +780,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>
> last = rq;
> submit = true;
> +
> + if (atomic_read(&rq->gem_context->req_cnt) > 0)
> + atomic_dec(&rq->gem_context->req_cnt);
Every review round I keep pointing out that this is wrong, and you keep
persisting in keeping it like this. :( Would you, in my place, want to
keep reviewing under these circumstances?
As Chris also pointed out, the read followed by the decrement is not
atomic, so in theory it does not even protect against underflow, and, I
repeat yet again, needing such a check at all is a hint that the
placement is unbalanced.
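Purely for illustration, if you really wanted a clamped decrement there
is already an underflow-safe primitive in the kernel - a sketch only,
and it would still leave the unbalanced placement unfixed:

   /*
    * atomic_dec_if_positive() only decrements when the result stays
    * non-negative, so the read-then-decrement race goes away. It does
    * not, however, fix the underlying unbalanced inc/dec.
    */
   atomic_dec_if_positive(&rq->gem_context->req_cnt);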
I pointed you to some of my old patches which do correct accounting of
per-engine queued / runnable / running, to give you an idea of where to
put the inc/dec.
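The key point is that the counter must be balanced: every increment at
the point a request enters a state needs exactly one matching decrement
at the point it leaves that state. Roughly, with hypothetical field
names just to show the shape:

   /* request enters the queued state */
   atomic_inc(&ctx->queued_cnt);

   ...

   /* the same request leaves the queued state, exactly once */
   atomic_dec(&ctx->queued_cnt);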
Chris now also suggests perhaps tying it to timelines (AFAIU), so you
could also look at where requests are added to and removed from the
ce->ring->timeline list.
You say you tried something with my patches but "didn't see much
power benefit with that" - what exactly did you try? And how much is
"not much"?
I am still curious what metric works for this. The one you implement is
something like queued + _sometimes_ runnable - sometimes, because you
may or may not be decrementing runnable depending on ELSP contention.
And the number of ELSP slots may change with GuC and/or Gen11, so I
worry it is far too ill-defined even ignoring the underflow issue.
Regards,
Tvrtko
> }
>
> rb_erase_cached(&p->node, &execlists->queue);
>