[Intel-gfx] [PATCH v3 1/4] drm/i915: Get active pending request for given context

Tue Dec 11 11:58:11 UTC 2018

On 11/12/2018 10:14, Ankit Navik wrote:
> From: Praveen Diwakar <praveen.diwakar at intel.com>
> 
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU
> 
> V2:
>   * Change 64-bit to atomic for request count. (Tvrtko Ursulin)
> 
> V3:
>   * Remove mutex for request count.
>   * Rebase.
>   * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)
> 
> Cc: Aravindan Muthukumar <aravindan.muthukumar at intel.com>
> Cc: Kedar J Karanje <kedar.j.karanje at intel.com>
> Cc: Yogesh Marathe <yogesh.marathe at intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>

No, I did not tag this with r-b and you are not allowed to do this!!

> Signed-off-by: Praveen Diwakar <praveen.diwakar at intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik at intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c | 1 +
>   drivers/gpu/drm/i915/i915_gem_context.h | 5 +++++
>   drivers/gpu/drm/i915/i915_request.c     | 2 ++
>   drivers/gpu/drm/i915/intel_lrc.c        | 2 ++
>   4 files changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index b10770c..0bcbe32 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   	}
>   
>   	trace_i915_context_create(ctx);
> +	atomic_set(&ctx->req_cnt, 0);
>   
>   	return ctx;
>   }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index b116e49..e824b15 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -194,6 +194,11 @@ struct i915_gem_context {
>   	 * context close.
>   	 */
>   	struct list_head handles_list;
> +
> +	/** req_cnt: tracks the pending commands, based on which we decide to
> +	 * go for low/medium/high load configuration of the GPU.
> +	 */
> +	atomic_t req_cnt;
>   };
>   
>   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 5c2c93c..b90795a 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1113,6 +1113,8 @@ void i915_request_add(struct i915_request *request)
>   	}
>   	request->emitted_jiffies = jiffies;
>   
> +	atomic_inc(&request->gem_context->req_cnt);
> +
>   	/*
>   	 * Let the backend know a new request has arrived that may need
>   	 * to adjust the existing execution schedule due to a high priority
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 1744792..d33f5ac 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1127,6 +1127,8 @@ static void execlists_submit_request(struct i915_request *request)
>   	submit_queue(engine, rq_prio(request));
>   
>   	spin_unlock_irqrestore(&engine->timeline.lock, flags);
> +
> +	atomic_dec(&request->gem_context->req_cnt);
>   }
>   
>   static struct i915_request *sched_to_request(struct i915_sched_node *node)
> 

With the accounting placed like this you are only counting requests
which are not yet runnable (due to fences and implicit dependencies).
If, on the contrary, everything is runnable and a lot of it is waiting
for the GPU to execute it, this counter will show zero, and you will
decide to run in a reduced slice/EU configuration. There have to be
some benchmarks which show the adverse effect of this; you just haven't
found them yet, I guess.
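
To illustrate the point about placement, here is a minimal userspace
sketch, not i915 code. All names (model_ctx, model_request_add() and so
on) are hypothetical, C11 atomics stand in for the kernel's atomic_t,
and the second counter, decremented at retirement, is only an assumed
point of comparison rather than something proposed in this thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of one context; not the real struct i915_gem_context. */
struct model_ctx {
	atomic_int req_cnt;	/* as in the patch: inc on add, dec on submit */
	atomic_int inflight;	/* assumption for comparison: dec on retire */
};

static void model_ctx_init(struct model_ctx *ctx)
{
	atomic_init(&ctx->req_cnt, 0);
	atomic_init(&ctx->inflight, 0);
}

/* Models i915_request_add(): the request is queued but may wait on fences. */
static void model_request_add(struct model_ctx *ctx)
{
	atomic_fetch_add(&ctx->req_cnt, 1);
	atomic_fetch_add(&ctx->inflight, 1);
}

/* Models execlists_submit_request(): dependencies are resolved. */
static void model_request_submit(struct model_ctx *ctx)
{
	int prev = atomic_fetch_sub(&ctx->req_cnt, 1);

	assert(prev > 0);	/* catch underflow in the model */
	(void)prev;
}

/* Models request retirement: the GPU has finished executing the request. */
static void model_request_retire(struct model_ctx *ctx)
{
	int prev = atomic_fetch_sub(&ctx->inflight, 1);

	assert(prev > 0);	/* catch underflow in the model */
	(void)prev;
}
```

If eight requests with no outstanding dependencies are added and each
is submitted straight away, req_cnt reads zero while all eight are
still waiting for the GPU, whereas the comparison counter stays at
eight until requests retire.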

Regards,

Tvrtko