[Intel-gfx] [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Jul 18 12:26:08 UTC 2022


On 13/07/2022 00:31, John.C.Harrison at Intel.com wrote:
> From: Matthew Brost <matthew.brost at intel.com>
> 
> The engine registers really shouldn't be touched during GuC submission
> as the GuC owns the registers. Don't call ring_is_idle and tie

Touch being just read and it is somehow harmful?

> intel_engine_is_idle strictly to the engine pm.

Strictly seems wrong - it is just ring_is_idle check that is replaced 
and not the whole implementation of intel_engine_is_idle.

> Because intel_engine_is_idle tied to the engine pm, retire requests
> before checking intel_engines_are_idle in gt_drop_caches, and lastly
Why is re-ordering important? I at least can't understand it. I hope 
it's not working around IGT failures.

> increase the timeout in gt_drop_caches for the intel_engines_are_idle
> check.

Same here - why?

Regards,

Tvrtko

> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++
>   drivers/gpu/drm/i915/i915_debugfs.c       |  6 +++---
>   drivers/gpu/drm/i915/i915_drv.h           |  2 +-
>   3 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 283870c659911..959a7c92e8f4d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
>   {
>   	bool idle = true;
>   
> +	/* GuC submission shouldn't access HEAD & TAIL via MMIO */
> +	GEM_BUG_ON(intel_engine_uses_guc(engine));
> +
>   	if (I915_SELFTEST_ONLY(!engine->mmio_base))
>   		return true;
>   
> @@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>   	if (!i915_sched_engine_is_empty(engine->sched_engine))
>   		return false;
>   
> +	/*
> +	 * We shouldn't touch engine registers with GuC submission as the GuC
> +	 * owns the registers. Let's tie the idle to engine pm, at worst this
> +	 * function sometimes will falsely report non-idle when idle during the
> +	 * delay to retire requests or with virtual engines and a request
> +	 * running on another instance within the same class / submit mask.
> +	 */
> +	if (intel_engine_uses_guc(engine))
> +		return false;
> +
>   	/* Ring stopped? */
>   	return ring_is_idle(engine);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 94e5c29d2ee3a..ee5334840e9cb 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
>   {
>   	int ret;
>   
> +	if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE)
> +		intel_gt_retire_requests(gt);
> +
>   	if (val & DROP_RESET_ACTIVE &&
>   	    wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT))
>   		intel_gt_set_wedged(gt);
>   
> -	if (val & DROP_RETIRE)
> -		intel_gt_retire_requests(gt);
> -
>   	if (val & (DROP_IDLE | DROP_ACTIVE)) {
>   		ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
>   		if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c22f29c3faa0e..53c7474dde495 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -278,7 +278,7 @@ struct i915_gem_mm {
>   	u32 shrink_count;
>   };
>   
> -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
> +#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */
>   
>   unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915,
>   					 u64 context);


More information about the dri-devel mailing list