[Intel-gfx] [PATCH 02/17] drm/i915/ringbuffer: Brute force context restore

Mon Jun 11 10:07:45 UTC 2018

Chris Wilson <chris at chris-wilson.co.uk> writes:

> An issue encountered with switching mm on gen7 is that the GPU likes to
> hang (with the VS unit busy) when told to force restore the current
> context. We can simply work around this by substituting the
> MI_FORCE_RESTORE flag with a round-trip through the kernel_context,
> forcing the context to be saved and restored; thereby reloading the
> PP_DIR registers and updating the modified page directory!
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Cc: Matthew Auld <matthew.william.auld at gmail.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 31 ++++++++++++++++++++++---
>  1 file changed, 28 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 65811e2fa7da..6bfa6030198d 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>  		(HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
>  		INTEL_INFO(i915)->num_rings - 1 :
>  		0;
> +	bool force_restore = false;
>  	int len;
>  	u32 *cs;
>  
> @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>  	len = 4;
>  	if (IS_GEN7(i915))
>  		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
> +	if (flags & MI_FORCE_RESTORE) {
> +		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
> +		flags &= ~MI_FORCE_RESTORE;
> +		force_restore = true;
> +		len += 2;
> +	}
>  
>  	cs = intel_ring_begin(rq, len);
>  	if (IS_ERR(cs))
> @@ -1495,6 +1502,21 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>  		}
>  	}
>  
> +	if (force_restore) {
> +		/*
> +		 * The HW doesn't handle being told to restore the current
> +		 * context very well. Quite often it likes to go off and
> +		 * sulk, especially when it is meant to be reloading PP_DIR.
> +		 * A very simple fix to force the reload is to simply switch
> +		 * away from the current context and back again.
> +		 */
> +		*cs++ = MI_SET_CONTEXT;
> +		*cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context,
> +							  engine)->state) |
> +			MI_MM_SPACE_GTT |
> +			MI_RESTORE_INHIBIT;

The above comment could be more verbose about the INHIBIT flag,
as we discussed on IRC. We don't actually restore the kernel
context image state, and we will trample it with the current ctx.

But as we never run anything through kernel_context,
this should be fine, and creating another context just
for switching through seems overkill.
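
To spell out the ordering being discussed, here is a rough standalone
sketch of the emitted sequence. It is not the driver code: the SKETCH_*
encodings are placeholder values chosen for illustration, and
sketch_emit_set_context() is a hypothetical helper that writes into a
plain array instead of a ring obtained from intel_ring_begin().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder encodings for illustration only; not taken from the hardware docs. */
#define SKETCH_MI_NOOP            0x00000000u
#define SKETCH_MI_SET_CONTEXT     0x0c000000u
#define SKETCH_MI_RESTORE_INHIBIT (1u << 0)
#define SKETCH_MI_FORCE_RESTORE   (1u << 1)
#define SKETCH_MI_MM_SPACE_GTT    (1u << 8)

/*
 * Hypothetical helper: write the context-switch dwords into cs and
 * return how many were written. kernel_state and target_state stand in
 * for the GGTT offsets of the two context images.
 */
static size_t sketch_emit_set_context(uint32_t *cs,
				      uint32_t kernel_state,
				      uint32_t target_state,
				      uint32_t flags)
{
	uint32_t *start = cs;

	if (flags & SKETCH_MI_FORCE_RESTORE) {
		/* Forcing a restore while inhibiting it makes no sense. */
		assert(!(flags & SKETCH_MI_RESTORE_INHIBIT));
		flags &= ~SKETCH_MI_FORCE_RESTORE;

		/*
		 * Switch away first. With RESTORE_INHIBIT the kernel
		 * context image is not loaded, so the state left in the
		 * HW is still that of the context we just left.
		 */
		*cs++ = SKETCH_MI_SET_CONTEXT;
		*cs++ = kernel_state |
			SKETCH_MI_MM_SPACE_GTT |
			SKETCH_MI_RESTORE_INHIBIT;
	}

	/* Then switch (back) to the target, which is now a real switch. */
	*cs++ = SKETCH_MI_NOOP;
	*cs++ = SKETCH_MI_SET_CONTEXT;
	*cs++ = target_state | flags;

	return (size_t)(cs - start);
}
```

With the force flag set, the sketch emits two extra dwords ahead of the
usual three, which matches the "len += 2" in the patch.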

Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

> +	}
> +
>  	*cs++ = MI_NOOP;
>  	*cs++ = MI_SET_CONTEXT;
>  	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;
> @@ -1585,11 +1607,14 @@ static int switch_context(struct i915_request *rq)
>  
>  		to_mm->pd_dirty_rings &= ~intel_engine_flag(engine);
>  		engine->legacy_active_ppgtt = to_mm;
> -		hw_flags = MI_FORCE_RESTORE;
> +
> +		if (to_ctx == from_ctx) {
> +			hw_flags = MI_FORCE_RESTORE;
> +			from_ctx = NULL;
> +		}
>  	}
>  
> -	if (rq->hw_context->state &&
> -	    (to_ctx != from_ctx || hw_flags & MI_FORCE_RESTORE)) {
> +	if (rq->hw_context->state && to_ctx != from_ctx) {
>  		GEM_BUG_ON(engine->id != RCS);
>  
>  		/*
> -- 
> 2.17.1