[Intel-gfx] [PATCH 02/17] drm/i915/ringbuffer: Brute force context restore

Mon Jun 11 10:00:31 UTC 2018

On 10/06/2018 20:43, Chris Wilson wrote:
> An issue encountered with switching mm on gen7 is that the GPU likes to
> hang (with the VS unit busy) when told to force restore the current
> context. We can simply workaround this by substituting the
> MI_FORCE_RESTORE flag with a round-trip through the kernel_context,
> forcing the context to be saved and restored; thereby reloading the
> PP_DIR registers and updating the modified page directory!
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Cc: Matthew Auld <matthew.william.auld at gmail.com>
> ---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 31 ++++++++++++++++++++++---
>   1 file changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 65811e2fa7da..6bfa6030198d 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1458,6 +1458,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>   		(HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
>   		INTEL_INFO(i915)->num_rings - 1 :
>   		0;
> +	bool force_restore = false;
>   	int len;
>   	u32 *cs;
>   
> @@ -1471,6 +1472,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>   	len = 4;
>   	if (IS_GEN7(i915))
>   		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
> +	if (flags & MI_FORCE_RESTORE) {
> +		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
> +		flags &= ~MI_FORCE_RESTORE;
> +		force_restore = true;
> +		len += 2;
> +	}
>   
>   	cs = intel_ring_begin(rq, len);
>   	if (IS_ERR(cs))
> @@ -1495,6 +1502,21 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>   		}
>   	}
>   
> +	if (force_restore) {
> +		/*
> +		 * The HW doesn't handle being told to restore the current
> +		 * context very well. Quite often it likes goes to go off and
> +		 * sulk, especially when it is meant to be reloading PP_DIR.
> +		 * A very simple fix to force the reload is to simply switch
> +		 * away from the current context and back again.
> +		 */
> +		*cs++ = MI_SET_CONTEXT;
> +		*cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context,
> +							  engine)->state) |
> +			MI_MM_SPACE_GTT |
> +			MI_RESTORE_INHIBIT;
> +	}
> +
>   	*cs++ = MI_NOOP;
>   	*cs++ = MI_SET_CONTEXT;
>   	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;
> @@ -1585,11 +1607,14 @@ static int switch_context(struct i915_request *rq)
>   
>   		to_mm->pd_dirty_rings &= ~intel_engine_flag(engine);
>   		engine->legacy_active_ppgtt = to_mm;
> -		hw_flags = MI_FORCE_RESTORE;
> +
> +		if (to_ctx == from_ctx) {

Contexts can be the same here , when the parent condition is "if (to_mm 
!= from_mm || to_mm ...)" ?

> +			hw_flags = MI_FORCE_RESTORE;
> +			from_ctx = NULL;

Now on the error path we can end up with engine->legacy_active_context 
== NULL, but commands to switch have been put to the ring. Is that OK?

> +		}
>   	}
>   
> -	if (rq->hw_context->state &&
> -	    (to_ctx != from_ctx || hw_flags & MI_FORCE_RESTORE)) {
> +	if (rq->hw_context->state && to_ctx != from_ctx) {
>   		GEM_BUG_ON(engine->id != RCS);
>   
>   		/*
> 

Regards,

Tvrtko