[Intel-gfx] [PATCH] drm/i915/ringbuffer: Move double invalidate to after pd flush
Mika Kuoppala
mika.kuoppala at linux.intel.com
Tue Sep 4 13:22:17 UTC 2018
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Continuing the fun of trying to find exactly the delay that is
> sufficient to ensure that the page directory is fully loaded between
> context switches, move the extra flush added in commit 70b73f9ac113
> ("drm/i915/ringbuffer: Delay after invalidating gen6+ xcs") to just
> after we flush the pd. Entirely based on empirical data from running
> the failing tests in a loop until we survive a day (previously the
> mtbf was 10-30 minutes).
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107769
> References: 70b73f9ac113 ("drm/i915/ringbuffer: Delay after invalidating gen6+ xcs")
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/intel_ringbuffer.c | 40 ++++++++++++++-----------
> 1 file changed, 22 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 86604dd1c5a5..472939f5c18f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1707,9 +1707,29 @@ static int switch_context(struct i915_request *rq)
> }
>
> if (ppgtt) {
> + ret = engine->emit_flush(rq, EMIT_INVALIDATE);
> + if (ret)
> + goto err_mm;
> +
> ret = flush_pd_dir(rq);
> if (ret)
> goto err_mm;
> +
> + /*
> + * Not only do we need a full barrier (post-sync write) after
> + * invalidating the TLBs, but we need to wait a little bit
> + * longer. Whether this is merely delaying us, or the
> + * subsequent flush is a key part of serialising with the
> + * post-sync op, this extra pass appears vital before a
> + * mm switch!
> + */
> + ret = engine->emit_flush(rq, EMIT_INVALIDATE);
> + if (ret)
> + goto err_mm;
> +
> + ret = engine->emit_flush(rq, EMIT_FLUSH);
> + if (ret)
> + goto err_mm;
Someone said the proof is in the pudding. It would just be
more fun if someone showed us the recipe.
Acked-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> }
>
> if (ctx->remap_slice) {
> @@ -1947,7 +1967,7 @@ static void gen6_bsd_submit_request(struct i915_request *request)
> intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> }
>
> -static int emit_mi_flush_dw(struct i915_request *rq, u32 flags)
> +static int mi_flush_dw(struct i915_request *rq, u32 flags)
> {
> u32 cmd, *cs;
>
> @@ -1985,23 +2005,7 @@ static int emit_mi_flush_dw(struct i915_request *rq, u32 flags)
>
> static int gen6_flush_dw(struct i915_request *rq, u32 mode, u32 invflags)
> {
> - int err;
> -
> - /*
> - * Not only do we need a full barrier (post-sync write) after
> - * invalidating the TLBs, but we need to wait a little bit
> - * longer. Whether this is merely delaying us, or the
> - * subsequent flush is a key part of serialising with the
> - * post-sync op, this extra pass appears vital before a
> - * mm switch!
> - */
> - if (mode & EMIT_INVALIDATE) {
> - err = emit_mi_flush_dw(rq, invflags);
> - if (err)
> - return err;
> - }
> -
> - return emit_mi_flush_dw(rq, 0);
> + return mi_flush_dw(rq, mode & EMIT_INVALIDATE ? invflags : 0);
> }
>
> static int gen6_bsd_ring_flush(struct i915_request *rq, u32 mode)
> --
> 2.19.0.rc1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx