[Intel-gfx] [PATCH 1/6] drm/i915: move the pre_pin earlier

Thomas Hellström thomas.hellstrom at linux.intel.com
Wed Nov 17 11:29:54 UTC 2021


On Fri, 2021-11-12 at 15:32 +0000, Matthew Auld wrote:
> In intel_context_do_pin_ww, when calling into the pre_pin hook(which
> is
> passed the ww context) it could in theory return -EDEADLK(which is
> very
> likely with debug kernels), once we start adding more ww locking in
> there,
> like in the next patch. If so then we need to be mindful of having to
> restart the do_pin at this point.
> 
> If this is the kernel_context, or some other early in-kernel context
> where we have yet to setup the default_state, then we always inhibit
> the
> context restore, and instead rely on the delayed active_release to
> set
> the CONTEXT_VALID_BIT for us(if we even care), which should indicate
> that we have context switched away, and that our newly saved context
> state should now be valid. However, since we currently grab the
> active
> reference before the potential ww dance, we can end up setting the
> CONTEXT_VALID_BIT much too early, if we need to backoff, and then
> upon
> re-trying the do_pin, we could potentially cause the hardware to
> incorrectly load some garbage context state when later context
> switching
> to that context, but at the very least this will trigger the
> GEM_BUG_ON() in __engine_unpark. For now let's just move any ww dance
> stuff prior to arming the active reference.
> 
> For normal user contexts this shouldn't be a concern, since we should
> already have the default_state ready when initialising the lrc state,
> and so there should be no concern with active_release somehow
> prematurely setting the CONTEXT_VALID_BIT.
> 
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_context.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c
> b/drivers/gpu/drm/i915/gt/intel_context.c
> index 5634d14052bc..ad44860faaf3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -228,17 +228,17 @@ int __intel_context_do_pin_ww(struct
> intel_context *ce,
>         if (err)
>                 return err;
>  
> -       err = i915_active_acquire(&ce->active);
> +       err = ce->ops->pre_pin(ce, ww, &vaddr);
>         if (err)
>                 goto err_ctx_unpin;
>  
> -       err = ce->ops->pre_pin(ce, ww, &vaddr);
> +       err = i915_active_acquire(&ce->active);
>         if (err)
> -               goto err_release;
> +               goto err_post_unpin;

Hmm, If i915_active_acquire() fails, wouldn't we end up calling
i915_active_release() here?

Thanks,
Thomas




More information about the Intel-gfx mailing list