[Intel-gfx] [PATCH v3 2/3] drm/i915/guc: Close deregister-context race against CT-loss

Teres Alexis, Alan Previn alan.previn.teres.alexis at intel.com
Tue Sep 26 18:21:11 UTC 2023


On Thu, 2023-09-14 at 11:34 -0400, Vivi, Rodrigo wrote:
> On Sat, Sep 09, 2023 at 08:58:45PM -0700, Alan Previn wrote:
alan:snip
> 
> > +	/* Change context state to destroyed and get gt-pm */
> > +	__intel_gt_pm_get(gt);
> > +	set_context_destroyed(ce);
> > +	clr_context_registered(ce);
> > +
> > +	ret = deregister_context(ce, ce->guc_id.id);
> > +	if (ret) {
> > +		/* Undo the state change and put gt-pm if that failed */
> > +		set_context_registered(ce);
> > +		clr_context_destroyed(ce);
> > +		intel_gt_pm_put(gt);
> 
> This is a might_sleep inside a spin_lock! Are you 100% confident no WARN
> was seeing during the tests indicated in the commit msg?
> 
> > +	}
> > +	spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > +
> > +	return 0;
> 
> If you are always returning 0, there's no pointing in s/void/int...
alan: Hi Rodrigo - i actually forget that i need the error value returned
so that _guc_context_destroy can put the context back into the
guc->submission_state.destroyed_contexts linked list. So i clearly missed returning
the error in the case deregister_context fails.
> 
> >  }
> >  
> >  static void __guc_context_destroy(struct intel_context *ce)
> > @@ -3268,7 +3296,22 @@ static void deregister_destroyed_contexts(struct intel_guc *guc)
> >  		if (!ce)
> >  			break;
> >  
> > -		guc_lrc_desc_unpin(ce);
> > +		if (guc_lrc_desc_unpin(ce)) {
> > +			/*
> > +			 * This means GuC's CT link severed mid-way which could happen
> > +			 * in suspend-resume corner cases. In this case, put the
> > +			 * context back into the destroyed_contexts list which will
> > +			 * get picked up on the next context deregistration event or
> > +			 * purged in a GuC sanitization event (reset/unload/wedged/...).
> > +			 */
> > +			spin_lock_irqsave(&guc->submission_state.lock, flags);
> > +			list_add_tail(&ce->destroyed_link,
> > +				      &guc->submission_state.destroyed_contexts);
> > +			spin_unlock_irqrestore(&guc->submission_state.lock, flags);
alan:snip


More information about the Intel-gfx mailing list