[Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Close deregister-context race against CT-loss
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Tue Aug 15 19:08:15 UTC 2023
On Tue, 2023-08-15 at 09:56 -0400, Vivi, Rodrigo wrote:
> On Mon, Aug 14, 2023 at 06:12:09PM -0700, Alan Previn wrote:
> > If we are at the end of suspend or very early in resume
> > its possible an async fence signal could lead us to the
> > execution of the context destruction worker (after the
> > prior worker flush).
[snip]
>
> > @@ -3199,10 +3206,20 @@ static inline void guc_lrc_desc_unpin(struct intel_context *ce)
> > if (unlikely(disabled)) {
> > release_guc_id(guc, ce);
> > __guc_context_destroy(ce);
> > - return;
> > + return 0;
> > + }
> > +
> > + if (deregister_context(ce, ce->guc_id.id)) {
> > + /* Seal race with concurrent suspend by unrolling */
> > + spin_lock_irqsave(&ce->guc_state.lock, flags);
> > + set_context_registered(ce);
> > + clr_context_destroyed(ce);
> > + intel_gt_pm_put(gt);
>
> how sure we are this is not calling unbalanced puts?
alan: in this function (guc_lrc_desc_unpin), the summarized flow is:
check-status-stuff
if guc-is-not-disabled, take-pm, change ctx-state-bits
[implied else] if guc-is-disabled, scub-ctx and return
thus derigster_context is only called if we didnt exit, i.e. when guc-is-not-disabled, i.e. after the pm was taken.
> could we wrap this in some existent function to make this clear?
alan: yeah - not so readible as it now - let me tweak this function and make it cleaner
>
> > + spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > + return -ENODEV;
> > }
> >
> > - deregister_context(ce, ce->guc_id.id);
> > + return 0;
> > }
More information about the Intel-gfx
mailing list