[PATCH v5 5/6] drm/i915/pxp: Trigger the global teardown for before suspending
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Thu Jan 19 22:31:21 UTC 2023
Thanks for reviewing - responses below.
On Thu, 2023-01-19 at 14:35 -0500, Vivi, Rodrigo wrote:
> On Thu, Jan 12, 2023 at 05:18:49PM -0800, Alan Previn wrote:
> > A driver bug was recently discovered where the security firmware was
> > receiving internal HW signals indicating that session key expirations
> > had occurred. Architecturally, the firmware was expecting a response
> > from the GuC to acknowledge the event with the firmware side.
> > However the OS was in a suspended state and GuC had been reset.
> >
> > Internal specifications actually required the driver to ensure
> > that all active sessions be properly cleaned up in such cases where
> > the system is suspended and the GuC potentially unable to respond.
> >
> > This patch adds the global teardown code in i915's suspend_prepare
> > code path.
> >
> > Signed-off-by: Alan Previn <alan.previn.teres.alexis at intel.com>
> > Reviewed-by: Juston Li <justonli at chromium.org>
> >
Alan: [snip]
> >
> > +static int __pxp_global_teardown_locked(struct intel_pxp *pxp, bool terminate_for_cleanup)
> > +{
> > + if (terminate_for_cleanup) {
> > + if (!pxp->arb_is_valid)
> > + return 0;
> > + /*
> > + * To ensure synchronous and coherent session teardown completion
> > + * in response to suspend or shutdown triggers, don't use a worker.
> > + */
> > + intel_pxp_mark_termination_in_progress(pxp);
> > + intel_pxp_terminate(pxp, false);
> > + } else {
> > + if (pxp->arb_is_valid)
> > + return 0;
> > + /*
> > + * If we are not in final termination, and the arb-session is currently
> > + * inactive, we are doing a reset and restart due to some runtime event.
> > + * Use the worker that was designed for this.
> > + */
> > + pxp_queue_termination(pxp);
> > + }
>
> I really don't see why you need 1 function for totally 2 different cases.
> Why not 2 functions then?
>
Alan: I don't see why not ;) My goal with the above method was to concentrate the teardown steps in a single function, so that if future changes are required we can keep them behind a single entry point. For now I will assume that was a nack, so I shall split it in the next rev.
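A minimal user-space sketch of what such a split could look like, for illustration only. Everything here is a hypothetical stand-in: `struct pxp_mock`, its fields, and the `pxp_mock_*` names are not i915 code, and the completion wait is replaced by a simple flag check rather than wait_for_completion_timeout(). The two entry points share only the final wait helper.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct intel_pxp; not the real driver struct. */
struct pxp_mock {
	bool arb_is_valid;
	bool termination_done;	/* stands in for the 'termination' completion */
	int sync_terminated;	/* number of synchronous teardowns performed */
	int queued;		/* number of teardowns handed to the "worker" */
};

/* Shared tail of both paths; the real code waits with a 250 ms timeout. */
static int pxp_mock_wait_termination(const struct pxp_mock *pxp)
{
	return pxp->termination_done ? 0 : -ETIMEDOUT;
}

/* Final teardown (suspend/shutdown): synchronous, no worker, no restart. */
static int pxp_mock_teardown_final(struct pxp_mock *pxp)
{
	if (!pxp->arb_is_valid)
		return 0;

	pxp->arb_is_valid = false;
	pxp->sync_terminated++;
	pxp->termination_done = true;	/* mock: termination completes at once */

	return pxp_mock_wait_termination(pxp);
}

/* Runtime reset and restart: only when the arb session is inactive. */
static int pxp_mock_teardown_restart(struct pxp_mock *pxp)
{
	if (pxp->arb_is_valid)
		return 0;

	pxp->queued++;
	pxp->termination_done = true;	/* mock: the "worker" finishes at once */

	return pxp_mock_wait_termination(pxp);
}
```

With this shape each caller picks the path by name instead of by a bool argument, and any future change to the wait stays in one helper.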
> > +
> > + if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(250)))
> > + return -ETIMEDOUT;
> > +
> > + return 0;
> > +}
> > +
> >
Alan: [snip]
> > diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h
> > index 9658d3005222..3ded0890cd27 100644
> > --- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
> > +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
> > @@ -27,6 +27,7 @@ void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp);
> > void intel_pxp_tee_end_arb_fw_session(struct intel_pxp *pxp, u32 arb_session_id);
> >
> > int intel_pxp_start(struct intel_pxp *pxp);
> > +void intel_pxp_end(struct intel_pxp *pxp);
> >
> > int intel_pxp_key_check(struct intel_pxp *pxp,
> > struct drm_i915_gem_object *obj,
> > diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
> > index 892d39cc61c1..e427464aa131 100644
> > --- a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
> > +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
> > @@ -16,7 +16,7 @@ void intel_pxp_suspend_prepare(struct intel_pxp *pxp)
> > if (!intel_pxp_is_enabled(pxp))
> > return;
> >
> > - pxp->arb_is_valid = false;
> > + intel_pxp_end(pxp);
> >
> > intel_pxp_invalidate(pxp);
> > }
> > diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
> > index 74ed7e16e481..d8278c4002e3 100644
> > --- a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
> > +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
> > @@ -115,11 +115,14 @@ static int pxp_terminate_arb_session_and_global(struct intel_pxp *pxp)
> > return ret;
> > }
> >
> > -static void pxp_terminate(struct intel_pxp *pxp)
> > +void intel_pxp_terminate(struct intel_pxp *pxp, bool restart_arb)
> > {
> > int ret;
> >
> > - pxp->hw_state_invalidated = true;
> > + if (restart_arb)
> > + pxp->hw_state_invalidated = true;
> > + else
> > + pxp->hw_state_invalidated = false;
>
> o.O
>
> pxp->hw_state_invalidate = restart_arb;
Alan: duhhhh... (my bad)
>
> ?
>
> or even a better name for the restart_arb to already indicate that is
> the hw_state_invalidate ?
>
Alan: hmmm... you mean something like:
hw_state_invalidated = post_invalidation_needs_restart;
Alan: actually I wish we could redo "hw_state_invalidated", which is currently defined
as a boolean that only means one thing -> teardown and restart. It would be more scalable
if we could replace it with a bitmask of "current + (inferred) pending state", with a documented
state machine and a fixed set of state-transition paths.
INACTIVE ----> STARTING ----> ACTIVE ----> TEARDOWN_RESTART ---->|
   ^              ^             |                                |
   |              |             |                                V
   |              |<------------)---------------<----------------|
   |                            |
   |                            |-----> TEARDOWN_END ---->--|
   |                                                        V
   |<-----------------<-----------------<-------------------|
However, I didn't do this initially because it would mean a wider set of changes that might
take more time to test and review (with downstream customer impact), all for a state machine
of only 5 states of which only 2 are affected by this change. For now I shall go with the
simpler name change as you hint above, unless you request this instead.
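For reference, a rough sketch of how such a fixed-transition state machine could be encoded. This is not proposed driver code: the `pxp_state` enum, the `pxp_allowed` table, and `pxp_state_transition()` are hypothetical names, and the set of allowed transitions is only my reading of the diagram above (INACTIVE -> STARTING -> ACTIVE, then either TEARDOWN_RESTART -> STARTING or TEARDOWN_END -> INACTIVE).

```c
#include <stdbool.h>

/* Hypothetical state names taken from the diagram; not existing i915 symbols. */
enum pxp_state {
	PXP_INACTIVE,
	PXP_STARTING,
	PXP_ACTIVE,
	PXP_TEARDOWN_RESTART,
	PXP_TEARDOWN_END,
	PXP_STATE_COUNT
};

/* pxp_allowed[from] is a bitmask of the states reachable from 'from'. */
static const unsigned int pxp_allowed[PXP_STATE_COUNT] = {
	[PXP_INACTIVE]         = 1u << PXP_STARTING,
	[PXP_STARTING]         = 1u << PXP_ACTIVE,
	[PXP_ACTIVE]           = (1u << PXP_TEARDOWN_RESTART) |
				 (1u << PXP_TEARDOWN_END),
	[PXP_TEARDOWN_RESTART] = 1u << PXP_STARTING,
	[PXP_TEARDOWN_END]     = 1u << PXP_INACTIVE,
};

/* Move to 'next' only if the table allows it; otherwise leave *cur untouched. */
static bool pxp_state_transition(enum pxp_state *cur, enum pxp_state next)
{
	if ((unsigned int)*cur >= PXP_STATE_COUNT ||
	    (unsigned int)next >= PXP_STATE_COUNT)
		return false;

	if (!(pxp_allowed[*cur] & (1u << next)))
		return false;

	*cur = next;
	return true;
}
```

Keeping the legal paths in one table means a new state only needs a new row, and any illegal transition is rejected in one place.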
Alan: [snip]