[Intel-xe] [PATCH 2/2] RFC drm/xe: Disable GuC CT communication for D3Hot Transition

Thu Sep 21 18:37:05 UTC 2023

On Thu, Aug 31, 2023 at 03:21:18PM -0400, Rodrigo Vivi wrote:
> On Thu, Aug 31, 2023 at 12:02:31PM -0700, Ceraolo Spurio, Daniele wrote:
> > 
> > 
> > On 8/22/2023 10:09 PM, Riana Tauro wrote:
> > > During Runtime suspend, GuC is reset for both D0->D3hot/D3Cold
> > > transistions. It is not necessary for GuC to reset for D0 -> D3hot,
> > > only enable/disable ctb communication.
> > > 
> > > Add a function that enables/disables CT communication when d3cold
> > > is not allowed.
> > > 
> > > Signed-off-by: Riana Tauro <riana.tauro at intel.com>
> > 
> > xe_gt_suspend and xe_gt_resume do more things than just resetting the GuC
> > (e.g. marking submission as disabled). Shouldn't we need at least some of
> > that in the runtime suspend/resume scenario as well?
> 
> I was asking myself the same thing...

I take it back. on the current flow, if we have a chance of receiving a command
submission we shouldn't allow the runtime suspend. so it should be totally safe
to leave it there.

Given the latency data Riana shared with me offline we can see around 60ms of
latency saved here.

I was also wondering about the overall package power savings difference here.
Did you checked that?

Only thing we need to change is that now we have the pmu in the suspend/resume
path and we probably want that on runtime variants as well.

Anyway, let's move forward with this.

Acked-by: Rodrigo Vivi <rodrigo.vivi at intel.com>

> 
> > 
> > Also, which paths and how much we're actually looking to optimize? Are we
> > interested in optimizing the full suspend as well?
> > To elaborate, we're not actually killing the GuC in the suspend flow, we're
> > just resetting its SW state (i.e. putting it back as if it had just been
> > loaded); the actual reset happens in the resume path in xe_uc_init_hw. If
> > the GuC and the LMEM have survived the suspend/resume flow, we could
> 
> On S3/S2idle/S4, GuC and LMEM will lose power.
> 
> > theoretically just restart the SW side of things (re-register the CTBs and
> > start SLPC) in the resume path instead of resetting and reloading the FW;
> > this would be a lesser benefit to the runtime path compared to what you're
> > proposing, but it could benefit the full resume path as well and it'd have
> > the added benefit of keeping the 2 paths the same. I'm not sure though if
> > the LMEM management could end up breaking this approach by moving the memory
> > around during suspend.
> 
> this memory movement is indeed the hardest part with all the locking...
> 
> > 
> > 
> > A couple of minor comments inline
> > 
> > > ---
> > >   drivers/gpu/drm/xe/xe_gt.c | 56 ++++++++++++++++++++++++++++++++++++++
> > >   drivers/gpu/drm/xe/xe_gt.h |  2 ++
> > >   drivers/gpu/drm/xe/xe_pm.c | 10 +++++--
> > >   drivers/gpu/drm/xe/xe_uc.c | 18 ++++++++++++
> > >   drivers/gpu/drm/xe/xe_uc.h |  2 ++
> > >   5 files changed, 85 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > > index 13320af4ddd3..0d52621ce64d 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > > @@ -676,6 +676,62 @@ int xe_gt_resume(struct xe_gt *gt)
> > >   	return err;
> > >   }
> > > +int xe_gt_runtime_suspend(struct xe_gt *gt)
> > > +{
> > > +	struct xe_device *xe = gt_to_xe(gt);
> > > +	int err;
> > > +
> > > +	if (xe->d3cold.allowed)
> > > +		return xe_gt_suspend(gt);
> > > +
> > > +	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
> > > +	if (err)
> > > +		return err;
> > > +
> > > +	err = xe_uc_disable_communication(&gt->uc);
> > > +	if (err)
> > > +		goto err_force_wake;
> > 
> > uc_stop() might already cover what's needed (or could be expanded to do so),
> > although unfortunately uc_start seems to not be matching and therefore not
> > usable as-is for the resume side.
> > 
> > > +
> > > +	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
> > > +	xe_gt_info(gt, "suspended\n");
> > > +
> > > +	return 0;
> > > +
> > > +err_force_wake:
> > > +	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
> > > +	xe_gt_err(gt, "suspend failed (%pe)\n", ERR_PTR(err));
> > > +
> > > +	return err;
> > > +}
> > > +
> > > +int xe_gt_runtime_resume(struct xe_gt *gt)
> > > +{
> > > +	struct xe_device *xe = gt_to_xe(gt);
> > > +	int err;
> > > +
> > > +	if (xe->d3cold.allowed)
> > > +		return xe_gt_resume(gt);
> > > +
> > > +	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
> > > +	if (err)
> > > +		return err;
> > > +
> > > +	err = xe_uc_resume(&gt->uc);
> > > +	if (err)
> > > +		goto err_force_wake;
> > > +
> > > +	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
> > > +	xe_gt_info(gt, "resumed\n");
> > > +
> > > +	return 0;
> > > +
> > > +err_force_wake:
> > > +	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
> > > +	xe_gt_err(gt, "resume failed (%pe)\n", ERR_PTR(err));
> > > +
> > > +	return err;
> > > +}
> > > +
> > >   struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
> > >   				     enum xe_engine_class class,
> > >   				     u16 instance, bool logical)
> > > diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> > > index caded203a8a0..e6574e51004f 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt.h
> > > +++ b/drivers/gpu/drm/xe/xe_gt.h
> > > @@ -37,6 +37,8 @@ int xe_gt_record_default_lrcs(struct xe_gt *gt);
> > >   void xe_gt_suspend_prepare(struct xe_gt *gt);
> > >   int xe_gt_suspend(struct xe_gt *gt);
> > >   int xe_gt_resume(struct xe_gt *gt);
> > > +int xe_gt_runtime_suspend(struct xe_gt *gt);
> > > +int xe_gt_runtime_resume(struct xe_gt *gt);
> > >   void xe_gt_reset_async(struct xe_gt *gt);
> > >   void xe_gt_sanitize(struct xe_gt *gt);
> > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > index 0f06d8304e17..6bc01bb45fc2 100644
> > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > @@ -245,7 +245,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> > >   	}
> > >   	for_each_gt(gt, xe, id) {
> > > -		err = xe_gt_suspend(gt);
> > > +		err = xe_gt_runtime_suspend(gt);
> > >   		if (err)
> > >   			goto out;
> > >   	}
> > > @@ -294,14 +294,18 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> > >   	xe_irq_resume(xe);
> > > -	for_each_gt(gt, xe, id)
> > > -		xe_gt_resume(gt);
> > > +	for_each_gt(gt, xe, id) {
> > > +		err = xe_gt_runtime_resume(gt);
> > > +		if (err)
> > > +			goto out;
> > > +	}
> > >   	if (xe->d3cold.allowed && xe->d3cold.power_lost) {
> > >   		err = xe_bo_restore_user(xe);
> > >   		if (err)
> > >   			goto out;
> > >   	}
> > > +
> > >   out:
> > >   	lock_map_release(&xe_device_mem_access_lockdep_map);
> > >   	xe_pm_write_callback_task(xe, NULL);
> > > diff --git a/drivers/gpu/drm/xe/xe_uc.c b/drivers/gpu/drm/xe/xe_uc.c
> > > index addd6f2681b9..b5b53c8c3edb 100644
> > > --- a/drivers/gpu/drm/xe/xe_uc.c
> > > +++ b/drivers/gpu/drm/xe/xe_uc.c
> > > @@ -216,6 +216,15 @@ static void uc_reset_wait(struct xe_uc *uc)
> > >   		goto again;
> > >   }
> > > +int xe_uc_disable_communication(struct xe_uc *uc)
> > > +{
> > > +	/* GuC submission not enabled, nothing to do */
> > > +	if (!xe_device_guc_submission_enabled(uc_to_xe(uc)))
> > > +		return 0;
> > > +
> > > +	return xe_guc_disable_communication(&uc->guc);
> > > +}
> > > +
> > >   int xe_uc_suspend(struct xe_uc *uc)
> > >   {
> > >   	int ret;
> > > @@ -232,3 +241,12 @@ int xe_uc_suspend(struct xe_uc *uc)
> > >   	return xe_guc_suspend(&uc->guc);
> > >   }
> > > +
> > > +int xe_uc_resume(struct xe_uc *uc)
> > 
> > This should be called runtime_resume.
> > 
> > Daniele
> > 
> > > +{
> > > +	/* GuC submission not enabled, nothing to do */
> > > +	if (!xe_device_guc_submission_enabled(uc_to_xe(uc)))
> > > +		return 0;
> > > +
> > > +	return xe_guc_enable_communication(&uc->guc);
> > > +}
> > > diff --git a/drivers/gpu/drm/xe/xe_uc.h b/drivers/gpu/drm/xe/xe_uc.h
> > > index 42219b361df5..29bd692d8800 100644
> > > --- a/drivers/gpu/drm/xe/xe_uc.h
> > > +++ b/drivers/gpu/drm/xe/xe_uc.h
> > > @@ -12,8 +12,10 @@ int xe_uc_init(struct xe_uc *uc);
> > >   int xe_uc_init_hwconfig(struct xe_uc *uc);
> > >   int xe_uc_init_post_hwconfig(struct xe_uc *uc);
> > >   int xe_uc_init_hw(struct xe_uc *uc);
> > > +int xe_uc_disable_communication(struct xe_uc *uc);
> > >   void xe_uc_gucrc_disable(struct xe_uc *uc);
> > >   int xe_uc_reset_prepare(struct xe_uc *uc);
> > > +int xe_uc_resume(struct xe_uc *uc);
> > >   void xe_uc_stop_prepare(struct xe_uc *uc);
> > >   int xe_uc_stop(struct xe_uc *uc);
> > >   int xe_uc_start(struct xe_uc *uc);
> >