[PATCH] drm/amdgpu: disable cstate properly for driver reloading scenario
Quan, Evan
Evan.Quan at amd.com
Thu Mar 2 09:30:44 UTC 2023
[AMD Official Use Only - General]
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Thursday, March 2, 2023 5:21 PM
> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: disable cstate properly for driver
> reloading scenario
>
>
>
> On 3/2/2023 2:43 PM, Quan, Evan wrote:
> > [AMD Official Use Only - General]
> >
> >
> >
> >> -----Original Message-----
> >> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> >> Sent: Thursday, March 2, 2023 4:28 PM
> >> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
> >> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
> >> Subject: Re: [PATCH] drm/amdgpu: disable cstate properly for driver
> >> reloading scenario
> >>
> >>
> >>
> >> On 3/2/2023 12:28 PM, Evan Quan wrote:
> >>> Gpu reset might be needed during driver reloading. To guard that(gpu
> >>> reset) work, df cstate needs to be disabled properly.
> >>>
> >>> Signed-off-by: Evan Quan <evan.quan at amd.com>
> >>> Change-Id: I5c074c265c0b08a67b6934ae1ad9aa3fed245461
> >>> ---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++++++
> >>> 1 file changed, 9 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> index 51bbeaa1f311..3c854461ef32 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> @@ -2816,6 +2816,15 @@ static int amdgpu_device_ip_fini_early(struct
> >> amdgpu_device *adev)
> >>> amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> >>> amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> >>>
> >>> + /*
> >>> + * Get df cstate disabled properly on driver unloading.
> >>> + * Since on the succeeding driver reloading, gpu reset might
> >>> + * be required. And cstate disabled is a prerequisite for
> >>> + * that(gpu reset).
> >>> + */
> >>> + if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
> >>> + dev_warn(adev->dev, "Failed to disallow df cstate");
> >>> +
> >>
> >> This looks more like a firmware bug. Driver sends the Unload message to
> FW.
> >> In that case FW should disable all features including C-state.
> > Driver does not send the Unload message. We want PMFM alive and ready
> for handling possible gpu reset on reloading.
> >
>
> Actually, soc21_need_reset_on_init code itself has a bug. PSP won't get
> unloaded by default on ring destruction. Even if PSP stops, it could just keep
> the heartbeat value as non-zero (just that it won't increment).
>
> Probably, that needs to be fixed first rather than keeping PMFW alive for a
> reset.
As I remembered, the change(asic reset during reloading) seemed introduced to address some sriov issues.
@Deucher, Alexander might share more backgrounds about this.
To be honest, I'm not a fan of this(perform asic reset during reloading).
Evan
>
> Thanks,
> Lijo
>
> > BR
> > Evan
> >>
> >> Thanks,
> >> Lijo
> >>
> >>> amdgpu_amdkfd_suspend(adev, false);
> >>>
> >>> /* Workaroud for ASICs need to disable SMC first */
More information about the amd-gfx
mailing list