[PATCH] drm/amdgpu: support new mode-1 reset interface

Zhou1, Tao Tao.Zhou1 at amd.com
Tue Nov 16 08:47:25 UTC 2021


[AMD Official Use Only]

Hi Lijo,

Your concern is reasonable, but smu_v13_0_mode1_reset is currently used only by ALDEBARAN. I assume the PMFW of future SMU v13 ASICs will follow this design; if it does not, we can move the implementation into the ASIC-specific xxx_ppt.c.

Regards,
Tao

> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Tuesday, November 16, 2021 3:44 PM
> To: Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org; Zhang,
> Hawking <Hawking.Zhang at amd.com>; Clements, John
> <John.Clements at amd.com>; Yang, Stanley <Stanley.Yang at amd.com>; Quan,
> Evan <Evan.Quan at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: support new mode-1 reset interface
>
>
>
> On 11/16/2021 12:53 PM, Tao Zhou wrote:
> > If a GPU reset is triggered by a RAS fatal error, report that to the
> > SMU in the mode-1 reset message.
> >
> > Signed-off-by: Tao Zhou <tao.zhou1 at amd.com>
> > ---
> >   .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c    | 21 ++++++++++++++++---
> >   1 file changed, 18 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > index 35145db6eedf..6f3d064a8232 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > @@ -1426,16 +1426,31 @@ int smu_v13_0_set_azalia_d3_pme(struct smu_context *smu)
> >
> >   int smu_v13_0_mode1_reset(struct smu_context *smu)
> >   {
> > -   u32 smu_version;
> > +   u32 smu_version, fatal_err, param;
> >     int ret = 0;
> > +   struct amdgpu_device *adev = smu->adev;
> > +   struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> > +
> > +   fatal_err = 0;
> > +   param = SMU_RESET_MODE_1;
> > +
> >     /*
> >     * PM FW support SMU_MSG_GfxDeviceDriverReset from 68.07
> >     */
> >     smu_cmn_get_smc_version(smu, NULL, &smu_version);
> >     if (smu_version < 0x00440700)
> >             ret = smu_cmn_send_smc_msg(smu, SMU_MSG_Mode1Reset, NULL);
> > -   else
> > -           ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, SMU_RESET_MODE_1, NULL);
> > +   else {
> > +           /* fatal error triggered by ras, PMFW supports the flag
> > +              from 68.44.0 */
> > +           if ((smu_version >= 0x00442c00) && ras &&
> > +               atomic_read(&ras->in_recovery))
> > +                   fatal_err = 1;
> > +
>
> Judging by the PMFW version, this looks specific to Aldebaran. Since there
> is a version check as well, the implementation should be moved to
> aldebaran_ppt.c.
>
> Thanks,
> Lijo
>
> > +           param |= (fatal_err << 16);
> > +           ret = smu_cmn_send_smc_msg_with_param(smu,
> > +                                   SMU_MSG_GfxDeviceDriverReset, param, NULL);
> > +   }
> >
> >     if (!ret)
> >             msleep(SMU13_MODE1_RESET_WAIT_TIME_IN_MS);
> >
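
For readers following the thread, here is a minimal standalone sketch of the version checks and parameter encoding the patch performs. It is not driver code. All example_* names and EXAMPLE_* macros are hypothetical, the value used for the reset mode is an assumption, and the one-byte-per-field version packing is inferred from the patch comments (68.07 and 68.44.0) and the constants they sit next to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's SMU_RESET_MODE_1; the value is assumed. */
#define EXAMPLE_RESET_MODE_1    1u

/* Version thresholds copied from the patch. */
#define EXAMPLE_FW_DRIVER_RESET 0x00440700u /* 68.7.0: GfxDeviceDriverReset available */
#define EXAMPLE_FW_FATAL_FLAG   0x00442c00u /* 68.44.0: fatal-error flag understood */

/* Pack major.minor.patch with one byte per field (inferred layout). */
static uint32_t example_pack_version(uint32_t major, uint32_t minor, uint32_t patch)
{
	assert(major <= 0xff && minor <= 0xff && patch <= 0xff);
	return (major << 16) | (minor << 8) | patch;
}

/* True when the firmware predates GfxDeviceDriverReset and gets plain Mode1Reset. */
static bool example_uses_legacy_msg(uint32_t smu_version)
{
	return smu_version < EXAMPLE_FW_DRIVER_RESET;
}

/*
 * Build the message argument: reset mode in the low bits, and bit 16 set
 * only when the firmware is new enough and RAS recovery is in progress.
 */
static uint32_t example_mode1_param(uint32_t smu_version, bool ras_in_recovery)
{
	uint32_t fatal_err = 0;

	if (smu_version >= EXAMPLE_FW_FATAL_FLAG && ras_in_recovery)
		fatal_err = 1;

	return EXAMPLE_RESET_MODE_1 | (fatal_err << 16);
}
```

Under these assumptions, firmware older than 68.44.0 always receives the unchanged argument, so the flag is only ever sent to firmware that the patch says supports it.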

