[PATCH AUTOSEL 5.15 3/5] drm/amdgpu: Enable gpu reset for S3 abort cases on Raven series

Alex Deucher alexdeucher at gmail.com
Wed Mar 13 20:46:04 UTC 2024


On Wed, Mar 13, 2024 at 4:12 PM Felix Kuehling <felix.kuehling at amd.com> wrote:
>
> On 2024-03-11 11:14, Sasha Levin wrote:
> > From: Prike Liang <Prike.Liang at amd.com>
> >
> > [ Upstream commit c671ec01311b4744b377f98b0b4c6d033fe569b3 ]
> >
> > GPU resets can now be performed successfully on the Raven series,
> > and a GPU reset is required for the S3 suspend abort case. So
> > enable GPU reset for S3 abort cases on the Raven series.
>
> This looks suspicious to me. I'm not sure what conditions made the GPU
> reset successful. But unless all the changes involved were also
> backported, this should probably not be applied to older kernel
> branches. I'm speculating it may be related to the removal of AMD IOMMUv2.
>

We should get confirmation from Prike, but I think he tested this on
older kernels as well.

Alex

> Regards,
>    Felix
>
>
> >
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > Acked-by: Alex Deucher <alexander.deucher at amd.com>
> > Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> > Signed-off-by: Sasha Levin <sashal at kernel.org>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/soc15.c | 45 +++++++++++++++++-------------
> >   1 file changed, 25 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 6a3486f52d698..ef5b3eedc8615 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -605,11 +605,34 @@ soc15_asic_reset_method(struct amdgpu_device *adev)
> >               return AMD_RESET_METHOD_MODE1;
> >   }
> >
> > +static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
> > +{
> > +     u32 sol_reg;
> > +
> > +     sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > +
> > +     /* Will reset for the following suspend abort cases.
> > +      * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > +      * 2) S3 suspend abort and TOS already launched.
> > +      */
> > +     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > +                     !adev->suspend_complete &&
> > +                     sol_reg)
> > +             return true;
> > +
> > +     return false;
> > +}
> > +
> >   static int soc15_asic_reset(struct amdgpu_device *adev)
> >   {
> >       /* original raven doesn't have full asic reset */
> > -     if ((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > -         (adev->apu_flags & AMD_APU_IS_RAVEN2))
> > +     /* On the latest Raven, the GPU reset can be performed
> > +      * successfully. So now, temporarily enable it for the
> > +      * S3 suspend abort case.
> > +      */
> > +     if (((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > +         (adev->apu_flags & AMD_APU_IS_RAVEN2)) &&
> > +             !soc15_need_reset_on_resume(adev))
> >               return 0;
> >
> >       switch (soc15_asic_reset_method(adev)) {
> > @@ -1490,24 +1513,6 @@ static int soc15_common_suspend(void *handle)
> >       return soc15_common_hw_fini(adev);
> >   }
> >
> > -static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
> > -{
> > -     u32 sol_reg;
> > -
> > -     sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > -
> > -     /* Will reset for the following suspend abort cases.
> > -      * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > -      * 2) S3 suspend abort and TOS already launched.
> > -      */
> > -     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > -                     !adev->suspend_complete &&
> > -                     sol_reg)
> > -             return true;
> > -
> > -     return false;
> > -}
> > -
> >   static int soc15_common_resume(void *handle)
> >   {
> >       struct amdgpu_device *adev = (struct amdgpu_device *)handle;

