[PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

Limonciello, Mario Mario.Limonciello at amd.com
Tue May 17 14:06:48 UTC 2022


[Public]



> -----Original Message-----
> From: Alex Deucher <alexdeucher at gmail.com>
> Sent: Tuesday, May 17, 2022 08:43
> To: Limonciello, Mario <Mario.Limonciello at amd.com>
> Cc: amd-gfx list <amd-gfx at lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to
> s2idle
> 
> On Tue, May 17, 2022 at 9:39 AM Mario Limonciello
> <mario.limonciello at amd.com> wrote:
> >
> > An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
> > reset for handling aborted suspend can't work with s2idle.
> >
> > This functionality was introduced in commit daf8de0874ab5b
> ("drm/amdgpu:
> > always reset the asic in suspend (v2)").  A few other commits have
> > gone on top of the ASIC reset, but this still doesn't work on the A+A
> > configuration in s2idle.
> >
> > Avoid doing the reset on dGPUs specifically when using s2idle.
> >
> > Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend
> (v2)")
> > Link:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitla
> b.freedesktop.org%2Fdrm%2Famd%2F-
> %2Fissues%2F2008&data=05%7C01%7Cmario.limonciello%40amd.com%
> 7C38a880b8d03147194bb608da380b3142%7C3dd8961fe4884e608e11a82d994
> e183d%7C0%7C0%7C637883917968950850%7CUnknown%7CTWFpbGZsb3d8
> eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3
> D%7C3000%7C%7C%7C&sdata=lNeeucWFganu9GLey2YAqXfgbYT4DUBb
> fQA6XuwGslA%3D&reserved=0
> > Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
> 
> Acked-by: Alex Deucher <alexander.deucher at amd.com>
> 
> But I think maybe we should just drop all of this in the longer term.
> We are slowly dropping every case where we reset the GPU.  I'm not
> even sure it actually fixes the issue it was originally added to fix
> at this point because the actual reset has moved to the no-irq
> callback which will most likely not get called on an aborted suspend
> anyway.
> 

So perhaps for now this patch and in the next cycle or two maybe folks who worked
on the aborted suspend case can try to trigger an aborted suspend w/ dGPUs + S3
and see whether it's actually working and tear everything out if it doesn't as you say.

> Alex
> 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  1 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 14 ++++++++++++++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  2 +-
> >  3 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 3c20c2eadf4e..9cf3d65f17d7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -1378,6 +1378,7 @@ static inline int
> amdgpu_acpi_smart_shift_update(struct drm_device *dev,
> >
> >  #if defined(CONFIG_ACPI) && defined(CONFIG_SUSPEND)
> >  bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev);
> > +bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev);
> >  bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev);
> >  #else
> >  static inline bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device
> *adev) { return false; }
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > index 0e12315fa0cb..98ac53ee6bb5 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > @@ -1045,6 +1045,20 @@ bool amdgpu_acpi_is_s3_active(struct
> amdgpu_device *adev)
> >                 (pm_suspend_target_state == PM_SUSPEND_MEM);
> >  }
> >
> > +/**
> > + * amdgpu_acpi_should_gpu_reset
> > + *
> > + * @adev: amdgpu_device_pointer
> > + *
> > + * returns true if should reset GPU, false if not
> > + */
> > +bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev)
> > +{
> > +       if (adev->flags & AMD_IS_APU)
> > +               return false;
> > +       return pm_suspend_target_state != PM_SUSPEND_TO_IDLE;
> > +}
> > +
> >  /**
> >   * amdgpu_acpi_is_s0ix_active
> >   *
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 16871baee784..a84766c13ac5 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -2315,7 +2315,7 @@ static int amdgpu_pmops_suspend_noirq(struct
> device *dev)
> >         struct drm_device *drm_dev = dev_get_drvdata(dev);
> >         struct amdgpu_device *adev = drm_to_adev(drm_dev);
> >
> > -       if (!adev->in_s0ix)
> > +       if (amdgpu_acpi_should_gpu_reset(adev))
> >                 return amdgpu_asic_reset(adev);
> >
> >         return 0;
> > --
> > 2.34.1
> >


More information about the amd-gfx mailing list