[PATCH] drm/amdgpu: don't do resets on APUs which don't support it

Alex Deucher alexdeucher at gmail.com
Thu Jan 13 15:00:14 UTC 2022


On Thu, Jan 13, 2022 at 1:56 AM Lazar, Lijo <lijo.lazar at amd.com> wrote:
>
> Hi Alex,
>
> What about something like this?
>
> bool amdgpu_device_reset_on_suspend(struct amdgpu_device *adev)
> {
>          if (adev->in_s0ix || adev->gmc.xgmi.num_physical_nodes > 1)
>                  return false;
>
>          switch (amdgpu_asic_reset_method(adev)) {
>          case AMD_RESET_METHOD_BACO:
>          case AMD_RESET_METHOD_MODE1:
>          case AMD_RESET_METHOD_MODE2:

This should also work on AMD_RESET_METHOD_LEGACY, at least for dGPUs.
I think the current approach is probably better, though, since I don't
think GPU resets work reliably on these chips anyway (reset is not
enabled by default on them for hangs), so it is better to just not do
it, as it may make the problem worse.
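
To make the LEGACY point concrete, here is a rough standalone sketch of
the helper quoted above with that case added. It is only an illustration:
the sketch_* types, fields and enum values are stand-ins invented for this
example, not the real amdgpu definitions, and a real version would call
amdgpu_asic_reset_method(adev) and test adev->flags & AMD_IS_APU rather
than read plain fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the driver's reset method enum (hypothetical names). */
enum sketch_reset_method {
	SKETCH_RESET_METHOD_LEGACY,
	SKETCH_RESET_METHOD_MODE0,
	SKETCH_RESET_METHOD_MODE1,
	SKETCH_RESET_METHOD_MODE2,
	SKETCH_RESET_METHOD_BACO,
};

/* Stand-in for the few pieces of struct amdgpu_device used here. */
struct sketch_device {
	bool in_s0ix;                         /* suspending via s0ix */
	unsigned int xgmi_num_physical_nodes; /* > 1 means part of an XGMI hive */
	bool is_apu;                          /* models adev->flags & AMD_IS_APU */
	enum sketch_reset_method reset_method;
};

/* Decide whether to reset the asic on suspend (illustrative only). */
static bool sketch_reset_on_suspend(const struct sketch_device *adev)
{
	assert(adev != NULL);

	if (adev->in_s0ix || adev->xgmi_num_physical_nodes > 1)
		return false;

	switch (adev->reset_method) {
	case SKETCH_RESET_METHOD_LEGACY:
		/* Legacy reset is only expected to work on dGPUs. */
		return !adev->is_apu;
	case SKETCH_RESET_METHOD_BACO:
	case SKETCH_RESET_METHOD_MODE1:
	case SKETCH_RESET_METHOD_MODE2:
		return true;
	default:
		return false;
	}
}
```

With this shape, a dGPU using the legacy method would still be reset on
suspend, while an APU using it would be skipped.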

Alex


>                  return true;
>          }
>
>          return false;
> }
>
> Thanks,
> Lijo
>
> On 1/13/2022 9:31 AM, Alex Deucher wrote:
> > It can cause a hang.  This is normally not enabled for GPU
> > hangs on these asics, but was recently enabled for handling
> > aborted suspends.  This causes hangs on some platforms
> > on suspend.
> >
> > Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)")
> > Cc: stable at vger.kernel.org
> > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1858
> > Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/cik.c | 4 ++++
> >   drivers/gpu/drm/amd/amdgpu/vi.c  | 4 ++++
> >   2 files changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
> > index 54f28c075f21..f10ce740a29c 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/cik.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/cik.c
> > @@ -1428,6 +1428,10 @@ static int cik_asic_reset(struct amdgpu_device *adev)
> >   {
> >       int r;
> >
> > +     /* APUs don't have full asic reset */
> > +     if (adev->flags & AMD_IS_APU)
> > +             return 0;
> > +
> >       if (cik_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
> >               dev_info(adev->dev, "BACO reset\n");
> >               r = amdgpu_dpm_baco_reset(adev);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> > index fe9a7cc8d9eb..6645ebbd2696 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> > @@ -956,6 +956,10 @@ static int vi_asic_reset(struct amdgpu_device *adev)
> >   {
> >       int r;
> >
> > +     /* APUs don't have full asic reset */
> > +     if (adev->flags & AMD_IS_APU)
> > +             return 0;
> > +
> >       if (vi_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
> >               dev_info(adev->dev, "BACO reset\n");
> >               r = amdgpu_dpm_baco_reset(adev);
> >

