[PATCH] drm/amdgpu: Ensure HDA function is suspended before ASIC reset

Limonciello, Mario Mario.Limonciello at amd.com
Thu Apr 7 18:11:42 UTC 2022


[Public]

> -----Original Message-----
> From: Alex Deucher <alexdeucher at gmail.com>
> Sent: Thursday, April 7, 2022 12:08
> To: Kai-Heng Feng <kai.heng.feng at canonical.com>
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>; Pan, Xinhui <Xinhui.Pan at amd.com>; David
> Airlie <airlied at linux.ie>; Maling list - DRI developers <dri-
> devel at lists.freedesktop.org>; LKML <linux-kernel at vger.kernel.org>; amd-
> gfx list <amd-gfx at lists.freedesktop.org>; Chiu, Solomon
> <Solomon.Chiu at amd.com>; Limonciello, Mario
> <Mario.Limonciello at amd.com>; Quan, Evan <Evan.Quan at amd.com>;
> Tuikov, Luben <Luben.Tuikov at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: Ensure HDA function is suspended
> before ASIC reset
> 
> On Thu, Apr 7, 2022 at 8:21 AM Kai-Heng Feng
> <kai.heng.feng at canonical.com> wrote:
> >
> > DP/HDMI audio on AMD PRO VII stops working after S3:
> > [  149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset
> > [  149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset
> > [  149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset
> > [  149.983693] snd_hda_intel 0000:63:00.1: refused to change power state
> from D0 to D3hot
> > [  150.003439] amdgpu 0000:63:00.0: refused to change power state from
> D0 to D3hot
> > ...
> > [  155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP
> = 65535
> 
> As an aside, shouldn't device links order this properly already?  I
> thought that was the whole point of them.  We have quirks in PCI
> quirks.c to create device links for all GPU integrated peripherals
> (audio, usb, ucsi).

The movement from D0->D3 only happens in noirq though in pci_pm_suspend_noirq.
So if device links only order within a suspend phase then resetting during that
phase means the future phase won't have ground to stand on.

IOW I think device links + this commit are needed to get the order right with a reset.

> 
> Alex
> 
> >
> > The offending commit is daf8de0874ab5b ("drm/amdgpu: always reset the
> asic in
> > suspend (v2)"). Commit 34452ac3038a7 ("drm/amdgpu: don't use BACO for
> > reset in S3 ") doesn't help, so the issue is something different.
> >
> > Assuming that to make HDA resume to D0 fully realized, it needs to be
> > successfully put to D3 first. And this guesswork proves working, by
> > moving amdgpu_asic_reset() to noirq callback, so it's called after HDA
> > function is in D3.
> >
> > Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend
> (v2)")
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng at canonical.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 18 ++++++++++++------
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index bb1c025d90019..31f7229e7ea89 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -2323,18 +2323,23 @@ static int amdgpu_pmops_suspend(struct
> device *dev)
> >  {
> >         struct drm_device *drm_dev = dev_get_drvdata(dev);
> >         struct amdgpu_device *adev = drm_to_adev(drm_dev);
> > -       int r;
> >
> >         if (amdgpu_acpi_is_s0ix_active(adev))
> >                 adev->in_s0ix = true;
> >         else
> >                 adev->in_s3 = true;
> > -       r = amdgpu_device_suspend(drm_dev, true);
> > -       if (r)
> > -               return r;
> > +       return amdgpu_device_suspend(drm_dev, true);
> > +}
> > +
> > +static int amdgpu_pmops_suspend_noirq(struct device *dev)
> > +{
> > +       struct drm_device *drm_dev = dev_get_drvdata(dev);
> > +       struct amdgpu_device *adev = drm_to_adev(drm_dev);
> > +
> >         if (!adev->in_s0ix)
> > -               r = amdgpu_asic_reset(adev);
> > -       return r;
> > +               return amdgpu_asic_reset(adev);
> > +
> > +       return 0;
> >  }
> >
> >  static int amdgpu_pmops_resume(struct device *dev)
> > @@ -2575,6 +2580,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
> = {
> >         .prepare = amdgpu_pmops_prepare,
> >         .complete = amdgpu_pmops_complete,
> >         .suspend = amdgpu_pmops_suspend,
> > +       .suspend_noirq = amdgpu_pmops_suspend_noirq,
> >         .resume = amdgpu_pmops_resume,
> >         .freeze = amdgpu_pmops_freeze,
> >         .thaw = amdgpu_pmops_thaw,
> > --
> > 2.34.1
> >


More information about the amd-gfx mailing list