[PATCH] drm/amdgpu: Don't enable LTR if not supported

Lazar, Lijo Lijo.Lazar at amd.com
Thu Sep 8 16:42:38 UTC 2022


[AMD Official Use Only - General]

I am not sure if ASPM settings can be generalized by PCIE core. Performance vs Power savings when ASPM is enabled will require some additional tuning and that will be device specific.

In some of the other ASICs, this programming is done in VBIOS/SBIOS firmware. Having it in driver provides the advantage of additional tuning without forcing a VBIOS upgrade.

Thanks,
Lijo
________________________________
From: Alex Deucher <alexdeucher at gmail.com>
Sent: Thursday, September 8, 2022 9:55:41 PM
To: Bjorn Helgaas <helgaas at kernel.org>
Cc: Lazar, Lijo <Lijo.Lazar at amd.com>; stable at vger.kernel.org <stable at vger.kernel.org>; wielkiegie at gmail.com <wielkiegie at gmail.com>; amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; Deucher, Alexander <Alexander.Deucher at amd.com>; Quan, Evan <Evan.Quan at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>
Subject: Re: [PATCH] drm/amdgpu: Don't enable LTR if not supported

On Thu, Sep 8, 2022 at 12:12 PM Bjorn Helgaas <helgaas at kernel.org> wrote:
>
> [+cc Evan, author of 62f8f5c3bfc2 ("drm/amdgpu: enable ASPM support
> for PCIE 7.4.0/7.6.0")]
>
> On Thu, Sep 08, 2022 at 08:53:44AM +0530, Lijo Lazar wrote:
> > As per PCIE Base Spec r4.0 Section 6.18
> > 'Software must not enable LTR in an Endpoint unless the Root Complex
> > and all intermediate Switches indicate support for LTR.'
> >
> > This fixes the Unsupported Request error reported through AER during
> > ASPM enablement.
> >
> > Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D216455&data=05%7C01%7Clijo.lazar%40amd.com%7C1b7a9338c69d494ff04508da91b6cd07%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637982511562535915%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Vmk44BHDQGoZTq%2FGez8Kh7rrp%2Boz%2BQYEHGRaHMSyUOU%3D&reserved=0
> >
> > The error was unnoticed before and got visible because of the commit
> > referenced below. This doesn't fix anything in the commit below, rather
> > fixes the issue in amdgpu exposed by the commit. The reference is only
> > to associate this commit with below one so that both go together.
> >
> > Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> >
> > Reported-by: Gustaw Smolarczyk <wielkiegie at gmail.com>
> > Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
> > Cc: stable at vger.kernel.org
> > ---
> >  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 9 ++++++++-
> >  drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 9 ++++++++-
> >  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 9 ++++++++-
>
> nbio_v4_3_program_ltr() checks pdev->ltr_path itself instead of doing
> it in *_program_aspm().  It'd be nice to use the same approach for all
> versions.
>
> I really don't like the fact that amdgpu does all this ASPM fiddling
> in the driver in the first place.  ASPM should be configured by the
> PCI core, not by each individual driver.  ASPM has all sorts of
> requirements that relate to upstream devices, which I think amdgpu
> ignores, but the core pays attention to.
>
> Do you know why the driver configures ASPM itself?  If the PCI core is
> doing something wrong (and I'm sure it is, ASPM support is kind of a
> mess), I'd much prefer to fix up the core where *all* drivers can
> benefit from it.

This is the programming sequence we get from our hardware team and it
is used on both windows and Linux.  As far as I understand it windows
doesn't handle this in the core, it's up to the individual drivers to
enable it.  I'm not familiar with how this should be enabled
generically, but at least for our hardware, it seems to have some
variation compared to what is done in the PCI core due to stability,
etc. It seems to me that this may need asic specific implementations
for a lot of hardware depending on the required programming sequences.
E.g., various asics may need hardware workaround for bugs or platform
issues, etc.  I can ask for more details from our hardware team.

Alex

>
> > diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> > index b465baa26762..aa761ff3a5fa 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> > @@ -380,6 +380,7 @@ static void nbio_v2_3_enable_aspm(struct amdgpu_device *adev,
> >               WREG32_PCIE(smnPCIE_LC_CNTL, data);
> >  }
> >
> > +#ifdef CONFIG_PCIEASPM
> >  static void nbio_v2_3_program_ltr(struct amdgpu_device *adev)
> >  {
> >       uint32_t def, data;
> > @@ -401,9 +402,11 @@ static void nbio_v2_3_program_ltr(struct amdgpu_device *adev)
> >       if (def != data)
> >               WREG32_PCIE(smnBIF_CFG_DEV0_EPF0_DEVICE_CNTL2, data);
> >  }
> > +#endif
> >
> >  static void nbio_v2_3_program_aspm(struct amdgpu_device *adev)
> >  {
> > +#ifdef CONFIG_PCIEASPM
> >       uint32_t def, data;
> >
> >       def = data = RREG32_PCIE(smnPCIE_LC_CNTL);
> > @@ -459,7 +462,10 @@ static void nbio_v2_3_program_aspm(struct amdgpu_device *adev)
> >       if (def != data)
> >               WREG32_PCIE(smnPCIE_LC_CNTL6, data);
> >
> > -     nbio_v2_3_program_ltr(adev);
> > +     /* Don't bother about LTR if LTR is not enabled
> > +      * in the path */
> > +     if (adev->pdev->ltr_path)
> > +             nbio_v2_3_program_ltr(adev);
> >
> >       def = data = RREG32_SOC15(NBIO, 0, mmRCC_BIF_STRAP3);
> >       data |= 0x5DE0 << RCC_BIF_STRAP3__STRAP_VLINK_ASPM_IDLE_TIMER__SHIFT;
> > @@ -483,6 +489,7 @@ static void nbio_v2_3_program_aspm(struct amdgpu_device *adev)
> >       data &= ~PCIE_LC_CNTL3__LC_DSC_DONT_ENTER_L23_AFTER_PME_ACK_MASK;
> >       if (def != data)
> >               WREG32_PCIE(smnPCIE_LC_CNTL3, data);
> > +#endif
> >  }
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20220908/c72eec1a/attachment.htm>


More information about the amd-gfx mailing list