[PATCH v2] drm/amd: Add pre-zen AMD hardware to PCIe dynamic switching exclusions

Alex Deucher alexdeucher at gmail.com
Thu Apr 3 15:48:31 UTC 2025


On Wed, Apr 2, 2025 at 11:12 PM Mario Limonciello <superm1 at kernel.org> wrote:
>
> From: Mario Limonciello <mario.limonciello at amd.com>
>
> An AMD RX580 added to an AMD Phenom II system has problems with overheating. This is due to

I don't think this is entirely accurate.  I think the GPU gets hot
because the device hangs due to a problem with changing the PCIe
clocks.

> changes with PCIe dynamic switching introduced by commit 466a7d115326e
> ("drm/amd: Use the first non-dGPU PCI device for BW limits").
>
> To avoid the risk of other issues with old hardware, require at least Zen hardware
> on the AMD side to enable PCIe dynamic switching.

I'm pretty sure PCIe reclocking worked on pre-Zen hardware.  We've
supported this on our GPUs for at least 15 years.  I
suspect the actual problem is that some links may not reliably train
at the full bandwidth on some motherboards.  Forcing a higher link
speed may cause problems.  Maybe it would be better to limit the max
PCIe link rate to whatever the link is currently trained to.  IIRC,
PCIe links will train at the fastest speed possible by default.  The
previous behavior was to limit the max clock to the slowest link in
the topology to save power, but then we changed it to use the fastest
link possible based on the PCIe link caps.  Perhaps limiting it to the
fastest currently trained link rate would be better.
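
As a rough userspace sketch of that last idea (this is not amdgpu code):
take the Max Link Speed field from Link Capabilities and the Current
Link Speed field from Link Status, and never ask for more than the link
is trained to right now.  The function name and the fallback when the
status field reads 0 are assumptions made for illustration only; the
field masks match PCI_EXP_LNKCAP_SLS and PCI_EXP_LNKSTA_CLS in
pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Low four bits of each register hold a speed code:
 * 1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s, 4 = 16 GT/s, 5 = 32 GT/s.
 */
#define LNKCAP_MAX_SPEED_MASK 0x0000000fu /* same value as PCI_EXP_LNKCAP_SLS */
#define LNKSTA_CUR_SPEED_MASK 0x000fu     /* same value as PCI_EXP_LNKSTA_CLS */

/* Hypothetical helper: pick the highest speed code that reclocking may
 * request, given raw Link Capabilities and Link Status register values.
 */
static unsigned int pick_max_link_speed(uint32_t lnkcap, uint16_t lnksta)
{
	unsigned int cap_speed = lnkcap & LNKCAP_MAX_SPEED_MASK;
	unsigned int cur_speed = lnksta & LNKSTA_CUR_SPEED_MASK;
	unsigned int limit;

	/* Assumption: a current speed of 0 carries no information, so
	 * keep the capability-based limit in that case.
	 */
	if (cur_speed == 0)
		return cap_speed;

	/* Never exceed what the link actually trained to. */
	limit = cur_speed < cap_speed ? cur_speed : cap_speed;
	assert(limit <= cap_speed);
	return limit;
}
```

With this, a device advertising 8 GT/s (code 3) on a link that trained
at 5 GT/s (code 2) would stay limited to code 2 instead of being forced
up to 8 GT/s.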

Alex

>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4098
> Fixes: 466a7d115326e ("drm/amd: Use the first non-dGPU PCI device for BW limits")
> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
> ---
> v2:
>  * Cover more hardware
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a30111d2c3ea0..caa44ee788c8f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1854,6 +1854,9 @@ bool amdgpu_device_seamless_boot_supported(struct amdgpu_device *adev)
>   *
>   * https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/
>   * https://gitlab.freedesktop.org/drm/amd/-/issues/2663
> + *
> + * AMD Phenom II X6 1090T has a similar issue
> + * https://gitlab.freedesktop.org/drm/amd/-/issues/4098
>   */
>  static bool amdgpu_device_pcie_dynamic_switching_supported(struct amdgpu_device *adev)
>  {
> @@ -1866,6 +1869,8 @@ static bool amdgpu_device_pcie_dynamic_switching_supported(struct amdgpu_device
>
>         if (c->x86_vendor == X86_VENDOR_INTEL)
>                 return false;
> +       if (c->x86_vendor == X86_VENDOR_AMD && !cpu_feature_enabled(X86_FEATURE_ZEN))
> +               return false;
>  #endif
>         return true;
>  }
> --
> 2.43.0
>

