[PATCH v2 1/2] drm/amd: Use the first non-dGPU PCI device for BW limits

Lazar, Lijo lijo.lazar at amd.com
Wed Nov 15 11:55:21 UTC 2023



On 11/11/2023 4:04 AM, Mario Limonciello wrote:
> When bandwidth limits are looked up using pcie_bandwidth_available(),
> virtual links such as USB4 are analyzed, which might not represent the
> real speed. Furthermore, devices may change speeds autonomously, which
> may introduce conditional variation to the results reported in the
> status registers.
> 
> Instead, look at the capabilities of the first PCI device outside of
> the dGPU to decide the upper limits that the dGPU will work at.
> 
> For an eGPU this effectively means that it will use the speed of the link
> partner.  As the new semantics of this are unique to AMD dGPUs, create
> a new local symbol instead of changing pcie_bandwidth_available().

The last line may be removed. As discussed in the thread with Mike,
looking at the link partner's capabilities is the right thing to do
regardless of the issue. The pcie_bandwidth_available() API doesn't need
to be blamed :)

Series is -
	Reviewed-by: Lijo Lazar <lijo.lazar at amd.com>

Thanks,
Lijo

> 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
> Link: https://www.usb.org/document-library/usb4r-specification-v20
>        USB4 V2 with Errata and ECN through June 2023
>        Section 11.2.1
> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++++++++++++++++++--
>   1 file changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 1fc73bb4ec73..683ea2284827 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5721,6 +5721,39 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   	return r;
>   }
>   
> +/**
> + * amdgpu_device_partner_bandwidth - find the bandwidth of appropriate partner
> + *
> + * @adev: amdgpu_device pointer
> + * @speed: pointer to the speed of the link
> + * @width: pointer to the width of the link
> + *
> + * Evaluate the hierarchy to find the speed and bandwidth capabilities of the
> + * first physical partner to an AMD dGPU.
> + * This will exclude any virtual switches and links.
> + */
> +static void amdgpu_device_partner_bandwidth(struct amdgpu_device *adev,
> +					    enum pci_bus_speed *speed,
> +					    enum pcie_link_width *width)
> +{
> +	struct pci_dev *parent = adev->pdev;
> +
> +	if (!speed || !width)
> +		return;
> +
> +	*speed = PCI_SPEED_UNKNOWN;
> +	*width = PCIE_LNK_WIDTH_UNKNOWN;
> +
> +	while ((parent = pci_upstream_bridge(parent))) {
> +		/* skip upstream/downstream switches internal to dGPU*/
> +		if (parent->vendor == PCI_VENDOR_ID_ATI)
> +			continue;
> +		*speed = pcie_get_speed_cap(parent);
> +		*width = pcie_get_width_cap(parent);
> +		break;
> +	}
> +}
> +
>   /**
>    * amdgpu_device_get_pcie_info - fence pcie info about the PCIE slot
>    *
> @@ -5754,8 +5787,8 @@ static void amdgpu_device_get_pcie_info(struct amdgpu_device *adev)
>   	if (adev->pm.pcie_gen_mask && adev->pm.pcie_mlw_mask)
>   		return;
>   
> -	pcie_bandwidth_available(adev->pdev, NULL,
> -				 &platform_speed_cap, &platform_link_width);
> +	amdgpu_device_partner_bandwidth(adev, &platform_speed_cap,
> +					&platform_link_width);
>   
>   	if (adev->pm.pcie_gen_mask == 0) {
>   		/* asic caps */
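
The upstream walk in the hunk above can be tried outside the kernel with a small user-space model. This is only a sketch under stated assumptions: struct mock_dev, its fields, and partner_bandwidth() are hypothetical stand-ins for struct pci_dev, pcie_get_speed_cap()/pcie_get_width_cap(), and the patch's helper, and a value of 0 stands in for PCI_SPEED_UNKNOWN and PCIE_LNK_WIDTH_UNKNOWN. The only value taken from the kernel is 0x1002, the ID behind PCI_VENDOR_ID_ATI.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric value as the kernel's PCI_VENDOR_ID_ATI. */
#define VENDOR_ID_ATI 0x1002

/* Hypothetical stand-in for struct pci_dev; not a kernel type. */
struct mock_dev {
	unsigned short vendor;     /* PCI vendor ID */
	int speed_cap;             /* illustrative: PCIe generation, 0 = unknown */
	int width_cap;             /* illustrative: lane count, 0 = unknown */
	struct mock_dev *upstream; /* next bridge towards the root, NULL at the top */
};

/*
 * Model of the walk in the patch: start at the GPU, move upstream, skip
 * every bridge that carries the ATI vendor ID (switches internal to the
 * dGPU), and report the capabilities of the first bridge that does not.
 * If no such bridge exists, the outputs stay at 0 ("unknown").
 */
static void partner_bandwidth(const struct mock_dev *dev, int *speed, int *width)
{
	const struct mock_dev *parent;

	assert(dev != NULL);
	if (!speed || !width)
		return;

	*speed = 0;
	*width = 0;

	for (parent = dev->upstream; parent; parent = parent->upstream) {
		if (parent->vendor == VENDOR_ID_ATI)
			continue; /* still inside the dGPU package */
		*speed = parent->speed_cap;
		*width = parent->width_cap;
		break;
	}
}
```

With a chain of GPU, two ATI-vendor switch ports, and a root port from another vendor, the model reports the root port's capabilities, which matches the "link partner" behaviour described in the commit message.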

