[PATCH V2 2/2] drm/amdgpu: Permit PCIe transfer over links with XGMI

Felix Kuehling felix.kuehling at amd.com
Mon Oct 16 22:27:38 UTC 2023


On 2023-10-16 10:49, David Francis wrote:
> When the CPU is XGMI connected, the PCIe links should
> not be enumerated for topology purposes. However, PCIe
> transfer should still be a valid option for remote
> doorbells and MMIO mappings.
>
> Move the XGMI connection check out of the shared helper
> function amdgpu_device_is_peer_accessible and into the
> topology path.
>
> Signed-off-by: David Francis <David.Francis at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +---
>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c  | 6 ++++--
>   2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index bad2b5577e96..b47cb7f8cfbd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5753,9 +5753,7 @@ bool amdgpu_device_is_peer_accessible(struct amdgpu_device *adev,
>   		~*peer_adev->dev->dma_mask : ~((1ULL << 32) - 1);
>   	resource_size_t aper_limit =
>   		adev->gmc.aper_base + adev->gmc.aper_size - 1;
> -	bool p2p_access =
> -		!adev->gmc.xgmi.connected_to_cpu &&
> -		!(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
> +	bool p2p_access = !(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
>   
>   	return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
>   		adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 4e530791507e..cb64c19482f3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1514,11 +1514,13 @@ static int kfd_dev_create_p2p_links(void)
>   			goto next;
>   
>   		/* check if node(s) is/are peer accessible in one direction or bi-direction */
> -		ret = kfd_add_peer_prop(new_dev, dev, i, k);
> +		if (!new_dev->gpu->adev->gmc.xgmi.connected_to_cpu)

Yikes. I was thinking this should be something like

	if (... xgmi.connected_to_cpu)
		goto next;

I didn't consider that the check needs to be separate for each GPU. I 
mean, it's not exactly the same thing, but would it make sense to have a 
check like this?

	if (new_dev->gpu->adev->gmc.xgmi.connected_to_cpu &&
	    dev->gpu->adev->gmc.xgmi.connected_to_cpu)
		goto next;

I don't see why this should depend on the direction of the link. We 
don't want to advertise PCIe P2P links between pairs of GPUs that are 
both connected to the CPU via XGMI. We don't currently support mixed 
systems with both XGMI- and PCIe-connected GPUs. But if such a system 
ever existed, I think we would want to allow P2P links between those 
GPUs, regardless of the direction.

Regards,
   Felix


> +			ret = kfd_add_peer_prop(new_dev, dev, i, k);
>   		if (ret < 0)
>   			goto out;
>   
> -		ret = kfd_add_peer_prop(dev, new_dev, k, i);
> +		if (!dev->gpu->adev->gmc.xgmi.connected_to_cpu)
> +			ret = kfd_add_peer_prop(dev, new_dev, k, i);
>   		if (ret < 0)
>   			goto out;
>   next:
