iommu/amd: fix the address translation issue when do detach

Mario Limonciello mario.limonciello at amd.com
Fri Jul 28 01:46:29 UTC 2023


On 7/27/23 04:55, Jesse Zhang wrote:
> From: Jesse Zhang <jesse.zhang at amd.com>
> 
> The iGPU driver fails to read/write registers through the IOMMU when X starts.
> kernel: [  433.296634] audit: type=1400 audit(1690403823.130:64): apparmor="DENIED" operation="capable" class="cap"
> profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=12344 comm="snap-confine" capability=38  capname="perfmon"
> kernel: [  433.515795] amdgpu 0000:03:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
> kernel: [  440.195492] amdgpu 0000:03:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
> kernel: [  453.679611] amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
> kernel: [  460.383490] amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
> 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2659
> 
> Disable the address translation service (ATS) before detaching the
> device. do_detach() clears the page table pointer or PASID table
> entries, so all DMA requests from the device should be blocked before
> that.
> 
> Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
> ---
>   drivers/iommu/amd/iommu.c | 21 ++++++++++++---------
>   1 file changed, 12 insertions(+), 9 deletions(-)

The reporter came back and indicated this worked, so here are some tags 
for it.

Fixes: 8dc1db3172ae ("drm/amdkfd: Introduce kfd_node struct (v5)")
Tested-by: Mike Lothian <mike at fireburn.co.uk>

The commit that introduced the problem is in 6.5-rc1, so hopefully this
can be queued up for a future 6.5-rc.

> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index dc1ec6849775..6a2237bfdcb9 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1863,17 +1863,20 @@ static void detach_device(struct device *dev)
>   	if (WARN_ON(!dev_data->domain))
>   		goto out;
>   
> -	do_detach(dev_data);
> -
> -	if (!dev_is_pci(dev))
> -		goto out;
> +	/* Disable the address translation service (ATS) before detaching the
> +	 * device. do_detach() clears the page table pointer or PASID table
> +	 * entries, so all DMA requests from the device should be blocked first.
> +	 */
> +	if (dev_is_pci(dev)) {
> +		if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
> +			pdev_iommuv2_disable(to_pci_dev(dev));
> +		else if (dev_data->ats.enabled)
> +			pci_disable_ats(to_pci_dev(dev));
>   
> -	if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
> -		pdev_iommuv2_disable(to_pci_dev(dev));
> -	else if (dev_data->ats.enabled)
> -		pci_disable_ats(to_pci_dev(dev));
> +		dev_data->ats.enabled = false;
> +	}
>   
> -	dev_data->ats.enabled = false;
> +	do_detach(dev_data);
>   
>   out:
>   	spin_unlock(&dev_data->lock);
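
For anyone following along, the ordering argument in the commit message can be illustrated with a small userspace model. This is only a rough sketch and not kernel code: every `model_*` name is hypothetical, the "device" is a plain struct, and the single DMA request injected between the two steps stands in for whatever the hardware might issue in that window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool ats_enabled;    /* device may still issue translated requests */
	bool table_valid;    /* page table pointer / PASID entries present */
	int stale_accesses;  /* requests that arrived after the tables were cleared */
};

/* A translated DMA request can only be issued while ATS is enabled. */
static void model_dma_request(struct model_dev *d)
{
	if (!d->ats_enabled)
		return;               /* blocked at the device */
	if (!d->table_valid)
		d->stale_accesses++;  /* tables were already cleared */
}

/* Stand-ins for pci_disable_ats() and do_detach() in this model. */
static void model_disable_ats(struct model_dev *d) { d->ats_enabled = false; }
static void model_do_detach(struct model_dev *d) { d->table_valid = false; }

/* Old order: clear the tables first, then disable ATS. */
int model_detach_old(void)
{
	struct model_dev d = { true, true, 0 };

	model_do_detach(&d);
	model_dma_request(&d);   /* request lands in the window */
	model_disable_ats(&d);
	return d.stale_accesses;
}

/* New order (as in the patch): disable ATS first, then clear the tables. */
int model_detach_new(void)
{
	struct model_dev d = { true, true, 0 };

	model_disable_ats(&d);
	model_dma_request(&d);   /* same request, now blocked */
	model_do_detach(&d);
	return d.stale_accesses;
}

/* Sanity check for the model: only the old order sees a stale access. */
void model_self_check(void)
{
	assert(model_detach_old() == 1);
	assert(model_detach_new() == 0);
}
```

In this model the old order leaves a window in which a request reaches already-cleared tables, while the new order closes it. The real driver also has to deal with locking and the IOMMUv2 path, which the sketch deliberately ignores.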


