iommu/amd: fix the address translation issue when do detach
Mario Limonciello
mario.limonciello at amd.com
Fri Jul 28 01:46:29 UTC 2023
On 7/27/23 04:55, Jesse Zhang wrote:
> From: Jesse Zhang <jesse.zhang at amd.com>
>
> The iGPU driver fails to read/write registers through the IOMMU when starting X.
> kernel: [ 433.296634] audit: type=1400 audit(1690403823.130:64): apparmor="DENIED" operation="capable" class="cap"
> profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=12344 comm="snap-confine" capability=38 capname="perfmon"
> kernel: [ 433.515795] amdgpu 0000:03:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
> kernel: [ 440.195492] amdgpu 0000:03:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
> kernel: [ 453.679611] amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
> kernel: [ 460.383490] amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2659
>
> Disable the address translation service before detaching the device.
> do_detach() clears the page table pointer or PASID table entries,
> so all DMA requests from the device should be blocked before that.
>
> Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
> ---
> drivers/iommu/amd/iommu.c | 21 ++++++++++++---------
> 1 file changed, 12 insertions(+), 9 deletions(-)
The reporter came back and indicated this worked, so here are some tags
for it.
Fixes: 8dc1db3172ae ("drm/amdkfd: Introduce kfd_node struct (v5)")
Tested-by: Mike Lothian <mike at fireburn.co.uk>
The commit that introduced the problem is in 6.5-rc1, so hopefully this
can be queued up for a future 6.5-rc.
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index dc1ec6849775..6a2237bfdcb9 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1863,17 +1863,20 @@ static void detach_device(struct device *dev)
>  	if (WARN_ON(!dev_data->domain))
>  		goto out;
>
> -	do_detach(dev_data);
> -
> -	if (!dev_is_pci(dev))
> -		goto out;
> +	/* Disable address translation service, before detach device.
> +	 * Do detach will clear the page table point or pasid table entries,
> +	 * so all DMA requests from the device should be blocked before that.
> +	 */
> +	if (dev_is_pci(dev)) {
> +		if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
> +			pdev_iommuv2_disable(to_pci_dev(dev));
> +		else if (dev_data->ats.enabled)
> +			pci_disable_ats(to_pci_dev(dev));
>
> -	if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
> -		pdev_iommuv2_disable(to_pci_dev(dev));
> -	else if (dev_data->ats.enabled)
> -		pci_disable_ats(to_pci_dev(dev));
> +		dev_data->ats.enabled = false;
> +	}
>
> -	dev_data->ats.enabled = false;
> +	do_detach(dev_data);
>
>  out:
>  	spin_unlock(&dev_data->lock);
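
For anyone following along, the ordering argument in the commit message can be sketched as a small userspace toy model. This is only an illustration under stated assumptions: struct toy_dev and every toy_* helper below are hypothetical names made up for this sketch, not kernel APIs, and the "fault" counter merely stands in for a translated request that arrives after the page table or PASID entries have been cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and helpers, not kernel code. */
struct toy_dev {
	bool ats_enabled;	/* device may still issue translated requests */
	bool table_valid;	/* page table / PASID entries still present */
	int faults;		/* requests that hit an already-cleared table */
};

/* Model one DMA request from a device that may still have ATS on. */
static void toy_dma(struct toy_dev *d)
{
	assert(d != NULL);
	if (d->ats_enabled && !d->table_valid)
		d->faults++;
}

/* Stand-in for disabling ATS on the device. */
static void toy_disable_ats(struct toy_dev *d)
{
	d->ats_enabled = false;
}

/* Stand-in for the detach step that clears the translation tables. */
static void toy_do_detach(struct toy_dev *d)
{
	d->table_valid = false;
}

/* Order before the patch: clear the tables first, then disable ATS. */
static int detach_old_order(void)
{
	struct toy_dev d = { .ats_enabled = true, .table_valid = true };

	toy_do_detach(&d);
	toy_dma(&d);		/* request lands in the unprotected window */
	toy_disable_ats(&d);
	toy_dma(&d);
	return d.faults;
}

/* Order after the patch: disable ATS first, then clear the tables. */
static int detach_new_order(void)
{
	struct toy_dev d = { .ats_enabled = true, .table_valid = true };

	toy_disable_ats(&d);
	toy_dma(&d);		/* tables are still valid here */
	toy_do_detach(&d);
	toy_dma(&d);		/* ATS is already off, nothing to fault */
	return d.faults;
}
```

In this model detach_old_order() counts one faulting request and detach_new_order() counts none, which matches the reasoning given for moving do_detach() to the end. It says nothing about real hardware timing.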