[PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
Robin Murphy
robin.murphy at arm.com
Wed May 30 10:30:25 UTC 2018
On 30/05/18 09:03, Thierry Reding wrote:
> From: Thierry Reding <treding at nvidia.com>
>
> Depending on the kernel configuration, early ARM architecture setup code
> may have attached the GPU to a DMA/IOMMU mapping that transparently uses
> the IOMMU to back the DMA API. Tegra requires special handling for IOMMU
> backed buffers (a special bit in the GPU's MMU page tables indicates the
> memory path to take: via the SMMU or directly to the memory controller).
> Transparently backing DMA memory with an IOMMU prevents Nouveau from
> properly handling such memory accesses and causes memory access faults.
>
> As a side-note: buffers other than those allocated in instance memory
> don't need to be physically contiguous from the GPU's perspective since
> the GPU can map them into contiguous buffers using its own MMU. Mapping
> these buffers through the IOMMU is unnecessary and will even lead to
> performance degradation because of the additional translation. One
> exception to this are compressible buffers which need large pages. In
> order to enable these large pages, multiple small pages will have to be
> combined into one large (I/O virtually contiguous) mapping via the
> IOMMU. However, that is a topic outside the scope of this fix and isn't
> currently supported. An implementation will want to explicitly create
> these large pages in the Nouveau driver, so detaching from a DMA/IOMMU
> mapping would still be required.
I wonder if it might make sense to have a hook in iommu_attach_device()
to notify the arch DMA API code when moving devices between unmanaged
and DMA ops domains? That seems like it might be the most low-impact way
to address the overall problem long-term.
> Signed-off-by: Thierry Reding <treding at nvidia.com>
> ---
> Changes in v3:
> - clarify the use of IOMMU mapping for compressible buffers
> - squash multiple patches into this
>
> drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> index 78597da6313a..d0538af1b967 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> @@ -105,6 +105,11 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
> unsigned long pgsize_bitmap;
> int ret;
>
> +#if IS_ENABLED(CONFIG_ARM)
Wouldn't CONFIG_ARM_DMA_USE_IOMMU be even more appropriate?
> + /* make sure we can use the IOMMU exclusively */
> + arm_dma_iommu_detach_device(dev);
As before, I would just use the existing infrastructure the same way the
Exynos DRM driver currently does in __exynos_iommu_attach() (albeit
without then reattaching to another DMA ops mapping).
Robin.
> +#endif
> +
> if (!tdev->func->iommu_bit)
> return;
>
>
More information about the dri-devel
mailing list