[RFC] drm/ttm: dma: Fixes for 32-bit and 64-bit ARM

Alexandre Courbot acourbot at nvidia.com
Sun Dec 7 23:14:47 PST 2014


On 11/12/2014 09:39 PM, Thierry Reding wrote:
> From: Thierry Reding <treding at nvidia.com>
>
> dma_alloc_coherent() returns a kernel virtual address that is part of
> the linear range. Passing such an address to virt_to_page() is illegal
> on non-coherent architectures. This causes the kernel to oops on 64-bit
> ARM because the struct page * obtained from virt_to_page() points to
> unmapped memory.
>
> This commit fixes this by using phys_to_page() since we get a physical
> address from dma_alloc_coherent(). Note that this is not a proper fix
> because if an IOMMU is set up to translate addresses for the GPU this
> address will be an I/O virtual address rather than a physical one. The
> proper fix probably involves not getting a pointer to the struct page
> in the first place, but that would be a much more intrusive change, if
> at all possible.
>
> Until that time, this temporary fix will allow TTM to work on 32-bit
> and 64-bit ARM as well, provided that no IOMMU translations are enabled
> for the GPU.
>
> Signed-off-by: Thierry Reding <treding at nvidia.com>
> ---
> Arnd, I realize that this isn't a proper fix according to what we discussed on
> IRC yesterday, but I can't see a way to remove access to the pages array that
> would be as simple as this. I've marked this as RFC in the hope that it will
> trigger some discussion that will lead to a proper solution.
>
>   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index c96db433f8af..d7993985752c 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -343,7 +343,11 @@ static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
>   					   &d_page->dma,
>   					   pool->gfp_flags);
>   	if (d_page->vaddr)
> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> +		d_page->p = phys_to_page(d_page->dma);
> +#else
>   		d_page->p = virt_to_page(d_page->vaddr);
> +#endif

Since I am messing with the IOMMU I just happened to have hit the issue 
you are mentioning. Wouldn't the following work:

-       if (d_page->vaddr)
-#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
-               d_page->p = phys_to_page(d_page->dma);
-#else
-               d_page->p = virt_to_page(d_page->vaddr);
-#endif
-       else {
+       if (d_page->vaddr) {
+               if (is_vmalloc_addr(d_page->vaddr)) {
+                       d_page->p = vmalloc_to_page(d_page->vaddr);
+               } else {
+                       d_page->p = virt_to_page(d_page->vaddr);
+               }
+       } else {

A remapped page will end up in the vmalloc range of the address space, 
and in this case we can use vmalloc_to_page() to get the right page. 
Pages outside of this range are part of the linear mapping and can be 
resolved using virt_to_page().

Jetson seems to be mostly happy with this, although I sometimes get the 
following trace:

[   13.174763] kernel BUG at ../mm/slab.c:2593!
[   13.174767] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[   13.174790] Modules linked in: nouveau_platform(O+) nouveau(O) 
cfbfillrect cfbimgblt cfbcopyarea ttm
...
[   13.175234] [<c00de238>] (cache_alloc_refill) from [<c00de528>] 
(__kmalloc+0x100/0x13c)
[   13.175247] [<c00de528>] (__kmalloc) from [<c001d564>] 
(arm_iommu_alloc_attrs+0x94/0x3a8)
[   13.175269] [<c001d564>] (arm_iommu_alloc_attrs) from [<bf008f4c>] 
(ttm_dma_populate+0x498/0x76c [ttm])
[   13.175294] [<bf008f4c>] (ttm_dma_populate [ttm]) from [<bf000bb8>] 
(ttm_tt_bind+0x38/0x68 [ttm])
[   13.175315] [<bf000bb8>] (ttm_tt_bind [ttm]) from [<bf00298c>] 
(ttm_bo_handle_move_mem+0x408/0x47c [ttm])
[   13.175337] [<bf00298c>] (ttm_bo_handle_move_mem [ttm]) from 
[<bf003758>] (ttm_bo_validate+0x220/0x22c [ttm])
[   13.175359] [<bf003758>] (ttm_bo_validate [ttm]) from [<bf003984>] 
(ttm_bo_init+0x220/0x338 [ttm])
[   13.175480] [<bf003984>] (ttm_bo_init [ttm]) from [<bf0c70a0>] 
(nouveau_bo_new+0x1c0/0x294 [nouveau])
[   13.175688] [<bf0c70a0>] (nouveau_bo_new [nouveau]) from [<bf0ce88c>] 
(nv84_fence_create+0x1cc/0x240 [nouveau])
[   13.175891] [<bf0ce88c>] (nv84_fence_create [nouveau]) from 
[<bf0cec90>] (nvc0_fence_create+0xc/0x24 [nouveau])
[   13.176094] [<bf0cec90>] (nvc0_fence_create [nouveau]) from 
[<bf0c1480>] (nouveau_accel_init+0xec/0x450 [nouveau])

I suspect this is related to this change, but it might also be the 
side-effect of another bug in my code.


More information about the dri-devel mailing list