[PATCH 1/8] drm/tegra: Add Tegra DRM allocation API
Mikko Perttunen
cyndis at kapsi.fi
Wed Dec 7 09:23:57 UTC 2016
On 05.12.2016 20:37, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:38PM +0200, Mikko Perttunen wrote:
>> Add a new IO virtual memory allocation API to allow clients to
>> allocate non-GEM memory in the Tegra DRM IOMMU domain. This is
>> required e.g. for loading client firmware when clients are attached
>> to the IOMMU domain.
>>
>> The allocator allocates contiguous physical pages that are then
>> mapped contiguously to the IOMMU domain using the iova_domain
>> library provided by the kernel. Contiguous physical pages are
>> used so that the same allocator works also when IOMMU support
>> is disabled and therefore devices access physical memory directly.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen at nvidia.com>
>> ---
>> drivers/gpu/drm/tegra/drm.c | 98 ++++++++++++++++++++++++++++++++++++++++++---
>> drivers/gpu/drm/tegra/drm.h | 7 ++++
>> 2 files changed, 100 insertions(+), 5 deletions(-)
>
> This looks mostly good, just a few comments below...
>
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index b8be3ee..ecfe7ba 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -1,12 +1,13 @@
>> /*
>> * Copyright (C) 2012 Avionic Design GmbH
>> - * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved.
>> + * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved.
>
> It's almost 2017 now, I think this is out of date. =)
>
>> *
>> * This program is free software; you can redistribute it and/or modify
>> * it under the terms of the GNU General Public License version 2 as
>> * published by the Free Software Foundation.
>> */
>>
>> +#include <linux/bitops.h>
>> #include <linux/host1x.h>
>> #include <linux/iommu.h>
>>
>> @@ -23,6 +24,8 @@
>> #define DRIVER_MINOR 0
>> #define DRIVER_PATCHLEVEL 0
>>
>> +#define IOVA_AREA_SZ (1024 * 1024 * 64) /* 64 MiB */
>
> SZ_64M?
Fixed.
>
>> +
>> struct tegra_drm_file {
>> struct list_head contexts;
>> };
>> @@ -127,7 +130,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>
>> if (iommu_present(&platform_bus_type)) {
>> struct iommu_domain_geometry *geometry;
>> - u64 start, end;
>> + unsigned long order;
>> + u64 iova_start, start, end;
>
> Can we be a little more explicit and keep an additional iova_end to
> simplify the code a little?
Good idea. I replaced this with gem_{start,end} and carveout_{start,end}
and it looks nicer.
>
>> tegra->domain = iommu_domain_alloc(&platform_bus_type);
>> if (!tegra->domain) {
>> @@ -138,10 +142,17 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>> geometry = &tegra->domain->geometry;
>> start = geometry->aperture_start;
>> end = geometry->aperture_end;
>> + iova_start = end - IOVA_AREA_SZ + 1;
>> +
>> + order = __ffs(tegra->domain->pgsize_bitmap);
>> + init_iova_domain(&tegra->iova, 1UL << order,
>> + iova_start >> order,
>> + end >> order);
>
> I hadn't seen this IOVA domain thing and it looks rather handy.
Yes, definitely nicer than rolling by hand. Could be a bit easier still
though :) Too many "<< order" everywhere.
>
>>
>> - DRM_DEBUG_DRIVER("IOMMU aperture initialized (%#llx-%#llx)\n",
>> - start, end);
>> - drm_mm_init(&tegra->mm, start, end - start + 1);
>> + drm_mm_init(&tegra->mm, start, iova_start - start);
>> +
>> + DRM_DEBUG("IOMMU apertures initialized (GEM: %#llx-%#llx, IOVA: %#llx-%#llx)\n",
>> + start, iova_start-1, iova_start, end);
>
> Might be worth splitting this into two, or perhaps even three lines:
>
> IOMMU apertures:
> GEM: %#llx-%#llx
> IOVA: %#llx-%#llx
>
> Maybe also find a better name for IOVA, since GEM will be using I/O
> virtual addresses as well. Perhaps just name it "carveout" or something
> like that?
Done and renamed to carveout.
>
>> }
>>
>> mutex_init(&tegra->clients_lock);
>> @@ -208,6 +219,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>> if (tegra->domain) {
>> iommu_domain_free(tegra->domain);
>> drm_mm_takedown(&tegra->mm);
>> + put_iova_domain(&tegra->iova);
>> }
>> free:
>> kfree(tegra);
>> @@ -232,6 +244,7 @@ static int tegra_drm_unload(struct drm_device *drm)
>> if (tegra->domain) {
>> iommu_domain_free(tegra->domain);
>> drm_mm_takedown(&tegra->mm);
>> + put_iova_domain(&tegra->iova);
>> }
>>
>> kfree(tegra);
>> @@ -975,6 +988,81 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
>> return 0;
>> }
>>
>> +void *tegra_drm_alloc(struct tegra_drm *tegra, size_t size,
>> + dma_addr_t *iova)
>
> Maybe name the iova parameter phys? iova implies that it's always
> translated, whereas according to the code it can be physical, too.
How about something like "dma"? I find the use of "phys" for device
addresses slightly confusing since sometimes they are and sometimes not
and it's hard to say at a glance if the code in question actually
assumes that the addresses are physical or not. "dma" should convey the
meaning of "address used by device" a bit better.
>
>> +{
>> + struct iova *alloc;
>> + unsigned long shift;
>> + void *virt;
>> + gfp_t gfp;
>> + int err;
>> +
>> + if (tegra->domain)
>> + size = iova_align(&tegra->iova, size);
>> + else
>> + size = PAGE_ALIGN(size);
>> +
>> + gfp = GFP_KERNEL | __GFP_ZERO;
>> + if (!tegra->domain) {
>> + /*
>> + * Tegra210 has 64-bit physical addresses but units only support
>> + * 32-bit addresses, so if we don't have an IOMMU to do
>> + * translation, force allocations to be in the 32-bit range.
>> + */
>> + gfp |= GFP_DMA;
>
> Technically we do support Tegra132 and that already has 64-bit physical
> addresses, so perhaps include that in the comment, or make it more
> generic:
>
> /*
> * Many units only support 32-bit addresses, even on 64-bit SoCs.
> * If there is no IOMMU to translate into a 32-bit IO virtual address
> * address space, force allocations to be in the lower 32-bit range.
> */
>
Will fix.
>> + }
>> +
>> + virt = (void *)__get_free_pages(gfp, get_order(size));
>> + if (!virt)
>> + return NULL;
>
> Perhaps we want to return an ERR_PTR()-encoded error code here so we can
> differentiate between out-of-memory and out-of-virtual-address-space
> conditions in the caller? Or perhaps other conditions as well.
Sure.
>
> Also, is __get_free_pages() the right API here? I thought that it always
> returned contiguous memory, which, in the presence of an IOMMU, will be
> completely unnecessary.
That is true. However, this API is intended only for a few small
allocations is done by the kernel, so it shouldn't be that bad. And
sources on the internet say that vmalloc should only be used when
absolutely necessary :)
>
>> +
>> + if (!tegra->domain) {
>> + /*
>> + * If IOMMU is disabled, devices address physical memory
>> + * directly.
>> + */
>> + *iova = virt_to_phys(virt);
>> + return virt;
>> + }
>> +
>> + shift = iova_shift(&tegra->iova);
>> + alloc = alloc_iova(&tegra->iova, size >> shift,
>> + tegra->domain->geometry.aperture_end >> shift, true);
>
> This is fairly cumbersome to use. Maybe a drm_mm would be easier in this
> case? If not, could we at least store the upper limit of the carveout so
> that we don't have to use the really long expression above?
Stored shift and aperture_end >> shift.
>
> Thierry
>
Thanks,
Mikko.
More information about the dri-devel
mailing list