[PATCH 1/3] drm/amdgpu: Add userptr bo support for mGPUs when iommu is on

Philip Yang yangp at amd.com
Wed Apr 12 13:35:48 UTC 2023


On 2023-04-04 09:45, Felix Kuehling wrote:
> [+Philip]
>
> On 2023-04-04 at 08:47, Christian König wrote:
>
>> On 04.04.23 at 12:56, Xiao, Shane wrote:
>>> [AMD Official Use Only - General]
>>>
>>>> -----Original Message-----
>>>> From: Koenig, Christian <Christian.Koenig at amd.com>
>>>> Sent: Tuesday, April 4, 2023 6:27 PM
>>>> To: Xiao, Shane <shane.xiao at amd.com>; amd-gfx at lists.freedesktop.org;
>>>> Kuehling, Felix <Felix.Kuehling at amd.com>
>>>> Cc: Liu, Aaron <Aaron.Liu at amd.com>; Guo, Shikai <Shikai.Guo at amd.com>
>>>> Subject: Re: [PATCH 1/3] drm/amdgpu: Add userptr bo support for mGPUs
>>>> when iommu is on
>>>>
>>>> On 04.04.23 at 11:56, Shane Xiao wrote:
>>>>> For a userptr BO with the IOMMU enabled, multiple GPUs use the same
>>>>> system memory DMA mapping address when bo_adev and adev are both in
>>>>> identity mode or are in the same IOMMU group.
>>> Hi Christian,
>>>
>>>> WTF? Userptr BOs are not allowed to be exported/imported between
>>>> different GPUs.
>>>>
>>>> So how can the same userptr BO be used on different GPUs?
>>> If the GPUs are all in IOMMU identity mode, which means DMA addresses
>>> are the same as physical addresses, all of the GPUs can see system
>>> memory directly.
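
Side note (not from Shane's mail): "identity mode" here means the device
has no translating IOMMU domain, so DMA addresses equal physical
addresses. amdgpu tracks this per device in adev->ram_is_direct_mapped;
a check along the lines of the sketch below, built on the generic
iommu_get_domain_for_dev() helper, is roughly what such a flag boils
down to. The function name is illustrative only.

#include <linux/device.h>
#include <linux/iommu.h>

/* Illustrative sketch: RAM is "direct mapped" for a device when it has
 * no IOMMU domain at all or an identity/passthrough domain, i.e. the
 * DMA address of a system page equals its physical address.
 */
static bool example_ram_is_direct_mapped(struct device *dev)
{
        struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

        return !domain || domain->type == IOMMU_DOMAIN_IDENTITY;
}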
>>>
>>> In such a case, should we export/import the BO and then create a new
>>> SG BO for the other GPU?
>>
>> Yes, absolutely. Each userptr BO is only meant to be used on one GPU.
>>
>> Even if you could use the same BO for multiple GPUs, it's not
>> necessarily a good idea, since you then have lifetime problems, for
>> example.
>>
>> E.g. what happens when one GPU is hot-removed while the other one,
>> which imported the BO, is still using it?
>>
>> Felix, can you comment on that? My recollection was that we would
>> rather improve the storage of DMA addresses than duplicate the BOs
>> across different GPUs.
>
> For KFD we currently enable sharing of userptr BOs between multiple
> GPUs by creating a userptr BO for the first GPU and creating additional
> SG BOs that use the same page list for the additional GPUs. That way we
> don't have to call hmm_range_fault N times and set up N MMU notifiers
> for the same address range on an N-GPU system. But we do have to create
> N DMA mappings, which is facilitated by the SG BOs.
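
A rough sketch of that mechanism (not the actual kfd_mem_attach/dmamap
code; the helper name and the simplified error handling are mine): the
page list obtained once through hmm_range_fault can be wrapped in a
scatter-gather table and DMA-mapped once per additional GPU.

#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/scatterlist.h>

/* Sketch: create one DMA mapping of an already-faulted page list for
 * one extra GPU.  Calling this once per additional device is what
 * yields the N per-GPU DMA mappings mentioned above.
 */
static int example_map_userptr_pages(struct device *dev, struct page **pages,
                                     unsigned int npages, struct sg_table *sgt,
                                     enum dma_data_direction dir)
{
        int ret;

        ret = sg_alloc_table_from_pages(sgt, pages, npages, 0,
                                        (unsigned long)npages << PAGE_SHIFT,
                                        GFP_KERNEL);
        if (ret)
                return ret;

        ret = dma_map_sgtable(dev, sgt, dir, 0);
        if (ret)
                sg_free_table(sgt);

        return ret;
}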
>
> We have a further optimization to not even store separate DMA 
> addresses per-GPU if they are a direct mapping. In that case we just 
> increase the reference count on the original userptr BO. (I agree that 
> we should also look into more efficient storage of DMA addresses. 
> However, last time we talked about this, you basically told us that 
> scatter gather tables are being deprecated, but I haven't seen the 
> alternative yet.)
>
> I think this patch fixes a potential issue with that optimization.
> There is an implicit assumption that the DMA addresses stored in the
> original userptr BO are a direct mapping, so we can reuse them on other
> GPUs that also use a direct mapping. But we didn't actually check that
> assumption. I think this patch fixes that for systems where system
> memory is direct-mapped on some but not all GPUs.

We will need a similar optimization in svm_range_dma_map to reuse
prange->dma_addr for GPUs that are in the same IOMMU group or use a
direct mapping.
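
The shape of it would be roughly the toy sketch below; the names are
placeholders (not kfd_svm.c code), and a real version would also need
reference counting so a shared address array is freed only once.

#include <linux/device.h>
#include <linux/dma-mapping.h>

struct toy_gpu {
        struct device *dev;
        dma_addr_t *range_dma;  /* per-range DMA addresses for this GPU */
};

/* Toy illustration: let every GPU that passes the reuse check point at
 * the DMA address array already built for gpus[mapped]; only the
 * remaining GPUs need a DMA mapping of their own.
 */
static void toy_share_range_dma(struct toy_gpu *gpus, int ngpus, int mapped,
                                bool (*reuse)(struct device *, struct device *))
{
        int i;

        for (i = 0; i < ngpus; i++) {
                if (i == mapped || !reuse(gpus[i].dev, gpus[mapped].dev))
                        continue;       /* this GPU maps the range itself */
                gpus[i].range_dma = gpus[mapped].range_dma;
        }
}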

Regards,

Philip

>
> This scenario should probably be called out explicitly in the patch 
> description. The condition is also getting pretty hard to read and 
> understand. Maybe move the both-direct-map-or-same-iommu-group 
> conditions into a helper function, say 
> "amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && reuse_dmamap(adev, 
> bo_adev)".
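
To make that concrete, the suggested helper could look something like
the sketch below. The name reuse_dmamap() is Felix's proposal in this
thread and the body is just my reading of the condition in Shane's
patch, so treat it as a sketch rather than merged code.

/* Two devices can share one set of system-memory DMA addresses if both
 * map RAM directly (no translating IOMMU) or sit in the same IOMMU
 * group.
 */
static bool reuse_dmamap(struct amdgpu_device *adev,
                         struct amdgpu_device *bo_adev)
{
        return (adev->ram_is_direct_mapped && bo_adev->ram_is_direct_mapped) ||
               (adev->dev->iommu_group == bo_adev->dev->iommu_group);
}

The condition in kfd_mem_attach() would then shrink to something like:

        if ((adev == bo_adev &&
             !(mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) ||
            (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) &&
             reuse_dmamap(adev, bo_adev)) ||
            same_hive) {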
>
> Regards,
>   Felix
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>>
>>> Best Regards,
>>> Shane
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Signed-off-by: Shane Xiao <shane.xiao at amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8 ++++----
>>>>>    1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>> index e7403f8e4eba..33cda358cb9e 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>> @@ -804,11 +804,11 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>>>>>                          va + bo_size, vm);
>>>>>
>>>>>                 if ((adev == bo_adev && !(mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) ||
>>>>> -                   (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && adev->ram_is_direct_mapped) ||
>>>>> -                   same_hive) {
>>>>> +                   (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && ((adev->ram_is_direct_mapped && bo_adev->ram_is_direct_mapped) ||
>>>>> +                   adev->dev->iommu_group == bo_adev->dev->iommu_group)) ||
>>>>> +                   same_hive) {
>>>>>                         /* Mappings on the local GPU, or VRAM mappings in the
>>>>> -                        * local hive, or userptr mapping IOMMU direct map mode
>>>>> -                        * share the original BO
>>>>> +                        * local hive, or userptr mapping in the same dma
>>>>> +                        * address space share the original BO
>>>>>                          */
>>>>>                         attachment[i]->type = KFD_MEM_ATT_SHARED;
>>>>>                         bo[i] = mem->bo;
>>

