[PATCH 1/3] drm/amdgpu: Add userptr bo support for mGPUs when iommu is on

Xiao, Shane shane.xiao at amd.com
Tue Apr 4 14:43:03 UTC 2023



> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling at amd.com>
> Sent: Tuesday, April 4, 2023 9:45 PM
> To: Christian König <ckoenig.leichtzumerken at gmail.com>; Xiao, Shane
> <shane.xiao at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>;
> amd-gfx at lists.freedesktop.org; Yang, Philip <Philip.Yang at amd.com>
> Cc: Liu, Aaron <Aaron.Liu at amd.com>; Guo, Shikai <Shikai.Guo at amd.com>
> Subject: Re: [PATCH 1/3] drm/amdgpu: Add userptr bo support for mGPUs
> when iommu is on
> 
> [+Philip]
> 
> On 2023-04-04 08:47, Christian König wrote:
> 
> > On 04.04.23 12:56, Xiao, Shane wrote:
> >>
> >>> -----Original Message-----
> >>> From: Koenig, Christian <Christian.Koenig at amd.com>
> >>> Sent: Tuesday, April 4, 2023 6:27 PM
> >>> To: Xiao, Shane <shane.xiao at amd.com>; amd-gfx at lists.freedesktop.org;
> >>> Kuehling, Felix <Felix.Kuehling at amd.com>
> >>> Cc: Liu, Aaron <Aaron.Liu at amd.com>; Guo, Shikai <Shikai.Guo at amd.com>
> >>> Subject: Re: [PATCH 1/3] drm/amdgpu: Add userptr bo support for
> >>> mGPUs when iommu is on
> >>>
> >>> On 04.04.23 11:56, Shane Xiao wrote:
> >>>> For userptr BOs with IOMMU enabled, multiple GPUs use the same system
> >>>> memory DMA mapping address when bo_adev and adev are both in identity
> >>>> mode or in the same IOMMU group.
> >> Hi Christian,
> >>
> >>> WTF? Userptr BOs are not allowed to be exported/imported between
> >>> different GPUs.
> >>>
> >>> So how can the same userptr BO be used on different GPUs?
> >> If the GPUs are all in IOMMU identity mode, which means the DMA
> >> address is the same as the physical address, all of the GPUs can see
> >> system memory directly.
> >>
> >> In that case, should we export/import the BO and then create a new
> >> SG BO for the other GPU?
> >
> > Yes, absolutely. Each userptr BO is only meant to be used on one GPU.
> >
> > Even if you could use the same BO for multiple GPUs, it's not
> > necessarily a good idea, since you then have lifetime problems, for
> > example.
> >
> > E.g. what happens when one GPU is hot-removed while the other one,
> > which imported the BO, is still using it?
> >
> > Felix, can you comment on that? My recollection was that we would
> > rather improve the storage of DMA addresses instead of duplicating
> > the BOs over different GPUs.
> 
> For KFD we currently enable sharing of userptr BOs between multiple GPUs by
> creating a userptr BO for the first GPU, and creating additional SG BOs using
> the same page list for additional GPUs. That way we don't have to call
> hmm_range_fault N times and set up N MMU notifiers for the same address
> range on an N-GPU system. But we do have to create N DMA mappings, which
> is facilitated by the SG BOs.
> 
> We have a further optimization to not even store separate DMA addresses per
> GPU if they are a direct mapping. In that case we just increase the reference
> count on the original userptr BO. (I agree that we should also look into more
> efficient storage of DMA addresses. However, last time we talked about this,
> you basically told us that scatter-gather tables are being deprecated, but I
> haven't seen the alternative yet.)
> 
> I think this patch fixes a potential issue with that optimization. There is an
> implicit assumption that the DMA addresses stored in the original userptr BO
> are a direct mapping, so we can reuse them on other GPUs that also use a
> direct mapping. But we didn't actually check that assumption. I think this
> patch fixes that for systems where system memory is direct mapped on some but
> not all GPUs.
> 
> This scenario should probably be called out explicitly in the patch description.

Yes, I will call this scenario out explicitly in the patch description.
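
To spell the missed case out (a hypothetical two-GPU system; the names
just mirror the variables in kfd_mem_attach()):

/*
 * bo_adev: the GPU the userptr BO was originally created for, sitting
 *          behind a translating IOMMU, so the DMA addresses stored in
 *          the BO are IOVAs valid only in bo_adev's IOMMU domain.
 * adev:    the importing GPU, with system memory direct mapped
 *          (DMA address == physical address).
 *
 * The old condition only checked adev->ram_is_direct_mapped, so adev
 * would share bo_adev's BO and treat its IOVAs as physical addresses.
 * The new condition shares the original BO only when both devices are
 * direct mapped, or when they are in the same iommu_group and thus in
 * the same DMA address space.
 */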

> The condition is also getting pretty hard to read and understand. Maybe move
> the both-direct-map-or-same-iommu-group
> conditions into a helper function, say
> "amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && reuse_dmamap(adev,
> bo_adev)".

It's a good suggestion. I will make the change.
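
Something like this, perhaps (an untested sketch against
amdgpu_amdkfd_gpuvm.c; the helper name and call shape follow your
suggestion and may still change):

/* Two devices can reuse the same userptr DMA mapping if they resolve
 * user pages to the same DMA addresses: either both direct-map system
 * memory (DMA address == physical address), or they share an IOMMU
 * group and therefore a DMA address space.
 */
static bool reuse_dmamap(struct amdgpu_device *adev,
			 struct amdgpu_device *bo_adev)
{
	return (adev->ram_is_direct_mapped && bo_adev->ram_is_direct_mapped) ||
	       (adev->dev->iommu_group == bo_adev->dev->iommu_group);
}

The condition in kfd_mem_attach() would then read:

	if ((adev == bo_adev &&
	     !(mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) ||
	    (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) &&
	     reuse_dmamap(adev, bo_adev)) ||
	    same_hive) {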

Best Regards,
Shane

> 
> Regards,
>    Felix
> 
> 
> >
> > Regards,
> > Christian.
> >
> >>
> >>
> >> Best Regards,
> >> Shane
> >>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> Signed-off-by: Shane Xiao <shane.xiao at amd.com>
> >>>> ---
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8 ++++----
> >>>>    1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>>> index e7403f8e4eba..33cda358cb9e 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>>> @@ -804,11 +804,11 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
> >>>>             va + bo_size, vm);
> >>>>
> >>>>        if ((adev == bo_adev && !(mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) ||
> >>>> -          (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && adev->ram_is_direct_mapped) ||
> >>>> -          same_hive) {
> >>>> +          (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && ((adev->ram_is_direct_mapped && bo_adev->ram_is_direct_mapped) ||
> >>>> +          adev->dev->iommu_group == bo_adev->dev->iommu_group)) ||
> >>>> +          same_hive){
> >>>>            /* Mappings on the local GPU, or VRAM mappings in the
> >>>> -           * local hive, or userptr mapping IOMMU direct map mode
> >>>> -           * share the original BO
> >>>> +           * local hive, or userptr mapping in the same dma
> >>>> +           * address space share the original BO
> >>>>             */
> >>>>            attachment[i]->type = KFD_MEM_ATT_SHARED;
> >>>>            bo[i] = mem->bo;
> >

