[PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

Felix Kuehling felix.kuehling at amd.com
Fri Apr 28 19:56:26 UTC 2023


On 2023-04-28 14:51, Eric Huang wrote:
>
>
> On 2023-04-28 15:42, Felix Kuehling wrote:
>> On 2023-04-28 14:09, Eric Huang wrote:
>>>
>>> On 2023-04-28 12:41, Felix Kuehling wrote:
>>>> On 2023-04-28 10:17, Eric Huang wrote:
>>>>>
>>>>> On 2023-04-27 23:46, Kuehling, Felix wrote:
>>>>>> [AMD Official Use Only - General]
>>>>>>
>>>>>> Re-mapping typically happens after evictions, before a new 
>>>>>> eviction fence gets attached. At that time the old eviction fence 
>>>>>> should be in the signaled state already, so it can't be signaled 
>>>>>> again. Therefore I would expect my patch to help with unmapping 
>>>>>> the DMABuf import, without breaking the eviction case.
>>>>>>
>>>>>> Are you talking about remapping with a map-to-gpu call from user 
>>>>>> mode? I think that would only be a problem if the KFD BO was 
>>>>>> unmapped and remapped multiple times. The first time it's mapped, 
>>>>>> the fresh dmabuf import should be in the SYSTEM domain, so the 
>>>>>> validation in the SYSTEM domain before GTT would be a no-op.
>>>>> Yes. The case scenario I am talking about is from user mode, 
>>>>> mapping->unmapping->re-mapping to the KFD GTT BO will trigger the 
>>>>> eviction.
>>>>>>
>>>>>> I sort of agree that we don't really rely on the eviction fence 
>>>>>> on the DMABuf import. The reservation object is shared with the 
>>>>>> original BO. Moving the original BO triggers the eviction fence, 
>>>>>> so we don't need to trigger it again on the dmabuf import. Other 
>>>>>> than moving the original BO, I don't think we can do anything to 
>>>>>> the DMABuf import that would require an eviction for KFD use 
>>>>>> case. It is a special use case because we control both the import 
>>>>>> and the export in the same context.
>>>>> I am thinking about no adding KFD eviction fence in first place of 
>>>>> mapping original GTT BO, because I don't see it can be evicted in 
>>>>> any cases.
>>>>
>>>> That's not an option. We're not adding an eviction fence. The 
>>>> reservation object with the eviction fence is shared between the 
>>>> exported BO and the imported one. That's just how DMABuf works. If 
>>>> you wait for the fences on the imported BO, you are effectively 
>>>> waiting for the fences on the exported BOs. And you can't remove 
>>>> the eviction fence from the exported BO.
>>>
>>> What if the exported BO will be never evicted in reality? I 
>>> understand how DMABuf works, and imported BO doesn't have eviction 
>>> fence, it shares with exported BO's one if eviction happens, but I 
>>> don't see the exported BO can be evicted.
>>
>> The exported BO can be evicted like any other BO. For example 
>> KFDEvictTest is there to cause and test evictions of KFD VRAM BOs. 
>> Exporting the BO does not pin it (if DMABUF_MOVE_NOTIFIER is enabled, 
>> which it in the upstream kernel), so the exported BO can still be 
>> evicted.
>
> Yes. KFD VRAM BO can be evicted, but DMABuf 's original exported BO is 
> non-paged/GTT BO. Can GTT BO be evicted? It should be like 
> paged/userptr that doesn't have KFD eviction fence.

GTT BOs could be evicted if we're out of GTT space, though that's very 
unlikely. We usually use GTT BOs for things that need to be mapped in 
kernel mode (e.g. signal pages), which means they are pinned and they 
won't be evicted.

But the same DMABuf attachment mechanism is also used for VRAM, which 
can be evicted. We need a common solution that will prevent unnecessary 
evictions with VRAM mapped on PCIe P2P multi-GPU systems.

Regards,
   Felix


>
> Regards,
> Eric
>
>>
>> Regards,
>>   Felix
>>
>>
>>>
>>> Regards,
>>> Eric
>>>
>>>>
>>>> Regards,
>>>>   Felix
>>>>
>>>>
>>>>> In theory GTT BO is mapped by user calling mmap() in system memory 
>>>>> like userptr, unlike VRAM it will be not evicted by amdgpu vram 
>>>>> manager. The only thing is CPU invalidation, but GTT BO doesn't 
>>>>> register mmu notifier, that will be a potential problem when 
>>>>> switching paged/userptr to non-paged/GTT for mes scheduler.
>>>>>
>>>>> Regards,
>>>>> Eric
>>>>>>
>>>>>> In the general case dmabuf imports need their eviction fences. 
>>>>>> For example when we're importing a DMABuf from somewhere else, so 
>>>>>> the eviction fence is not shared with a BO that we already 
>>>>>> control. Even then, unmapping a dmabuf from our KFD VM does not 
>>>>>> need to wait for any fences on the DMABuf.
>>>>>>
>>>>>> Regards,
>>>>>>    Felix
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Huang, JinHuiEric <JinHuiEric.Huang at amd.com>
>>>>>> Sent: Thursday, April 27, 2023 14:58
>>>>>> To: Kuehling, Felix <Felix.Kuehling at amd.com>; Koenig, Christian 
>>>>>> <Christian.Koenig at amd.com>; Christian König 
>>>>>> <ckoenig.leichtzumerken at gmail.com>; amd-gfx at lists.freedesktop.org
>>>>>> Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences 
>>>>>> invalidating preemptible DMABuf imports
>>>>>>
>>>>>> Hi Felix,
>>>>>>
>>>>>> I tested your patch on mGPU systems. It doesn't break any KFD 
>>>>>> eviction tests, because tests don't allocate DMABuf import, that 
>>>>>> doesn't trigger it's eviction fence. The only thing the patch 
>>>>>> affects is in re-mapping DMABuf imports that the eviction will 
>>>>>> still be triggered.
>>>>>>
>>>>>> I have an idea that we probably can remove eviction fence for GTT 
>>>>>> bo, because currently the only way to trigger the eviction fence 
>>>>>> is by calling ttm_bo_validate for CPU domain in 
>>>>>> kfd_mem_dmaunmap_dmabuf. Do you know there is other case to 
>>>>>> trigger GTT bo's eviction?
>>>>>>
>>>>>> Regards,
>>>>>> Eric
>>>>>>
>>>>>> On 2023-04-26 22:21, Felix Kuehling wrote:
>>>>>>> Hi Eric,
>>>>>>>
>>>>>>> Can you try if the attached patch fixes the problem without 
>>>>>>> breaking
>>>>>>> the eviction tests on a multi-GPU PCIe P2P system?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>    Felix
>>>>>>>
>>>>>>>
>>>>>>> On 2023-04-26 13:02, Christian König wrote:
>>>>>>>> Am 26.04.23 um 18:58 schrieb Felix Kuehling:
>>>>>>>>> On 2023-04-26 9:03, Christian König wrote:
>>>>>>>>>> Am 25.04.23 um 16:11 schrieb Eric Huang:
>>>>>>>>>>> Hi Christian,
>>>>>>>>>>>
>>>>>>>>>>> What do you think about Felix's explanation?
>>>>>>>>>> That's unfortunately not something we can do here.
>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Eric
>>>>>>>>>>>
>>>>>>>>>>> On 2023-04-13 09:28, Felix Kuehling wrote:
>>>>>>>>>>>> Am 2023-04-13 um 07:35 schrieb Christian König:
>>>>>>>>>>>>> Am 13.04.23 um 03:01 schrieb Felix Kuehling:
>>>>>>>>>>>>>> Am 2023-04-12 um 18:25 schrieb Eric Huang:
>>>>>>>>>>>>>>> It is to avoid redundant eviction for KFD's DMAbuf 
>>>>>>>>>>>>>>> import bo
>>>>>>>>>>>>>>> when dmaunmapping DMAbuf. The DMAbuf import bo has been 
>>>>>>>>>>>>>>> set as
>>>>>>>>>>>>>>> AMDGPU_PL_PREEMPT in KFD when mapping.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
>>>>>>>>>>>>>> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd like to get an Acked-by from Christian as well before
>>>>>>>>>>>>>> submitting this.
>>>>>>>>>>>>> I have to admit that I only partially followed the internal
>>>>>>>>>>>>> discussion, but in general you need a *really* good 
>>>>>>>>>>>>> explanation
>>>>>>>>>>>>> for this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> E.g. add code comment and explain in the commit message
>>>>>>>>>>>>> extensively why this is needed and why there are no 
>>>>>>>>>>>>> alternatives.
>>>>>>>>>>>> OK. I'll give it a shot:
>>>>>>>>>>>>
>>>>>>>>>>>>     This code path is used among other things when 
>>>>>>>>>>>> invalidating
>>>>>>>>>>>> DMABuf
>>>>>>>>>>>>     imports. These imports share a reservation object with the
>>>>>>>>>>>> exported
>>>>>>>>>>>>     BO. Waiting on all the fences in this reservation will 
>>>>>>>>>>>> trigger
>>>>>>>>>>>> KFD
>>>>>>>>>>>>     eviction fences unnecessarily, for example when a DMABuf
>>>>>>>>>>>> import for
>>>>>>>>>>>>     a DMA mapping on a secondary GPU is being unmapped 
>>>>>>>>>>>> explicitly.
>>>>>>>>>>>> Only
>>>>>>>>>>>>     moving the original exported BO requires stopping KFD user
>>>>>>>>>>>> mode
>>>>>>>>>>>>     queues. If the invalidation is triggered through a move
>>>>>>>>>>>> notifier
>>>>>>>>>>>>     from the exported BO, then moving the original BO already
>>>>>>>>>>>> triggered
>>>>>>>>>>>>     the eviction fence and we don't need to wait for it 
>>>>>>>>>>>> again on
>>>>>>>>>>>> the import.
>>>>>>>>>>>>
>>>>>>>>>>>>     We can identify DMABuf imports in KFD for secondary GPU 
>>>>>>>>>>>> DMA
>>>>>>>>>>>> mappings
>>>>>>>>>>>>     by the mem_type AMDGPU_PL_PREEMPT. In this case, use a 
>>>>>>>>>>>> wait
>>>>>>>>>>>>     operation that ignores KFD eviction fences.
>>>>>>>>>>>>
>>>>>>>>>>>> How does this sound?
>>>>>>>>>> To be honest like quite a bad idea. Why in the world are 
>>>>>>>>>> imported
>>>>>>>>>> BOs moved from GTT to SYSTEM in the first place?
>>>>>>>>> As I understand it, the way to update SG tables in SG BOs (e.g.
>>>>>>>>> userptr and dmabuf imports) is to move them back and forth 
>>>>>>>>> between
>>>>>>>>> system and GTT domains. If we left the import in the GTT 
>>>>>>>>> domain all
>>>>>>>>> the time, we would have no way to update it, e.g. after an 
>>>>>>>>> eviction.
>>>>>>>>> Currently the move to the system domain is done in the unmap 
>>>>>>>>> code path.
>>>>>>>>>
>>>>>>>>> Before memory is freed, we also need to unmap it from GPUVM,
>>>>>>>>> including the DMABuf imports on remote GPUs. For the above reason
>>>>>>>>> that currently includes moving the import to the system 
>>>>>>>>> domain. If
>>>>>>>>> we removed that from the unmap code path, we'd need to do the 
>>>>>>>>> move
>>>>>>>>> to system somewhere else, maybe in the mapping/validation path.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> The only reason for this I can think of is that the DMA mappings
>>>>>>>>>> become invalid for some reasons and in this case waiting for the
>>>>>>>>>> KFD fence is actually the absolutely right thing to do.
>>>>>>>>> In this case the reason the only reason for unmapping the 
>>>>>>>>> memory is
>>>>>>>>> that we're about to free the memory and its DMABuf imports on 
>>>>>>>>> other
>>>>>>>>> GPUs. This is coming from the application with a promise "I'm no
>>>>>>>>> longer accessing the memory". We don't need to wait for fences 
>>>>>>>>> here.
>>>>>>>>> We only need to invalidate the PTEs to make sure that any further
>>>>>>>>> buggy access by the application will fault.
>>>>>>>> Well in this case just free the BO and it's bo_va structure. 
>>>>>>>> The core
>>>>>>>> handling should take care of clearing all the freed up regions.
>>>>>>>>
>>>>>>>> As for updating the SG of a BO you indeed need to move it from 
>>>>>>>> GTT to
>>>>>>>> SYSTEM and back, but in this case we should either indeed wait for
>>>>>>>> the KFD fence since page tables in between the operation still 
>>>>>>>> have
>>>>>>>> the old entries or we should destroy the BO and create a new 
>>>>>>>> one. The
>>>>>>>> later would overwrite the PTEs with invalid entries first and then
>>>>>>>> fill in new valid ones.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>    Felix
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>    Felix
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>    Felix
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 ++++++-
>>>>>>>>>>>>>>>    1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>>>>>> index 2430f3e9f3a7..64795fe9eecb 100644
>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>>>>>> @@ -526,7 +526,12 @@ static int amdgpu_bo_move(struct
>>>>>>>>>>>>>>> ttm_buffer_object *bo, bool evict,
>>>>>>>>>>>>>>>        if ((old_mem->mem_type == TTM_PL_TT ||
>>>>>>>>>>>>>>>             old_mem->mem_type == AMDGPU_PL_PREEMPT) &&
>>>>>>>>>>>>>>>            new_mem->mem_type == TTM_PL_SYSTEM) {
>>>>>>>>>>>>>>> -        r = ttm_bo_wait_ctx(bo, ctx);
>>>>>>>>>>>>>>> +        if (old_mem->mem_type == AMDGPU_PL_PREEMPT)
>>>>>>>>>>>>>>> +            r = amdgpu_bo_sync_wait(abo,
>>>>>>>>>>>>>>> + AMDGPU_FENCE_OWNER_KFD,
>>>>>>>>>>>>>>> + ctx->interruptible);
>>>>>>>>>>>>>>> +        else
>>>>>>>>>>>>>>> +            r = ttm_bo_wait_ctx(bo, ctx);
>>>>>>>>>>>>>>>            if (r)
>>>>>>>>>>>>>>>                return r;
>>>>>
>>>
>


More information about the amd-gfx mailing list