[PATCH 1/2] drm/ttm: Don't evict SG BOs

Wed Apr 28 16:49:03 UTC 2021

On 2021-04-28 at 12:33 p.m., Christian König wrote:
> On 28.04.21 at 17:19, Felix Kuehling wrote:
>> On 2021-04-28 at 5:05 a.m., Christian König wrote:
>> [SNIP]
>> Hmm, I was missing something. The amdgpu_gtt_mgr doesn't actually
>> allocate space for many BOs:
>>
>>          if (!place->lpfn) {
>>                  mem->mm_node = NULL;
>>                  mem->start = AMDGPU_BO_INVALID_OFFSET;
>>                  return 0;
>>          }
>>
>> I think our userptr BOs don't have mm_nodes and don't use GTT space. So
>> I could add a check for that to amdgpu_ttm_bo_eviction_valuable.
>
> That's for allocating GART space and completely unrelated here.

Ah, I see, the GTT space allocation doesn't use an mm_node, but just the
mgr->available atomic counter.
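
To make that concrete, here is a minimal userspace sketch of counter-only
accounting. It is not the amdgpu_gtt_mgr code; the struct and the
gtt_sketch_* functions are hypothetical names made up for illustration.
The point is only that a reservation succeeds or fails against a page
budget without any address range (mm_node) being assigned.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the manager; only the page budget is modeled. */
struct gtt_mgr_sketch {
	atomic_int_fast64_t available; /* pages not yet accounted to any BO */
};

static void gtt_sketch_init(struct gtt_mgr_sketch *mgr, int64_t pages)
{
	atomic_init(&mgr->available, pages);
}

/* Reserve num_pages from the budget, or fail with -ENOSPC.
 * No address range is handed out, only the counter changes. */
static int gtt_sketch_reserve(struct gtt_mgr_sketch *mgr, int64_t num_pages)
{
	int_fast64_t old = atomic_load(&mgr->available);

	do {
		if (old < num_pages)
			return -ENOSPC;
	} while (!atomic_compare_exchange_weak(&mgr->available, &old,
					       old - num_pages));
	return 0;
}

/* Return num_pages to the budget, e.g. when a BO is evicted or freed. */
static void gtt_sketch_release(struct gtt_mgr_sketch *mgr, int64_t num_pages)
{
	atomic_fetch_add(&mgr->available, num_pages);
}
```

With a budget like this, any BO that is counted but never evicted keeps
its pages out of the pool for everyone else.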

>
> [SNIP]
>>>> Failing that, I'd probably have to abandon userptr BOs altogether and
>>>> switch system memory mappings over to using the new SVM API on systems
>>>> where it is available.
>>> Well as long as that provides the necessary functionality through HMM
>>> it would be an option.
>> Just another way of circumventing "It should limit the amount of system
>> memory the GPU can access at the same time," a premise I disagree with
>> in case of userptrs and HMM. Both use pageable, unpinned memory.
>
>> Both can cause the GPU to be preempted in case of MMU interval
>> notifiers.
>
> Well that's the key point. GFX userptrs and DMA-buf imports can't be
> preempted.

But they don't need to be. They don't use any resources on the importing
GPU or system memory, so why do we limit them?

With dynamic attachment, the exported BOs can be evicted and that
affects the imports as well. I don't see why the import needs to be
evicted as if there was some resource limitation on the importing GPU.

>
> So they basically lock the backing memory until the last submission is
> completed and that is causing problems if it happens for too much
> memory at the same time.
>
> What we could do is to figure out in the valuable callback if the BO
> is preemptible or not.

Then we should also not count them in mgr->available. Otherwise not
evicting these BOs can block other GTT allocations. Again, maybe it's
easier to use a different domain for preemptible BOs.
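
A rough sketch of what I mean, as a standalone model rather than real
TTM code. Everything here (bo_sketch, gtt_domain_sketch, the
preemptible flag and both functions) is hypothetical; how a driver would
actually know that a BO is preemptible is left open. It assumes
preemptible BOs are skipped by the valuable callback, and therefore also
bypass the counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical BO model; not the real struct ttm_buffer_object. */
struct bo_sketch {
	int64_t num_pages;
	bool preemptible; /* assumption: GPU access can be preempted */
};

/* Hypothetical domain with a plain page budget. */
struct gtt_domain_sketch {
	int64_t available;
};

/* Bind a BO. Preemptible BOs are not charged against the budget,
 * so they cannot block later allocations. Returns true on success. */
static bool bo_sketch_bind(struct gtt_domain_sketch *d,
			   const struct bo_sketch *bo)
{
	if (bo->preemptible)
		return true;
	if (d->available < bo->num_pages)
		return false;
	d->available -= bo->num_pages;
	return true;
}

/* Evicting a BO that was never charged frees nothing in the budget,
 * so report it as not valuable and let the LRU walk move on. */
static bool bo_sketch_eviction_valuable(const struct bo_sketch *bo)
{
	return !bo->preemptible;
}
```

If only the valuable check were added without the accounting change, the
skipped BOs would still hold pages in the budget, which is the blocking
case described above.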

Regards,
  Felix

>
> Regards,
> Christian.
>
>> Statically limiting the amount of pageable memory accessible to GTT is
>> redundant and overly limiting.
>>
>> Regards,
>>    Felix
>>
>>
>>> Regards,
>>> Christian.
>>>
>>>> Regards,
>>>>     Felix
>>>>
>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/ttm/ttm_bo.c | 4 ++++
>>>>>>     1 file changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> index de1ec838cf8b..0b953654fdbf 100644
>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> @@ -655,6 +655,10 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>>>>>>             list_for_each_entry(bo, &man->lru[i], lru) {
>>>>>>                 bool busy;
>>>>>>
>>>>>> +            /* Don't evict SG BOs */
>>>>>> +            if (bo->ttm && bo->ttm->sg)
>>>>>> +                continue;
>>>>>> +
>>>>>>                 if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
>>>>>>                                     &busy)) {
>>>>>>                     if (busy && !busy_bo && ticket !=
>