[RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

Friedrich Vock friedrich.vock at gmx.de
Thu Apr 25 07:39:30 UTC 2024


On 25.04.24 09:15, Christian König wrote:
> Am 25.04.24 um 09:06 schrieb Friedrich Vock:
>> On 25.04.24 08:58, Christian König wrote:
>>> Am 25.04.24 um 08:46 schrieb Friedrich Vock:
>>>> On 25.04.24 08:32, Christian König wrote:
>>>>> Am 24.04.24 um 18:57 schrieb Friedrich Vock:
>>>>>> Used by userspace to adjust buffer priorities in response to
>>>>>> changes in
>>>>>> application demand and memory pressure.
>>>>>
>>>>> Yeah, that was discussed over and over again. One big design criterion
>>>>> is that we can't have global priorities from userspace!
>>>>>
>>>>> The background here is that this can trivially be abused.
>>>>>
>>>> I see your point when apps are allowed to prioritize themselves above
>>>> other apps, and I agree that should probably be disallowed at least
>>>> for
>>>> unprivileged apps.
>>>>
>>>> Disallowing this is a pretty trivial change though, and I don't really
>>>> see the abuse potential in being able to downgrade your own priority?
>>>
>>> Yeah, I know what you mean and I'm also leaning towards that
>>> argumentation. But another good point is also that it doesn't actually
>>> help.
>>>
>>> For example when you have desktop apps fighting with a game, you
>>> probably don't want to use static priorities, but rather evict the
>>> apps which are inactive and keep the apps which are active in the
>>> background.
>>>
>> Sadly things are not as simple as "evict everything from app 1, keep
>> everything from app 2 active". The simplest failure case of this is
>> games that already oversubscribe VRAM on their own. Keeping the whole
>> app inside VRAM is literally impossible there, and it helps a lot to
>> know which buffers the app is most happy with evicting.
>>> In other words the priority just tells you which stuff from each app
>>> to evict first, but not which app to globally throw out.
>>>
>> Yeah, but a per-buffer priority system could do both of these.
>
> Yeah, but we already have that. See amdgpu_bo_list_entry_cmp() and
> amdgpu_bo_list_create().
>
> This is the per application priority which can be used by userspace to
> give priority to each BO in a submission (or application wide).
>
> The problem is rather that amdgpu/TTM never really made good use of
> that information.
>
I think it's nigh impossible to make good use of priority information if
you wrap it in the BO list, which you only learn about at submit time. For
example, you don't know when priorities change unless you duplicate in the
kernel all the tracking work that the application already has to do. You
also have no way of knowing that a priority changed until right when the
app wants to submit work using that BO, and starting to move BOs around
at that point is bad for submission latency. That's why I didn't go
forward with tracking priorities on a BO-list basis.

Also, the priorities being local to a submission is actually not that
great when talking about lowering priorities. Consider a case where an
app's working set fits into VRAM completely, but combined with the
working sets of other apps running in parallel, VRAM is oversubscribed.
The app recognizes this and asks the kernel to evict one of its
rarely-used buffers, by setting that buffer's priority to the lowest
possible value, to make space for the other applications.
Without global priorities, the kernel can't honor that request, even
though it would resolve the oversubscription with minimal performance
impact. Even with per-app priorities, the kernel isn't likely to evict
buffers from the requesting application unless all the other
applications have a higher priority.

Regards,
Friedrich

> Regards,
> Christian.
>
>>
>> Regards,
>> Friedrich
>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Regards,
>>>> Friedrich
>>>>
>>>>> What we can do is to have per process priorities, but that needs
>>>>> to be
>>>>> in the VM subsystem.
>>>>>
>>>>> That's also the reason why I personally think that the handling
>>>>> shouldn't be inside TTM at all.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>> Signed-off-by: Friedrich Vock <friedrich.vock at gmx.de>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 20 ++++++++++++++++++++
>>>>>>   include/uapi/drm/amdgpu_drm.h           |  1 +
>>>>>>   2 files changed, 21 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> index 5ca13e2e50f50..6107810a9c205 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> @@ -836,8 +836,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
>>>>>> void *data,
>>>>>>   {
>>>>>>       struct amdgpu_device *adev = drm_to_adev(dev);
>>>>>>       struct drm_amdgpu_gem_op *args = data;
>>>>>> +    struct ttm_resource_manager *man;
>>>>>>       struct drm_gem_object *gobj;
>>>>>>       struct amdgpu_vm_bo_base *base;
>>>>>> +    struct ttm_operation_ctx ctx;
>>>>>>       struct amdgpu_bo *robj;
>>>>>>       int r;
>>>>>>
>>>>>> @@ -851,6 +853,9 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
>>>>>> void *data,
>>>>>>       if (unlikely(r))
>>>>>>           goto out;
>>>>>>
>>>>>> +    memset(&ctx, 0, sizeof(ctx));
>>>>>> +    ctx.interruptible = true;
>>>>>> +
>>>>>>       switch (args->op) {
>>>>>>       case AMDGPU_GEM_OP_GET_GEM_CREATE_INFO: {
>>>>>>           struct drm_amdgpu_gem_create_in info;
>>>>>> @@ -898,6 +903,21 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
>>>>>> void *data,
>>>>>>
>>>>>>           amdgpu_bo_unreserve(robj);
>>>>>>           break;
>>>>>> +    case AMDGPU_GEM_OP_SET_PRIORITY:
>>>>>> +        if (args->value > AMDGPU_BO_PRIORITY_MAX_USER)
>>>>>> +            args->value = AMDGPU_BO_PRIORITY_MAX_USER;
>>>>>> +        ttm_bo_update_priority(&robj->tbo, args->value);
>>>>>> +        if (robj->tbo.evicted_type != TTM_NUM_MEM_TYPES) {
>>>>>> +            ttm_bo_try_unevict(&robj->tbo, &ctx);
>>>>>> +            amdgpu_bo_unreserve(robj);
>>>>>> +        } else {
>>>>>> +            amdgpu_bo_unreserve(robj);
>>>>>> +            man = ttm_manager_type(robj->tbo.bdev,
>>>>>> +                robj->tbo.resource->mem_type);
>>>>>> +            ttm_mem_unevict_evicted(robj->tbo.bdev, man,
>>>>>> +                        true);
>>>>>> +        }
>>>>>> +        break;
>>>>>>       default:
>>>>>>           amdgpu_bo_unreserve(robj);
>>>>>>           r = -EINVAL;
>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h
>>>>>> b/include/uapi/drm/amdgpu_drm.h
>>>>>> index bdbe6b262a78d..53552dd489b9b 100644
>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>>> @@ -531,6 +531,7 @@ union drm_amdgpu_wait_fences {
>>>>>>
>>>>>>   #define AMDGPU_GEM_OP_GET_GEM_CREATE_INFO    0
>>>>>>   #define AMDGPU_GEM_OP_SET_PLACEMENT        1
>>>>>> +#define AMDGPU_GEM_OP_SET_PRIORITY              2
>>>>>>
>>>>>>   /* Sets or returns a value associated with a buffer. */
>>>>>>   struct drm_amdgpu_gem_op {
>>>>>> --
>>>>>> 2.44.0
>>>>>>
>>>>>
>>>
>

