[PATCH 9/9] drm/amdgpu: validate userq activity status for GEM_VA unmap
Christian König
christian.koenig at amd.com
Fri Jun 6 09:39:50 UTC 2025
On 6/6/25 09:51, Liang, Prike wrote:
> [Public]
>
>> From: Alex Deucher <alexdeucher at gmail.com>
>> Sent: Saturday, May 31, 2025 5:54 AM
>> To: Liang, Prike <Prike.Liang at amd.com>
>> Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander
>> <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>;
>> Lazar, Lijo <Lijo.Lazar at amd.com>
>> Subject: Re: [PATCH 9/9] drm/amdgpu: validate userq activity status for GEM_VA
>> unmap
>>
>> On Fri, May 30, 2025 at 4:05 AM Prike Liang <Prike.Liang at amd.com> wrote:
>>>
>>> The userq VA unmap requires validating queue status before unamapping
>>> it, if user tries to unmap an active userq by GEM VA IOCTL then the
>>> driver should report an error for this illegal usage.
>>>
>>> Signed-off-by: Prike Liang <Prike.Liang at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 16 +++++++++++++---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 +++++
>>> 2 files changed, 18 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> index e43a61f64755..e2275280554d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> @@ -234,7 +234,7 @@ amdgpu_userq_map_helper(struct amdgpu_userq_mgr
>> *uq_mgr,
>>> return r;
>>> }
>>>
>>> -static void
>>> +static int
>>> amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
>>> struct amdgpu_usermode_queue *queue)
>>> { @@ -243,10 +243,14 @@ amdgpu_userq_wait_for_last_fence(struct
>>> amdgpu_userq_mgr *uq_mgr,
>>>
>>> if (f && !dma_fence_is_signaled(f)) {
>>> ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100));
>>> - if (ret <= 0)
>>> + if (ret <= 0) {
>>> drm_file_err(uq_mgr->file, "Timed out waiting for
>> fence=%llu:%llu\n",
>>> f->context, f->seqno);
>>> + return -ETIMEDOUT;
>>> + }
>>> }
>>> +
>>> + return 0;
>>> }
>>>
>>> static void
>>> @@ -464,7 +468,13 @@ amdgpu_userq_destroy(struct drm_file *filp, int
>> queue_id)
>>> mutex_unlock(&uq_mgr->userq_mutex);
>>> return -EINVAL;
>>> }
>>> - amdgpu_userq_wait_for_last_fence(uq_mgr, queue);
>>> +
>>> + if (amdgpu_userq_wait_for_last_fence(uq_mgr, queue)) {
>>> + drm_warn(adev_to_drm(uq_mgr->adev), "Don't destroy a busy
>> userq\n");
>>> + mutex_unlock(&uq_mgr->userq_mutex);
>>> + return -EINVAL;
>>> + }
>>> +
>>> r = amdgpu_bo_reserve(queue->db_obj.obj, true);
>>> if (!r) {
>>> amdgpu_bo_unpin(queue->db_obj.obj);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 5e075e8f0ca3..168fc3835aaf 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -1930,6 +1930,11 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device
>> *adev,
>>> struct amdgpu_vm *vm = bo_va->base.vm;
>>> bool valid = true;
>>>
>>> + if (amdgpu_userq_gem_va_unmap_wait(vm)) {
>>> + dev_warn(adev->dev, "shouldn't unmap the VA for an active userq\n");
>>> + return -EINVAL;
>>> + }
>>
>> There's not need to wait for anything. Just return an error if the userq has not been
>> destroyed yet.
> [Prike] The current userq destroy IOCTL request always comes after amdgpu_vm_bo_unmap(),
> the original destroy validation idea is that the userq BOs only be able to be unmapped when they are
> idle, so here do we need to do the BOs busy wait or just test and throw a warning/error message when
> the driver tries unmap a busy BO?
Yeah that discussion came up previously before.
As far as I can see we can't return an error from unmap because the BO might still be used by a queue.
At least on the gfx side such a behavior could easily lead to crashes because userspace doesn't expect that an unmap fails and tries to re-use the address range.
What we can do is deactivate that queue and prevent it from becomming active again.
Regards,
Christian.
>
>> Alex
>>
>>> +
>>> saddr /= AMDGPU_GPU_PAGE_SIZE;
>>>
>>> list_for_each_entry(mapping, &bo_va->valids, list) {
>>> --
>>> 2.34.1
>>>
More information about the amd-gfx
mailing list