[PATCH v8 14/14] drm/amdgpu: validate userq va for GEM unmap
Christian König
christian.koenig at amd.com
Thu Aug 14 12:11:44 UTC 2025
On 14.08.25 12:35, Liang, Prike wrote:
> [Public]
>
>> From: Alex Deucher <alexdeucher at gmail.com>
>> Sent: Thursday, August 14, 2025 6:38 AM
>> To: Liang, Prike <Prike.Liang at amd.com>
>> Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander
>> <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
>> Subject: Re: [PATCH v8 14/14] drm/amdgpu: validate userq va for GEM unmap
>>
>> The start address may not align with the start address of the vital
>> queue buffers. Just retain a list of vital virtual address ranges
>> for each userq and then check if the address range the user wants to
>> unmap falls into any of those ranges. If so, then preempt and unmap
>> the queue and set the status to USER_UNMAPPED or something like that.
> [Prike] Each userq has various virtual addresses, but this unmap IOCTL request can only unmap
> one kind of VA at a time
That is not even remotely correct. The CLEAR and REPLACE operations can unmap many VAs at a time.
>, so do we need to validate the userq whether all the VAs have been unmapped
> before unmapping the userq HW mapping?
No, we need to unmap the userq HW mapping before we unmap the VA addresses.
>
> Aside from that, do we still need to identify the invalid userq VA unmap case by checking userq fence to
> see whether it is signaled when user is trying to unmap one of its VAs? Without this check, how do we
> identify the userq GEM unmap VA case?
I don't fully understand the question, please clarify.
Regards,
Christian.
>
>> Then you don't have to worry about queue specific details as to what addresses are
>> vital for that queue type.
>>
>> Alex
>>
>> On Fri, Aug 8, 2025 at 2:29 AM Prike Liang <Prike.Liang at amd.com> wrote:
>>>
>>> This change validates the userq to see whether can be unmapped prior
>>> to the userq VA GEM unmap. The solution is based on the following
>>> idea:
>>> 1) Find out the GEM unmap VA belonds to which userq,
>>> 2) Check the userq fence fence whether is signaled,
>>> 3) If the userq attached fences signal failed, then
>>> mark it as illegal VA opt and give a warning message
>>> for this illegal userspace request.
>>>
>>> Suggested-by: Christian König <christian.koenig at amd.com>
>>> Signed-off-by: Prike Liang <Prike.Liang at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 107 +++++++++++++++++++++-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 2 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 10 ++
>>> 3 files changed, 118 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> index 771f57d09060..314d482849c8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
>>> @@ -676,7 +676,6 @@ amdgpu_userq_create(struct drm_file *filp, union
>> drm_amdgpu_userq *args)
>>> }
>>> }
>>>
>>> -
>>> args->out.queue_id = qid;
>>>
>>> unlock:
>>> @@ -1214,3 +1213,109 @@ int
>> amdgpu_userq_start_sched_for_enforce_isolation(struct amdgpu_device *adev,
>>> mutex_unlock(&adev->userq_mutex);
>>> return ret;
>>> }
>>> +
>>> +/**
>>> + * amdgpu_userq_gem_va_unmap_queue_retrieve - find out userq by gem
>>> +unmap va
>>> + * @queue: destinated userq for finding out from unmap va
>>> + * @va: the GEM unmap virtual address already aligned in mapping
>>> +range
>>> + * Find out the corresponding userq by comparing
>>> + * the GEM unmap VA with userq VAs.
>>> + */
>>> +static bool amdgpu_userq_gem_va_unmap_queue_retrieve(struct
>> amdgpu_usermode_queue *queue,
>>> + uint64_t va) {
>>> + va = va << AMDGPU_GPU_PAGE_SHIFT | AMDGPU_GMC_HOLE_END;
>>> +
>>> + switch (queue->queue_type) {
>>> + case AMDGPU_HW_IP_GFX:
>>> + if (queue->queue_va == va ||
>>> + queue->wptr_va == va ||
>>> + queue->rptr_va == va ||
>>> + queue->shadow_va == va ||
>>> + queue->csa_va == va)
>>> + return true;
>>> + break;
>>> + case AMDGPU_HW_IP_COMPUTE:
>>> + if (queue->queue_va == va ||
>>> + queue->wptr_va == va ||
>>> + queue->rptr_va == va ||
>>> + queue->eop_va == va)
>>> + return true;
>>> + break;
>>> + case AMDGPU_HW_IP_DMA:
>>> + if (queue->queue_va == va ||
>>> + queue->wptr_va == va ||
>>> + queue->rptr_va == va ||
>>> + queue->csa_va == va)
>>> + return true;
>>> + break;
>>> + default:
>>> + break;
>>> + }
>>> +
>>> + return false;
>>> +}
>>> +
>>> +
>>> +int amdgpu_userq_gem_va_unmap_validate(struct amdgpu_device *adev,
>>> + uint64_t va) {
>>> + u32 ip_mask = amdgpu_userq_get_supported_ip_mask(adev);
>>> + struct amdgpu_usermode_queue *queue;
>>> + struct amdgpu_userq_mgr *uqm, *tmp;
>>> + int queue_id;
>>> + int ret;
>>> +
>>> + if (!ip_mask)
>>> + return 0;
>>> +
>>> + /**
>>> + * validate the unmap va sequence:
>>> + * 1) Find out the GEM unmap VA belonds to which userq,
>>> + * 2) Check the userq fence whether is signaled,
>>> + * 3) If the userq attached fences signal failed, then
>>> + * mark as invalid va opt and give a warning message
>>> + * for this illegal userspace request.
>>> + */
>>> +
>>> + if (mutex_trylock(&adev->userq_mutex)) {
>>> + list_for_each_entry_safe(uqm, tmp,
>>> + &adev->userq_mgr_list, list) {
>>> +
>>> + if (!mutex_trylock(&uqm->userq_mutex))
>>> + continue;
>>> +
>>> + idr_for_each_entry(&uqm->userq_idr, queue,
>>> + queue_id) {
>>> +
>>> + if (!amdgpu_userq_gem_va_unmap_queue_retrieve(queue,
>> va)) {
>>> + dev_dbg(uqm->adev->dev, "va: 0x%llx not belond to
>> queue id: %d\n",
>>> + va, queue_id);
>>> + continue;
>>> + }
>>> +
>>> + if (queue->last_fence && !dma_fence_is_signaled(queue-
>>> last_fence)) {
>>> + drm_file_err(uqm->file, "an illegal VA unmap for the
>> userq\n");
>>> + queue->state =
>> AMDGPU_USERQ_STATE_INVALID_VA;
>>> + ret = -ETIMEDOUT;
>>> + goto err;
>>> + }
>>> + /*
>>> + * At here still can't suspend the userq since here just one
>> kind of
>>> + * VA unmapped, and some other VAs of userq may still be
>> mapped. After
>>> + * this point this VA mapping will be deteled and the VA will be
>> unmapped
>>> + * so will not resume the userq when its VA unmapped.
>>> + */
>>> + }
>>> + mutex_unlock(&uqm->userq_mutex);
>>> + }
>>> + } else {
>>> + /* Maybe we need a try lock again before return*/
>>> + return -EBUSY;
>>> + }
>>> +
>>> + mutex_unlock(&adev->userq_mutex);
>>> + return 0;
>>> +err:
>>> + mutex_unlock(&uqm->userq_mutex);
>>> + mutex_unlock(&adev->userq_mutex);
>>> + return ret;
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h
>>> index cf35b6140a3d..27ab8a6a7be6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h
>>> @@ -149,4 +149,6 @@ bool amdgpu_userq_buffer_vas_mapped(struct
>>> amdgpu_vm *vm, int amdgpu_userq_buffer_va_put(struct amdgpu_vm *vm,
>>> u64 addr); int amdgpu_userq_buffer_vas_put(struct amdgpu_vm *vm,
>>> struct amdgpu_usermode_queue *queue);
>>> +int amdgpu_userq_gem_va_unmap_validate(struct amdgpu_device *adev,
>>> + uint64_t va);
>>> #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index f042372d9f2e..533954c0d234 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -1929,6 +1929,7 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device
>> *adev,
>>> struct amdgpu_bo_va_mapping *mapping;
>>> struct amdgpu_vm *vm = bo_va->base.vm;
>>> bool valid = true;
>>> + int r;
>>>
>>> saddr /= AMDGPU_GPU_PAGE_SIZE;
>>>
>>> @@ -1949,6 +1950,15 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device
>> *adev,
>>> return -ENOENT;
>>> }
>>>
>>> + /* It's unlikely to happen that the mapping userq hasn't been idled
>>> + * during user requests GEM unmap IOCTL except for forcing the unmap
>>> + * from user space.
>>> + */
>>> +
>>> + r = amdgpu_userq_gem_va_unmap_validate(adev, saddr);
>>> + if (unlikely(r && r != -EBUSY))
>>> + dev_warn(adev->dev, "Here should be an improper unmap
>>> + request from user space\n");
>>> +
>>> list_del(&mapping->list);
>>> amdgpu_vm_it_remove(mapping, &vm->va);
>>> mapping->bo_va = NULL;
>>> --
>>> 2.34.1
>>>
More information about the amd-gfx
mailing list