[PATCH] drm/amdgpu: Fix Illegal opcode in command stream Error
Christian König
christian.koenig at amd.com
Thu Dec 19 15:00:13 UTC 2024
On 19.12.24 at 15:56, Arvind Yadav wrote:
> When an application closes, it triggers the drm_file_free
> function, which subsequently releases all allocated buffer
> objects. Concurrently, the resume_worker thread attempts
> to map the usermode queue. However, since the wptr buffer
> object has already been deallocated, this results in
> an Illegal opcode error being raised in the command stream.
> With this change, the usermode queue is not mapped if the wptr
> buffer object has been freed.
Clear NAK to that approach. Instead we need to suspend the queues and
prevent them from restarting before freeing any BO.
Regards,
Christian.
>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> Cc: Christian Koenig <christian.koenig at amd.com>
> Signed-off-by: Arvind Yadav <arvind.yadav at amd.com>
> Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 22 +++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> index c11fcdd604fc..378a6284e05b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -28,6 +28,21 @@
> #include "amdgpu_userqueue.h"
> #include "amdgpu_userq_fence.h"
>
> +static bool
> +amdgpu_userq_validate_bo_mapping(struct amdgpu_usermode_queue *queue,
> +				 uint64_t addr)
> +{
> +	struct amdgpu_bo_va_mapping *mapping;
> +	struct amdgpu_vm *vm = queue->vm;
> +
> +	addr &= AMDGPU_GMC_HOLE_MASK;
> +	mapping = amdgpu_vm_bo_lookup_mapping(vm, addr >> PAGE_SHIFT);
> +	if (!mapping)
> +		return false;
> +
> +	return true;
> +}
> +
>  static void amdgpu_userq_walk_and_drop_fence_drv(struct xarray *xa)
>  {
>  	struct amdgpu_userq_fence_driver *fence_drv;
> @@ -390,9 +405,12 @@ amdgpu_userqueue_resume_all(struct amdgpu_userq_mgr *uq_mgr)
>  	userq_funcs = adev->userq_funcs[AMDGPU_HW_IP_GFX];
>
>  	/* Resume all the queues for this process */
> -	idr_for_each_entry(&uq_mgr->userq_idr, queue, queue_id)
> -		ret = userq_funcs->resume(uq_mgr, queue);
> +	idr_for_each_entry(&uq_mgr->userq_idr, queue, queue_id) {
> +		if (amdgpu_userq_validate_bo_mapping(queue,
> +				queue->userq_prop->wptr_gpu_addr))
> +			ret = userq_funcs->resume(uq_mgr, queue);
>
> +	}
>  	if (ret)
>  		DRM_ERROR("Failed to resume all the queue\n");
>  	return ret;
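
Editor's sketch (not part of the thread): the ordering requested in the
reply, i.e. suspend the queues and prevent them from restarting before
freeing any BO, can be illustrated with a small single-threaded
user-space model. Every name below (model_queue, model_mgr,
model_resume_all, model_teardown, the closing flag) is hypothetical and
is not the amdgpu API. It only shows the ordering, under the assumption
that a real driver would additionally need proper locking and would have
to make sure the resume worker is no longer running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the amdgpu structures. */
struct model_queue {
	bool mapped;        /* queue is currently mapped on the "hardware" */
	bool wptr_bo_alive; /* write-pointer buffer object still allocated */
};

struct model_mgr {
	struct model_queue *queues;
	size_t num_queues;
	bool closing;       /* set once teardown starts; blocks any resume */
};

/* Resume path: refuses to map anything once teardown has begun. */
static int model_resume_all(struct model_mgr *mgr)
{
	if (mgr->closing)
		return -1;

	for (size_t i = 0; i < mgr->num_queues; i++) {
		/* Mapping a queue whose wptr BO is gone is the bug under discussion. */
		assert(mgr->queues[i].wptr_bo_alive);
		mgr->queues[i].mapped = true;
	}
	return 0;
}

/* Teardown: 1) block restarts, 2) suspend queues, 3) only then free BOs. */
static void model_teardown(struct model_mgr *mgr)
{
	mgr->closing = true;

	for (size_t i = 0; i < mgr->num_queues; i++)
		mgr->queues[i].mapped = false;

	for (size_t i = 0; i < mgr->num_queues; i++) {
		/* Invariant: no BO is freed while its queue is still mapped. */
		assert(!mgr->queues[i].mapped);
		mgr->queues[i].wptr_bo_alive = false;
	}
}
```

In this model a resume attempt after model_teardown() returns -1 and maps
nothing, so no queue can end up mapped against a freed wptr buffer. The
per-queue mapping check in the patch is not needed once that ordering
holds.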