[PATCH] drm/amdgpu: Fix Illegal opcode in command stream Error

Christian König christian.koenig at amd.com
Thu Dec 19 15:00:13 UTC 2024


Am 19.12.24 um 15:56 schrieb Arvind Yadav:
> When an application closes, it triggers the drm_file_free
> function which subsequently releases all allocated buffer
> objects. Concurrently, the resume_worker thread will attempt
> to map the usermode queue. However, since the wptr buffer
> object has already been deallocated, this will result in
> an Illegal opcode error being raised in the command stream.
> Now the usermode queue will not be mapped if the wptr buffer
> object is freed.

Clear NAK to that approach. Instead, we need to suspend the queues and
prevent them from restarting before freeing any BO.
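
A minimal userspace sketch of that ordering, for illustration only. It is
not amdgpu code: every name here (model_queue, model_mgr, model_resume_all,
model_teardown) is hypothetical, and a plain bool stands in for whatever
locking and worker cancellation a real driver would need. The point is just
the sequence: block restarts, suspend the queues, and only then free the
buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NUM_QUEUES 2

/* Hypothetical stand-ins; none of these are real amdgpu structures. */
struct model_queue {
	bool mapped;	/* queue is currently mapped on the "hardware" */
	int *wptr_bo;	/* stands in for the wptr buffer object */
};

struct model_mgr {
	bool teardown;	/* once set, resume requests are refused */
	struct model_queue queues[MODEL_NUM_QUEUES];
};

/* Allocate one fake wptr buffer per queue. Returns 0 on success. */
static int model_mgr_init(struct model_mgr *mgr)
{
	mgr->teardown = false;
	for (size_t i = 0; i < MODEL_NUM_QUEUES; i++) {
		mgr->queues[i].mapped = false;
		mgr->queues[i].wptr_bo = calloc(1, sizeof(int));
		if (!mgr->queues[i].wptr_bo)
			return -1;
	}
	return 0;
}

/* Resume path: refuses to map anything once teardown has begun. */
static int model_resume_all(struct model_mgr *mgr)
{
	if (mgr->teardown)
		return -1;

	for (size_t i = 0; i < MODEL_NUM_QUEUES; i++)
		mgr->queues[i].mapped = true;
	return 0;
}

/* Teardown path: block restarts, suspend, and only then free buffers. */
static void model_teardown(struct model_mgr *mgr)
{
	/* 1. Prevent any later resume from mapping a queue again. */
	mgr->teardown = true;

	/* 2. Suspend (unmap) every queue while its buffers still exist. */
	for (size_t i = 0; i < MODEL_NUM_QUEUES; i++)
		mgr->queues[i].mapped = false;

	/* 3. Free the buffers; no queue may still be mapped here. */
	for (size_t i = 0; i < MODEL_NUM_QUEUES; i++) {
		assert(!mgr->queues[i].mapped);
		free(mgr->queues[i].wptr_bo);
		mgr->queues[i].wptr_bo = NULL;
	}
}
```

This model is single-threaded, so it does not show how to wait for a resume
worker that is already running when teardown starts; a real driver would
have to handle that as well.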

Regards,
Christian.

>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> Cc: Christian Koenig <christian.koenig at amd.com>
> Signed-off-by: Arvind Yadav <arvind.yadav at amd.com>
> Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 22 +++++++++++++++++--
>   1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> index c11fcdd604fc..378a6284e05b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -28,6 +28,21 @@
>   #include "amdgpu_userqueue.h"
>   #include "amdgpu_userq_fence.h"
>   
> +static bool
> +amdgpu_userq_validate_bo_mapping(struct amdgpu_usermode_queue *queue,
> +				 uint64_t addr)
> +{
> +	struct amdgpu_bo_va_mapping *mapping;
> +	struct amdgpu_vm *vm = queue->vm;
> +
> +	addr &= AMDGPU_GMC_HOLE_MASK;
> +	mapping = amdgpu_vm_bo_lookup_mapping(vm, addr >> PAGE_SHIFT);
> +	if (!mapping)
> +		return false;
> +
> +	return true;
> +}
> +
>   static void amdgpu_userq_walk_and_drop_fence_drv(struct xarray *xa)
>   {
>   	struct amdgpu_userq_fence_driver *fence_drv;
> @@ -390,9 +405,12 @@ amdgpu_userqueue_resume_all(struct amdgpu_userq_mgr *uq_mgr)
>   	userq_funcs = adev->userq_funcs[AMDGPU_HW_IP_GFX];
>   
>   	/* Resume all the queues for this process */
> -	idr_for_each_entry(&uq_mgr->userq_idr, queue, queue_id)
> -		ret = userq_funcs->resume(uq_mgr, queue);
> +	idr_for_each_entry(&uq_mgr->userq_idr, queue, queue_id) {
> +		if (amdgpu_userq_validate_bo_mapping(queue,
> +				queue->userq_prop->wptr_gpu_addr))
> +			ret = userq_funcs->resume(uq_mgr, queue);
>   
> +	}
>   	if (ret)
>   		DRM_ERROR("Failed to resume all the queue\n");
>   	return ret;


