[PATCH 1/2] drm/amdkfd: Move queue fs deletion after destroy check
Kim, Jonathan
Jonathan.Kim at amd.com
Wed Sep 11 15:27:17 UTC 2024
[Public]
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Kent Russell
> Sent: Tuesday, September 10, 2024 9:37 AM
> To: amd-gfx at lists.freedesktop.org
> Cc: Russell, Kent <Kent.Russell at amd.com>
> Subject: [PATCH 1/2] drm/amdkfd: Move queue fs deletion after destroy check
>
> We were removing the kernfs entry for queue info before checking whether
> the queue could be destroyed. If the destroy failed (e.g. during some GPU
> resets), we would try to delete the entry again later during pqm
> teardown, but the file had already been removed. This led to a kernel
> WARN when trying to remove size, gpuid and type. Move the removal to
> after the destroy check.
>
> Signed-off-by: Kent Russell <kent.russell at amd.com>
This patch is:
Reviewed-by: Jonathan Kim <jonathan.kim at amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index b439d4d0bd84..01b960b15274 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -517,7 +517,6 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>  		if (retval)
>  			goto err_destroy_queue;
>
> -		kfd_procfs_del_queue(pqn->q);
>  		dqm = pqn->q->device->dqm;
>  		retval = dqm->ops.destroy_queue(dqm, &pdd->qpd, pqn->q);
>  		if (retval) {
> @@ -527,6 +526,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>  			if (retval != -ETIME)
>  				goto err_destroy_queue;
>  		}
> +		kfd_procfs_del_queue(pqn->q);
>  		kfd_queue_release_buffers(pdd, &pqn->q->properties);
>  		pqm_clean_queue_resource(pqm, pqn);
>  		uninit_queue(pqn->q);
> --
> 2.34.1
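
For anyone following the ordering issue, here is a minimal user-space sketch of why the move matters. It is not the driver code: every name below (fake_queue, fake_del_fs_entry, fake_hw_destroy, destroy_old_order, destroy_new_order, fake_teardown) is hypothetical, and the -ETIME special case from the real function is left out. It only models "remove the fs entry, then fail the destroy, then remove again at teardown" against the reordered sequence.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a queue and its fs entry (not the KFD structs). */
struct fake_queue {
	bool fs_entry_present;	/* models the queue's size/gpuid/type entry */
	bool destroyed;		/* set once the modelled destroy succeeds */
	int warn_count;		/* counts removals of an already-missing entry */
};

/* Models the fs removal: removing a missing entry is counted as a WARN. */
static void fake_del_fs_entry(struct fake_queue *q)
{
	assert(q);
	if (!q->fs_entry_present) {
		q->warn_count++;
		return;
	}
	q->fs_entry_present = false;
}

/* Models the destroy step: fails while the device is busy (e.g. in reset). */
static int fake_hw_destroy(struct fake_queue *q, bool device_busy)
{
	if (device_busy)
		return -1;
	q->destroyed = true;
	return 0;
}

/* Ordering before the patch: the fs entry goes away even if destroy fails. */
static int destroy_old_order(struct fake_queue *q, bool device_busy)
{
	fake_del_fs_entry(q);
	return fake_hw_destroy(q, device_busy);
}

/* Ordering after the patch: the fs entry is removed only on success. */
static int destroy_new_order(struct fake_queue *q, bool device_busy)
{
	int ret = fake_hw_destroy(q, device_busy);

	if (ret)
		return ret;
	fake_del_fs_entry(q);
	return 0;
}

/* Models teardown: a queue that was never destroyed has its entry removed. */
static void fake_teardown(struct fake_queue *q)
{
	if (!q->destroyed)
		fake_del_fs_entry(q);
}
```

With a queue whose entry is present and a busy device, destroy_old_order() followed by fake_teardown() leaves warn_count at 1, while destroy_new_order() followed by fake_teardown() leaves it at 0 because the entry is still there for teardown to remove.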