[PATCH 2/2] drm/amdkfd: Walk thorugh list with dqm lock hold

Chen, Jiansong (Simon) Jiansong.Chen at amd.com
Thu Jun 17 03:05:29 UTC 2021


BTW, there is an obvious typo in the subject: "Walk thorugh" should be "Walk through".

Regards,
Jiansong
-----Original Message-----
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Felix Kuehling
Sent: Thursday, June 17, 2021 7:09 AM
To: Pan, Xinhui <Xinhui.Pan at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
Subject: Re: [PATCH 2/2] drm/amdkfd: Walk thorugh list with dqm lock hold

On 2021-06-16 4:35 a.m., xinhui pan wrote:
> To avoid any list corruption.
>
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> ---
>   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c    | 12 ++++++++----
>   1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index c24ab8f17eb6..1f84de861ec6 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1704,7 +1704,7 @@ static int process_termination_cpsch(struct device_queue_manager *dqm,
>               struct qcm_process_device *qpd)
>   {
>       int retval;
> -     struct queue *q, *next;
> +     struct queue *q;
>       struct kernel_queue *kq, *kq_next;
>       struct mqd_manager *mqd_mgr;
>       struct device_process_node *cur, *next_dpn;
> @@ -1739,8 +1739,6 @@ static int process_termination_cpsch(struct device_queue_manager *dqm,
>                               qpd->mapped_gws_queue = false;
>                       }
>               }
> -
> -             dqm->total_queue_count--;

I think this should stay here. This is only used to check the maximum user queue limit per-device, which is a HW limitation. As far as the HW is concerned, the queues are destroyed after the call to execute_queues_cpsch. So there is no need to delay this update.


>       }
>
>       /* Unregister process */
> @@ -1772,13 +1770,19 @@ static int process_termination_cpsch(struct device_queue_manager *dqm,
>       /* Lastly, free mqd resources.
>        * Do free_mqd() after dqm_unlock to avoid circular locking.
>        */
> -     list_for_each_entry_safe(q, next, &qpd->queues_list, list) {
> +     dqm_lock(dqm);

Instead of taking the dqm lock again, just move this up a couple of lines before the dqm_unlock call.

Regards,
   Felix


> +     while (!list_empty(&qpd->queues_list)) {
> +             q = list_first_entry(&qpd->queues_list, struct queue, list);
>               mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
>                               q->properties.type)];
>               list_del(&q->list);
>               qpd->queue_count--;
> +             dqm->total_queue_count--;
> +             dqm_unlock(dqm);
>               mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
> +             dqm_lock(dqm);
>       }
> +     dqm_unlock(dqm);
>
>       return retval;
>   }
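
For readers following the thread, the loop in the last hunk can be sketched in user-space C roughly as below. This is only an illustration of the "take the first entry under the lock, drop the lock to free it, retake the lock" pattern, not amdkfd code: a pthread mutex stands in for the dqm lock, and struct queue_manager, qm_init, qm_add and qm_drain are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the amdkfd types. */
struct queue {
	struct queue *next;
	void *mqd;		/* resource that must be freed without the lock held */
};

struct queue_manager {
	pthread_mutex_t lock;	/* plays the role of the dqm lock */
	struct queue *head;	/* plays the role of qpd->queues_list */
	int queue_count;
};

static void qm_init(struct queue_manager *qm)
{
	pthread_mutex_init(&qm->lock, NULL);
	qm->head = NULL;
	qm->queue_count = 0;
}

/* Add one queue at the head of the list. Returns 0 on success, -1 on OOM. */
static int qm_add(struct queue_manager *qm)
{
	struct queue *q = malloc(sizeof(*q));

	if (!q)
		return -1;
	q->mqd = malloc(64);
	if (!q->mqd) {
		free(q);
		return -1;
	}

	pthread_mutex_lock(&qm->lock);
	q->next = qm->head;
	qm->head = q;
	qm->queue_count++;
	pthread_mutex_unlock(&qm->lock);
	return 0;
}

/*
 * Drain the list. Each entry is unlinked and accounted for while the lock
 * is held; the lock is dropped only around the free. The head is re-read
 * after the lock is retaken, so no pointer saved before the unlock is
 * trusted afterwards. Returns the number of entries freed.
 */
static int qm_drain(struct queue_manager *qm)
{
	int freed = 0;

	pthread_mutex_lock(&qm->lock);
	while (qm->head) {
		struct queue *q = qm->head;

		qm->head = q->next;		/* unlink under the lock */
		assert(qm->queue_count > 0);
		qm->queue_count--;

		pthread_mutex_unlock(&qm->lock);
		free(q->mqd);			/* free with the lock dropped */
		free(q);
		freed++;
		pthread_mutex_lock(&qm->lock);
	}
	pthread_mutex_unlock(&qm->lock);
	return freed;
}
```

The difference from an unlocked list_for_each_entry_safe() walk is that the saved "next" pointer of that iterator can go stale if another thread changes the list in the meantime, whereas here the first entry is looked up again each time the lock is held.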