[PATCH v2] drm/amdkfd: change kfd process kref count at creation
Chen, Xiaogang
xiaogang.chen at amd.com
Fri Oct 18 22:02:11 UTC 2024
On 10/18/2024 2:14 PM, Felix Kuehling wrote:
>
> On 2024-10-11 10:41, Xiaogang.Chen wrote:
>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>>
>> The kfd process kref count (process->ref) is initialized to 1 by
>> kref_init. After the process is created there is no need to increase
>> its kref. Instead, take a kfd process kref at kfd process mmu notifier
>> allocation, since we drop that ref in free_notifier of
>> mmu_notifier_ops, so the two are paired.
>>
>> When a user process opens the kfd node multiple times, the kfd process
>> kref is increased each time to balance the kfd node close operation.
>>
>> Signed-off-by: Xiaogang Chen <Xiaogang.Chen at amd.com>
>
> Thanks, this is an elegant solution, IMO. The reference returned by
> kfd_create_process comes either from find_process or create_process.
> And the extra reference that gets released by the free_notifier gets
> allocated by the alloc_notifier. I think there is a race condition,
> though. See inline.
>
>
>> ---
>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 15 ++++++++++-----
>> 1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index d07acf1b2f93..78bf918abf92 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -850,8 +850,10 @@ struct kfd_process *kfd_create_process(struct task_struct *thread)
>> goto out;
>> }
>> - /* A prior open of /dev/kfd could have already created the process. */
>> - process = find_process(thread, false);
>> + /* A prior open of /dev/kfd could have already created the process.
>> + * find_process will increase process kref in this case
>> + */
>> + process = find_process(thread, true);
>> if (process) {
>> pr_debug("Process already found\n");
>> } else {
>> @@ -899,8 +901,6 @@ struct kfd_process *kfd_create_process(struct task_struct *thread)
>> init_waitqueue_head(&process->wait_irq_drain);
>> }
>> out:
>> - if (!IS_ERR(process))
>> - kref_get(&process->ref);
>> mutex_unlock(&kfd_processes_mutex);
>> mmput(thread->mm);
>> @@ -1191,7 +1191,12 @@ static struct mmu_notifier *kfd_process_alloc_notifier(struct mm_struct *mm)
>> srcu_read_unlock(&kfd_processes_srcu, idx);
>> - return p ? &p->mmu_notifier : ERR_PTR(-ESRCH);
>> + if (p) {
>> + kref_get(&p->ref);
>
> This should be inside the SRCU read-side critical section. I think you
> could use kfd_lookup_process_by_mm instead of open-coding the SRCU
> locking and find_process_by_mm. This does the lookup and reference
> counting safely already.
>
OK, understood. I will do it after next week's vacation.
Thanks
Xiaogang
> Regards,
> Felix
>
>> + return &p->mmu_notifier;
>> + }
>> +
>> + return ERR_PTR(-ESRCH);
>> }
>> static void kfd_process_free_notifier(struct mmu_notifier *mn)
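
The review point above (take the reference while the lookup result is still protected, and pair it with the drop in free_notifier) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: every model_* name, the table array and the read_section_depth counter are hypothetical stand-ins for kref, the kfd process table and the SRCU read-side section. model_lookup_by_mm is meant to mirror the shape the review attributes to kfd_lookup_process_by_mm, that is, lookup plus reference get in one protected section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel kref/SRCU APIs. */
struct model_process {
	int refcount;    /* models process->ref */
	const void *mm;  /* models the mm the process is keyed by */
};

static int read_section_depth;          /* models SRCU read-side nesting */
static struct model_process *table[4];  /* models the kfd process table */

static int model_read_lock(void)
{
	return read_section_depth++;
}

static void model_read_unlock(int idx)
{
	(void)idx;
	read_section_depth--;
}

static void model_ref_get(struct model_process *p)
{
	/* The get must happen while the lookup result is still protected. */
	assert(read_section_depth > 0);
	p->refcount++;
}

/* Returns 1 when the last reference was dropped, 0 otherwise. */
static int model_ref_put(struct model_process *p)
{
	return --p->refcount == 0;
}

static struct model_process *model_find_by_mm(const void *mm)
{
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (table[i] && table[i]->mm == mm)
			return table[i];
	return NULL;
}

/* Lookup and reference get inside one read-side section. */
static struct model_process *model_lookup_by_mm(const void *mm)
{
	int idx = model_read_lock();
	struct model_process *p = model_find_by_mm(mm);

	if (p)
		model_ref_get(p);
	model_read_unlock(idx);
	return p;
}

/* Models the alloc_notifier side: takes the extra reference. */
static struct model_process *model_alloc_notifier(const void *mm)
{
	return model_lookup_by_mm(mm);
}

/* Models the free_notifier side: drops the reference taken above. */
static int model_free_notifier(struct model_process *p)
{
	return model_ref_put(p);
}
```

In this model, calling model_ref_get after model_read_unlock would trip the assert, which is the userspace analogue of the race pointed out in the review.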