[PATCH v2] drm/amdkfd: change kfd process kref count at creation

Chen, Xiaogang xiaogang.chen at amd.com
Fri Oct 18 22:02:11 UTC 2024


On 10/18/2024 2:14 PM, Felix Kuehling wrote:
>
> On 2024-10-11 10:41, Xiaogang.Chen wrote:
>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>>
>> The kfd process kref count (process->ref) is initialized to 1 by
>> kref_init. After the process is created there is no need to increase
>> its kref. Instead, take a kfd process kref at kfd process mmu notifier
>> allocation, since we drop that ref in free_notifier of
>> mmu_notifier_ops, so the two are paired.
>>
>> When a user process opens the kfd node multiple times, the kfd process
>> kref is increased each time to balance the kfd node close operation.
>>
>> Signed-off-by: Xiaogang Chen <Xiaogang.Chen at amd.com>
>
> Thanks, this is an elegant solution, IMO. The reference returned by 
> kfd_create_process comes either from find_process or create_process. 
> And the extra reference that gets released by the free_notifier gets 
> allocated by the alloc_notifier. I think there is a race condition, 
> though. See inline.
>
>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 15 ++++++++++-----
>>   1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index d07acf1b2f93..78bf918abf92 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -850,8 +850,10 @@ struct kfd_process *kfd_create_process(struct task_struct *thread)
>>           goto out;
>>       }
>>
>> -    /* A prior open of /dev/kfd could have already created the process. */
>> -    process = find_process(thread, false);
>> +    /* A prior open of /dev/kfd could have already created the process.
>> +     * find_process will increase process kref in this case
>> +     */
>> +    process = find_process(thread, true);
>>       if (process) {
>>           pr_debug("Process already found\n");
>>       } else {
>> @@ -899,8 +901,6 @@ struct kfd_process *kfd_create_process(struct task_struct *thread)
>>           init_waitqueue_head(&process->wait_irq_drain);
>>       }
>>   out:
>> -    if (!IS_ERR(process))
>> -        kref_get(&process->ref);
>>       mutex_unlock(&kfd_processes_mutex);
>>       mmput(thread->mm);
>>
>> @@ -1191,7 +1191,12 @@ static struct mmu_notifier *kfd_process_alloc_notifier(struct mm_struct *mm)
>>
>>       srcu_read_unlock(&kfd_processes_srcu, idx);
>>
>> -    return p ? &p->mmu_notifier : ERR_PTR(-ESRCH);
>> +    if (p) {
>> +        kref_get(&p->ref);
>
> This should be inside the srcu. I think you could use 
> kfd_lookup_process_by_mm instead of open-coding the SRCU locking and 
> find_process_by_mm. This does the lookup and reference counting safely 
> already.
>
OK, understood. I will do it after my vacation next week.

Thanks

Xiaogang

> Regards,
>   Felix
>
>> +        return &p->mmu_notifier;
>> +    }
>> +
>> +    return ERR_PTR(-ESRCH);
>>   }
>>
>>   static void kfd_process_free_notifier(struct mmu_notifier *mn)
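
For readers following the review comment above, here is a minimal
userspace sketch of the pattern Felix describes: the lookup and the
reference grab happen inside the same critical section, so the object
cannot be freed between being found and being referenced. This is not
the amdkfd code. The names model_process, model_create,
model_lookup_get and model_put are hypothetical, and a pthread mutex
stands in for the SRCU read-side section purely for illustration (SRCU
has different semantics).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kfd_process; only the refcount is modeled. */
struct model_process {
	int refcount;	/* plays the role of process->ref */
};

/* Assumption: a mutex models the lookup protection; the driver uses SRCU. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_process *table_entry;	/* one-slot "process table" */

/* Like kref_init: the creator holds the first reference. */
static void model_create(struct model_process *p)
{
	pthread_mutex_lock(&table_lock);
	p->refcount = 1;
	table_entry = p;
	pthread_mutex_unlock(&table_lock);
}

/* Look up and take a reference without leaving the critical section. */
static struct model_process *model_lookup_get(void)
{
	struct model_process *p;

	pthread_mutex_lock(&table_lock);
	p = table_entry;
	if (p)
		p->refcount++;	/* reference taken while the lookup is still protected */
	pthread_mutex_unlock(&table_lock);
	return p;
}

/* Drop one reference; returns 1 when the last reference is gone. */
static int model_put(struct model_process *p)
{
	int last;

	pthread_mutex_lock(&table_lock);
	assert(p->refcount > 0);
	last = (--p->refcount == 0);
	if (last && table_entry == p)
		table_entry = NULL;	/* unpublish so later lookups fail */
	pthread_mutex_unlock(&table_lock);
	return last;
}
```

In this model every successful model_lookup_get must be paired with
exactly one model_put, which mirrors the alloc_notifier/free_notifier
pairing the patch is aiming for.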

