[PATCH] drm/amdkfd: change kfd process kref count at creation

Felix Kuehling felix.kuehling at amd.com
Wed Oct 9 21:45:56 UTC 2024


On 2024-10-09 17:02, Chen, Xiaogang wrote:
>
> On 10/9/2024 3:38 PM, Felix Kuehling wrote:
>> On 2024-10-09 14:08, Xiaogang.Chen wrote:
>>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>>>
>>> kfd process kref count(process->ref) is initialized to 1 by 
>>> kref_init. After
>>> it is created not need to increaes its kref. Instad add kfd process 
>>> kref at kfd
>>> process mmu notifier allocation since we decrease the ref at 
>>> free_notifier of
>>> mmu_notifier_ops, so pair them.
>>
>> That's not correct. kfd_create_process returns a struct kfd_process * 
>> reference. That gets stored by the caller in filep->private_data. 
>> That requires incrementing the reference count. You can have multiple 
>> references to the same struct kfd_process if user mode opens /dev/kfd 
>> multiple times. The reference is released in kfd_release. Your change 
>> breaks that use case.
>>
> ok, if user mode open and close /dev/kfd multiple times(current Thunk 
> only allows user process open the kfd node once)  the change will 
> break this use case.
>> The reference released in kfd_process_free_notifier is only one per 
>> process and it's the reference created by kref_init.
>
> I think we can increase kref if find_process return true(the user 
> process already created kfd process). If find_process return false do 
> not increase kref since kref_init has been set to 1.
>
> Or change find_process(thread, false) to find_process(thread, true) 
> that will increase kref if it finds kfd process has been created.
>
> The idea is to pair kref update between alloc_notifier and 
> free_notifier of mmu_notifier_ops for same process(mm). That would 
> seem natural.

What's the problem you're trying to solve? Is it just a cosmetic issue? 
The MMU notifier is registered in create_process (not 
kfd_create_process). If you add a kref_get in kfd_process_alloc_notifier 
you need to kfd_unref_process somewhere in create_process. I don't think 
it's worth the trouble and only risks introducing more reference 
counting bugs.

Regards,
   Felix


>
> Regards
>
> Xiaogang
>
>
>>
>> Regards,
>>   Felix
>>
>>
>>>
>>> Signed-off-by: Xiaogang Chen <Xiaogang.Chen at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 8 +++++---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index d07acf1b2f93..7c5471d7d743 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -899,8 +899,6 @@ struct kfd_process *kfd_create_process(struct 
>>> task_struct *thread)
>>> init_waitqueue_head(&process->wait_irq_drain);
>>>       }
>>>   out:
>>> -    if (!IS_ERR(process))
>>> -        kref_get(&process->ref);
>>>       mutex_unlock(&kfd_processes_mutex);
>>>       mmput(thread->mm);
>>>   @@ -1191,7 +1189,11 @@ static struct mmu_notifier 
>>> *kfd_process_alloc_notifier(struct mm_struct *mm)
>>>         srcu_read_unlock(&kfd_processes_srcu, idx);
>>>   -    return p ? &p->mmu_notifier : ERR_PTR(-ESRCH);
>>> +    if (p) {
>>> +        kref_get(&p->ref);
>>> +        return &p->mmu_notifier;
>>> +    }
>>> +    return ERR_PTR(-ESRCH);
>>>   }
>>>     static void kfd_process_free_notifier(struct mmu_notifier *mn)


More information about the amd-gfx mailing list