[PATCH] drm/amdkfd: change kfd process kref count at creation

Felix Kuehling felix.kuehling at amd.com
Fri Oct 11 15:13:10 UTC 2024


On 2024-10-11 10:53, Chen, Xiaogang wrote:
> 
> On 10/10/2024 3:37 PM, Felix Kuehling wrote:
>> On 2024-10-10 16:19, Chen, Xiaogang wrote:
>>>
>>> On 10/10/2024 2:01 PM, Felix Kuehling wrote:
>>>>
>>>> On 2024-10-09 18:16, Chen, Xiaogang wrote:
>>>>>
>>>>> On 10/9/2024 4:45 PM, Felix Kuehling wrote:
>>>>>>
>>>>>> On 2024-10-09 17:02, Chen, Xiaogang wrote:
>>>>>>>
>>>>>>> On 10/9/2024 3:38 PM, Felix Kuehling wrote:
>>>>>>>> On 2024-10-09 14:08, Xiaogang.Chen wrote:
>>>>>>>>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>>>>>>>>>
>>>>>>>>> kfd process kref count(process->ref) is initialized to 1 by kref_init. After
>>>>>>>>> it is created not need to increaes its kref. Instad add kfd process kref at kfd
>>>>>>>>> process mmu notifier allocation since we decrease the ref at free_notifier of
>>>>>>>>> mmu_notifier_ops, so pair them.
>>>>>>>>
>>>>>>>> That's not correct. kfd_create_process returns a struct kfd_process * reference. That gets stored by the caller in filep->private_data. That requires incrementing the reference count. You can have multiple references to the same struct kfd_process if user mode opens /dev/kfd multiple times. The reference is released in kfd_release. Your change breaks that use case.
>>>>>>>>
>>>>>>> ok, if user mode open and close /dev/kfd multiple times(current Thunk only allows user process open the kfd node once)  the change will break this use case.
>>>>>>>> The reference released in kfd_process_free_notifier is only one per process and it's the reference created by kref_init.
>>>>>>>
>>>>>>> I think we can increase kref if find_process return true(the user process already created kfd process). If find_process return false do not increase kref since kref_init has been set to 1.
>>>>>>>
>>>>>>> Or change find_process(thread, false) to find_process(thread, true) that will increase kref if it finds kfd process has been created.
>>>>>>>
>>>>>>> The idea is to pair kref update between alloc_notifier and free_notifier of mmu_notifier_ops for same process(mm). That would seem natural.
>>>>>>
>>>>>> What's the problem you're trying to solve? Is it just a cosmetic issue? The MMU notifier is registered in create_process (not kfd_create_process). If you add a kref_get in kfd_process_alloc_notifier you need to kfd_unref_process somewhere in create_process. I don't think it's worth the trouble and only risks introducing more reference counting bugs.
>>>>>
>>>>> It is for making code cleaner or natural to read. mmu_notifier_get is the last call at create_process. If mmu_notifier_get fail the process is freed: kfree(process). If create_process success kfd_create_process return that process anyway(after create_process kfd_create_process creates sys entries that not affect return created kfd process). The finally result is same that kref is 2: one for kfd process creation, one for mmu notifier allocation.
>>>>
>>>> Currently, when you call kfd_create_process for the first time, it returns with kref=2. One reference for the MMU notifier, and one for file->private_data.
>>>>
>>>> Subsequent invocations of kfd_create_process when the process already exists should increment the kref by one to track the additional reference put into the new file->private_data.
>>> one ways is changing find_process(thread, false) to find_process(thread, true) at kfd_create_process. When kfd process already exist find_process will call kref_get(&p->ref);
>>>>
>>>>
>>>> If you can come up with a patch that preserves this logic _and makes the code simpler and more readable_, I will consider approving it. Also keep in mind that your patch would need to be ported to the DKMS branch, where there are two different code paths to support older kernels that don't have mmu_notifier_get/put.
>>>>
>>> At DKMS branch alloc_notifier and free_notifer either exist together or both not exist. So when HAVE_MMU_NOTIFIER_PUT is defined(new kernel) it is ok.
>>>
>>> #ifdef HAVE_MMU_NOTIFIER_PUT
>>>         .alloc_notifier = kfd_process_alloc_notifier,
>>>         .free_notifier = kfd_process_free_notifier,
>>> #endif
>>>
>>> but when HAVE_MMU_NOTIFIER_PUT is not defined we need change kfd_process_destroy_delayed since since it call kfd_unref_process(p);
>>>
>>> static void kfd_process_destroy_delayed(struct rcu_head *rcu)
>>> {
>>>         struct kfd_process *p = container_of(rcu, struc mmu_notifier_registert kfd_process, rcu);
>>>
>>>         kfd_unref_process(p);
>>> }
>>>
>>> That means if port this patch to dkms branch when HAVE_MMU_NOTIFIER_PUT is not defined(old kernel) we do not need call kfd_process_destroy_delayed or remove mmu_notifier_call_srcu(&p->rcu, &kfd_process_destroy_delayed)  at kfd_process_notifier_release_internal. I think that make thing simpler for old kernel.
>>
>> No, we still need to destroy the kref that belongs to the process when the mm_struct is destroyed. We can't do that in kfd_process_notifier_release_internal because it leads to LOCKDEP issues. So we still need kfd_process_destroy_delayed.
> 
> For old kernel we use mmu_notifier_register that does not update kfd process kref. It register process->mmu_notifier to mmu. The release function of process->mmu_notifier got called when mm structure got destroyed.

We register the MMU notifier to handle cleanup when the process is destroyed. There must be a reference that gets decremented when that MMU notifier runs so that the kfd_process structure doesn't get destroyed too early.

> 
> The new kernel uses get/put flow. At kfd process creation we use mmu_notifier_get that calls alloc_notifier. So this patch increases kfd process kref at alloc_notifier since mmu refers kfd process and we already decrease the kref at free_notifier.

The notifier has the exact same purpose, no matter how it gets registered or allocated.

Currently the reference that is decremented by the MMU notifier is the reference that got initialized by kref_init. I propose to leave it that way for old and new kernels for simplicity.

If you decide to change it, you must ensure that you don't break the process reference counting. If you leak a process reference, we won't cleanup process resources and eventually run out of memory. If you decrement the last process reference too early, you may get use-after-free problems in the MMU notifier or change the semantics for processes that open and close KFD multiple times (e.g. KFDTest).

Regards,
  Felix

> 
> For old kernel, when port this patch to dkms branch we either do not decrease kfd process kref during mm destruction since we did not increase kfd process kref during creation, or  manually increase this kref after  mmu_notifier_register, then keep kfd_process_destroy_delayed.
> 
> Regards
> 
> Xiaogang
> 
>>
>> Regards,
>>   Felix
>>
>>
>>>
>>> So it needs additional handling for old kernel on dkms branch. I do not know who port patch to dkms branch, or I should change that on dkms branch.
>>>
>>> Regards
>>>
>>> Xiaogang
>>>
>>>
>>>> Regards,
>>>>   Felix
>>>>
>>>>
>>>>>
>>>>> Regards
>>>>>
>>>>> Xiaogang
>>>>>
>>>>>> Regards,
>>>>>>   Felix
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Xiaogang
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>   Felix
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Signed-off-by: Xiaogang Chen <Xiaogang.Chen at amd.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 8 +++++---
>>>>>>>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>>>>> index d07acf1b2f93..7c5471d7d743 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>>>>> @@ -899,8 +899,6 @@ struct kfd_process *kfd_create_process(struct task_struct *thread)
>>>>>>>>> init_waitqueue_head(&process->wait_irq_drain);
>>>>>>>>>       }
>>>>>>>>>   out:
>>>>>>>>> -    if (!IS_ERR(process))
>>>>>>>>> -        kref_get(&process->ref);
>>>>>>>>>       mutex_unlock(&kfd_processes_mutex);
>>>>>>>>>       mmput(thread->mm);
>>>>>>>>>   @@ -1191,7 +1189,11 @@ static struct mmu_notifier *kfd_process_alloc_notifier(struct mm_struct *mm)
>>>>>>>>>         srcu_read_unlock(&kfd_processes_srcu, idx);
>>>>>>>>>   -    return p ? &p->mmu_notifier : ERR_PTR(-ESRCH);
>>>>>>>>> +    if (p) {
>>>>>>>>> +        kref_get(&p->ref);
>>>>>>>>> +        return &p->mmu_notifier;
>>>>>>>>> +    }
>>>>>>>>> +    return ERR_PTR(-ESRCH);
>>>>>>>>>   }
>>>>>>>>>     static void kfd_process_free_notifier(struct mmu_notifier *mn)


More information about the amd-gfx mailing list