[PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.
Felix Kuehling
felix.kuehling at amd.com
Wed Mar 8 16:38:07 UTC 2023
On 2023-03-08 11:20, Christian König wrote:
> Am 08.03.23 um 17:17 schrieb Felix Kuehling:
>> On 2023-03-08 04:07, Christian König wrote:
>>> Am 07.03.23 um 16:28 schrieb Belanger, David:
>>>> [AMD Official Use Only - General]
>>>>
>>>>
>>>> The test case is a python program that will load the driver, do
>>>> some operations, then unload the driver.
>>>
>>> What do you mean with unloading the driver? Removing the module? Or
>>> destroying the device?
>>>
>>>> When the driver exists, there is still the python process space
>>>> around holding on the address space.
>>>> When the python process space exits, the mmu_notifier gets called
>>>> but the driver has already been unloaded.
>>>>
>>>> The goal of the fix is to address case where there could be
>>>> outstanding address space / worker threads for process
>>>> cleanup that needs to be cleared/completed at exit time.
>>>
>>> Yeah and when the module is unloaded this is a completely futile
>>> effort.
>>>
>>> The general upstream approach is to take references on the struct
>>> device and module and prevent unloading as long as those references
>>> exists.
>>
>> That's not how it always works. In case of RCU callbacks, the
>> documented strategy is to use rcu_barrier in the module exit function
>> to ensure the grace period and all callbacks have completed
>> (https://www.kernel.org/doc/html/latest/RCU/rcubarrier.html).
>> mmu_notifier_synchronize is meant to do something similar for pending
>> mmu_notifier_put work
>> (https://elixir.bootlin.com/linux/v6.2.2/source/mm/mmu_notifier.c#L1116).
>>
>> But this implies that we need to call mmu_notifier_put for all the
>> MMU notifiers registered by the module first. I think closing
>> /dev/kfd drops the module reference count, but the MMU notifiers we
>> register for process cleanup persist until the address space is
>> destroyed. We need to trigger that cleanup for any processes that
>> still exist in that state when the module is unloaded. Or we need to
>> find a way to increment the module refcount for every process that
>> registers a KFD cleanup MMU notifier.
>
> The later is what I've meant. Cleaning up when the module unloads is
> certainly possible as well, but harder to get right.
I think we can get the cleanup right. I suggested a strategy to David in
my code review.
>
> And I don't really see an use case that we should do the cleanup way.
I'm not sure it's a question of use cases. I see it more as a risk
trade-off. If we manually add module refcounts for our cleanup notifiers
(try_module_get(THIS_MODULE)/module_put(THIS_MODULE)), there is a risk
of leaks that could prevent module unloading, or underflows that could
allow the module to be unloaded too early.
I guess this particular test (app trying to unload the module after
using KFD) would just fail if we add module refcounts. But I agree that
this is not a valid usecase.
Regards,
Felix
>
> Regards,
> Christian.
>
>>
>> Regards,
>> Felix
>>
>>
>>>
>>>
>>> The device might be non-functional any more (because for example of
>>> hot plug), but the driver should never be unloaded before the python
>>> program exits.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Regards,
>>>> David B.
>>>>
>>>>> -----Original Message-----
>>>>> From: Koenig, Christian <Christian.Koenig at amd.com>
>>>>> Sent: Tuesday, March 7, 2023 2:05 AM
>>>>> To: Belanger, David <David.Belanger at amd.com>; amd-
>>>>> gfx at lists.freedesktop.org
>>>>> Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
>>>>> exit.
>>>>>
>>>>> Am 06.03.23 um 22:58 schrieb David Belanger:
>>>>>> Handle case when module is unloaded (kfd_exit) before a process
>>>>>> space
>>>>>> (mm_struct) is released.
>>>>> Well that should never ever happen in the first place. It sounds
>>>>> like we are
>>>>> missing grabbing module references.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: David Belanger <david.belanger at amd.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/amd/amdkfd/kfd_module.c | 4 ++
>>>>>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57
>>>>> ++++++++++++++++++++++++
>>>>>> 2 files changed, 61 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> index 09b966dc3768..8ef4bd9e4f7d 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> @@ -26,6 +26,9 @@
>>>>>> #include "kfd_priv.h"
>>>>>> #include "amdgpu_amdkfd.h"
>>>>>>
>>>>>> +void kfd_cleanup_processes(void);
>>>>>> +
>>>>>> +
>>>>>> static int kfd_init(void)
>>>>>> {
>>>>>> int err;
>>>>>> @@ -77,6 +80,7 @@ static int kfd_init(void)
>>>>>>
>>>>>> static void kfd_exit(void)
>>>>>> {
>>>>>> + kfd_cleanup_processes();
>>>>>> kfd_debugfs_fini();
>>>>>> kfd_process_destroy_wq();
>>>>>> kfd_procfs_shutdown();
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>> index ebabe92f7edb..b5b28a32639d 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>>>> @@ -1181,6 +1181,17 @@ static void
>>>>>> kfd_process_notifier_release(struct
>>>>> mmu_notifier *mn,
>>>>>> return;
>>>>>>
>>>>>> mutex_lock(&kfd_processes_mutex);
>>>>>> + /*
>>>>>> + * Do early return if p is not in the table.
>>>>>> + *
>>>>>> + * This could potentially happen if this function is called
>>>>>> concurrently
>>>>>> + * by mmu_notifier and by kfd_cleanup_pocesses.
>>>>>> + *
>>>>>> + */
>>>>>> + if (!hash_hashed(&p->kfd_processes)) {
>>>>>> + mutex_unlock(&kfd_processes_mutex);
>>>>>> + return;
>>>>>> + }
>>>>>> hash_del_rcu(&p->kfd_processes);
>>>>>> mutex_unlock(&kfd_processes_mutex);
>>>>>> synchronize_srcu(&kfd_processes_srcu);
>>>>>> @@ -1200,6 +1211,52 @@ static const struct mmu_notifier_ops
>>>>> kfd_process_mmu_notifier_ops = {
>>>>>> .free_notifier = kfd_process_free_notifier,
>>>>>> };
>>>>>>
>>>>>> +
>>>>>> +void kfd_cleanup_processes(void)
>>>>>> +{
>>>>>> + struct kfd_process *p;
>>>>>> + unsigned int temp;
>>>>>> +
>>>>>> + /*
>>>>>> + * Iterate over remaining processes in table, calling
>>>>>> notifier release
>>>>>> + * to free up notifier and process resources.
>>>>>> + *
>>>>>> + * This code handles the case when driver is unloaded before
>>>>>> all
>>>>> mm_struct
>>>>>> + * are released.
>>>>>> + */
>>>>>> + int idx = srcu_read_lock(&kfd_processes_srcu);
>>>>>> +
>>>>>> + hash_for_each_rcu(kfd_processes_table, temp, p,
>>>>>> kfd_processes) {
>>>>>> + if (p) {
>>>>>> + /*
>>>>>> + * Obtain a reference on p to avoid a late
>>>>> mmu_notifier release
>>>>>> + * call triggering freeing the process.
>>>>>> + */
>>>>>> +
>>>>>> + kref_get(&p->ref);
>>>>>> +
>>>>>> + srcu_read_unlock(&kfd_processes_srcu, idx);
>>>>>> +
>>>>>> + kfd_process_notifier_release(&p->mmu_notifier, p-
>>>>>> mm);
>>>>>> +
>>>>>> + kfd_unref_process(p);
>>>>>> +
>>>>>> + idx = srcu_read_lock(&kfd_processes_srcu);
>>>>>> + }
>>>>>> + }
>>>>>> + srcu_read_unlock(&kfd_processes_srcu, idx);
>>>>>> +
>>>>>> + /*
>>>>>> + * Must be called after all mmu_notifier_put are done and
>>>>>> before
>>>>>> + * kfd_process_wq is released.
>>>>>> + *
>>>>>> + * Ensures that all outstanding free_notifier gets called,
>>>>>> triggering the
>>>>> release
>>>>>> + * of the process.
>>>>>> + */
>>>>>> + mmu_notifier_synchronize();
>>>>>> +}
>>>>>> +
>>>>>> +
>>>>>> static int kfd_process_init_cwsr_apu(struct kfd_process *p,
>>>>>> struct file
>>>>> *filep)
>>>>>> {
>>>>>> unsigned long offset;
>>>
>
More information about the amd-gfx
mailing list